Pipeline upgrades

The upgrade mechanism allows VisTrails to deal with workflows written with older versions of packages.

Each package has, in addition to its identifier, a version number. This version number is written alongside the package and module names in serialized workflows. When a workflow is loaded, the package gets a chance to replace each old module; otherwise VisTrails simply bumps the module’s version number. If the module has changed and the connections that appear in the workflow don’t match the port names or signatures that the current package declares, the pipeline will be invalid and the user will have to fix it manually.

A package can upgrade older modules using a function handle_module_upgrade_request() in its init module. That function takes the controller, module id and current pipeline, and performs actions to fix that module. Packages usually use UpgradeWorkflowHandler to do this, for example by passing a remap object to remap_module().
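As a rough sketch of the hook described above, an init module might look like the following. The module name OldReader, its port names and the '2.0' version bound are hypothetical, and the sketch assumes that the hook simply returns whatever remap_module() returns. The import is done lazily only so that the snippet can be loaded without VisTrails installed.

```python
# Sketch of a package's init module. 'OldReader', its port names and the
# '2.0' version bound are hypothetical, chosen only for illustration.
PKG_REMAP = {
    'OldReader': [
        # (start_version, end_version, new module or None, remap dictionary)
        (None, '2.0', None, {
            'dst_port_remap': {'file': 'path'},
            'function_remap': {'file': 'path'},
        }),
    ],
}


def handle_module_upgrade_request(controller, module_id, pipeline):
    """Delegate the upgrade of one module to UpgradeWorkflowHandler."""
    # Lazy import: keeps this sketch loadable outside of VisTrails.
    from vistrails.core.upgradeworkflow import UpgradeWorkflowHandler
    # Assumption: the hook returns whatever remap_module() returns.
    return UpgradeWorkflowHandler.remap_module(
        controller, module_id, pipeline, PKG_REMAP)
```

Keeping the remap table at module level separates the declarative part (what was renamed) from the hook itself, which then stays a one-line delegation.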

class vistrails.core.upgradeworkflow.UpgradeWorkflowHandler
static attempt_automatic_upgrade(controller, pipeline, module_id, function_remap=None, src_port_remap=None, dst_port_remap=None, annotation_remap=None, control_param_remap=None)

Automatically upgrade by simply replacing a module with the new version.

Attempts to automatically upgrade a module by simply adding a new module with the current package version, and recreating all connections and functions. If any of the ports used are not available, it raises an exception that triggers the failure of the entire upgrade.

attempt_automatic_upgrade returns a list of actions if successful.
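The all-or-nothing behaviour described above can be illustrated with a toy model. This is not VisTrails code: ToyUpgradeError, plan_automatic_upgrade and the action tuples are hypothetical names, and ports are reduced to plain strings. It only shows the idea of recreating every used port on the new module and failing the whole upgrade if one of them is missing.

```python
class ToyUpgradeError(Exception):
    """Hypothetical stand-in for the exception that aborts an upgrade."""


def plan_automatic_upgrade(module_name, used_ports, available_ports,
                           port_remap=None):
    """Toy model: return a list of actions, or raise if a port is missing.

    used_ports: port names the old module uses in the workflow.
    available_ports: port names declared by the current module version.
    port_remap: optional {old_name: new_name} renames.
    """
    port_remap = port_remap or {}
    actions = [('add_module', module_name)]
    for old_port in used_ports:
        new_port = port_remap.get(old_port, old_port)
        if new_port not in available_ports:
            # One missing port fails the whole upgrade; no partial result.
            raise ToyUpgradeError(
                'port %r is not available on %s' % (new_port, module_name))
        actions.append(('recreate', old_port, new_port))
    actions.append(('delete_old_module', module_name))
    return actions
```

In this model, a call with used_ports=['overrideFile'], available_ports={'overwrite'} and port_remap={'overrideFile': 'overwrite'} succeeds, while the same call without the remap raises ToyUpgradeError.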

static remap_module(controller, module_id, pipeline, pkg_remap)

remap_module offers a method to shortcut the specification of upgrades. It is useful when just changing the names of ports or modules, but can also be used to add intermediate modules or change the format of parameters. It is usually called from handle_module_upgrade_request, and the first three arguments are passed from the arguments to that method.

pkg_remap specifies all of the changes and is of the format:

{<old_module_name>: [(<start_version>, <end_version>,
                      <new_module_klass> | <new_module_id> | None,
                      <remap_dictionary>)]}

where new_module_klass is the class and new_module_id is a string of the format:

<package_name>:[<namespace> | ]<module_name>

Passing None keeps the original name. remap_dictionary is {<remap_type>: <name_changes>}, where <remap_type> is a key such as 'dst_port_remap' or 'function_remap' and <name_changes> is a map from <old_name> to either <new_name> or <remap_function>. The remap functions are passed the old object and the new module and should return a list of operations with elements of the form ('add', <obj>).

For example:

def outputName_remap(old_conn, new_module):
    ops = []
    ...
    return ops

pkg_remap = {'FileSink': [
                 (None, '1.5.1', FileSink, {
                      'dst_port_remap': {
                          'overrideFile': 'overwrite',
                          'outputName': outputName_remap},
                      'function_remap': {
                          'overrideFile': 'overwrite',
                          'outputName': 'outputPath'}}),
             ]}
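To make the two kinds of <name_changes> entries concrete, here is a simplified model of how a single lookup could be resolved. It is a sketch, not the VisTrails implementation: resolve_name_change is a hypothetical helper, the body of outputName_remap is a placeholder, and keeping unlisted names unchanged is an assumption.

```python
def resolve_name_change(name_changes, old_name, old_obj, new_module):
    """Toy model of one <name_changes> lookup.

    Returns (new_name, ops): a plain string entry renames, while a callable
    entry is invoked with (old_obj, new_module) and supplies the operations.
    """
    # Assumption: names that are not listed are kept unchanged.
    entry = name_changes.get(old_name, old_name)
    if callable(entry):
        return None, list(entry(old_obj, new_module))
    return entry, []


def outputName_remap(old_conn, new_module):
    # Placeholder body: re-add the old connection unchanged.
    return [('add', old_conn)]


dst_port_remap = {'overrideFile': 'overwrite',
                  'outputName': outputName_remap}
```

A string entry is enough for a pure rename; a function is only needed when the change cannot be expressed as a one-to-one name mapping, for example when an intermediate module has to be added.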

The controller triggers the upgrade of a version when it gets selected. Since each version has to be upgraded separately, the upgrade will be created (so the pipeline can be shown) but will not be flushed to the vistrail unless another change is made (based on that upgraded version) or the pipeline is executed.

Todo

There are currently some issues with how the controller keeps these unflushed upgrades.

#907

Todo

Upgrades should be triggered when necessary during diff, query, subworkflow update, parameter exploration.

#695, #1054, #1071, #1087