Last updated: October 20th, 2017
NDEx Sync is a command line utility that enables users to copy networks from one NDEx account (the source NDEx) to another (the target NDEx).
NDEx Sync 2.2 is fully compatible with the latest NDEx 2.2 release and no changes are required to use it.
NDEx Sync 2.2 can be downloaded from our FTP Server.
NDEx Sync 2.2 can ONLY copy networks between NDEx servers running NDEx v2.0 or higher.
The "route" parameter in the copy plans should use the new v2 endpoints! See the Copy Plans section below for more details.
NDEx Sync is open-source software available under a BSD license. The source code is hosted on GitHub at https://github.com/ndexbio/ndex-sync
NDEx Sync is used via the shell script /opt/ndex/lib/ndex-copier.sh
The script takes a single argument: a directory containing 'copy plan' files. For example, to process the copy plans in the current directory:
bash ndex-copier.sh .
When run, the script reads and attempts to execute each copy plan file in the directory.
The NDEx Sync script can be run manually or can be executed periodically via cron or other scheduling facilities to copy new or modified networks from the source NDEx, creating or updating networks on the target NDEx.
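Because the script attempts every copy plan file in the directory, a malformed file is best caught before a scheduled run. The following is a minimal pre-flight sketch, not part of NDEx Sync itself: the function name check_copy_plans is hypothetical, and it only assumes that copy plans are JSON files, as in the examples in this document.

```python
import json
from pathlib import Path


def check_copy_plans(plan_dir):
    """Map each regular file in plan_dir to None (parses as JSON) or an error message.

    Hypothetical helper: it checks JSON syntax only, not whether NDEx Sync
    would accept the plan's contents.
    """
    results = {}
    for path in sorted(Path(plan_dir).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        try:
            json.loads(path.read_text(encoding="utf-8"))
            results[path.name] = None
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            results[path.name] = str(exc)
    return results
```

Calling check_copy_plans(".") before running bash ndex-copier.sh . would list any file whose value is not None, for instance a plan with an unterminated string.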
NDEx Sync is like a file-mirroring utility, but with an important difference: the copied networks are not exact duplicates of the source networks.
Copied networks are assigned new UUIDs: every network stored in an NDEx server has a globally unique identifier and can be referenced by that identifier at its host NDEx.
NDEx Sync updates (or creates, if necessary) the network's provenance history, adding a "provenance event" that documents the fact of the copying.
The copied networks are therefore documented as distinct entities, copied at a specific time from a uniquely identified source. The provenance history provides a structure to document the events leading to the current state of a network. Applications using NDEx are not required to maintain the provenance history for networks that they manipulate, but it is encouraged as a standard practice and will be supported by NDEx utilities.
For each source network that is selected as a candidate for copying, NDEx Sync examines the provenance history of each network in the target account to determine:
Was this target network copied from the source network?
Is the target Out-Of-Date?
By default, NDEx Sync copies the source network to the target account if the target account has no copy of it, or if the only copies there are out-of-date or have been modified.
This default is conservative: NDEx Sync never overwrites or deletes any network in the target account. The copy plan parameter updateTargetNetwork overrides this behavior, specifying that NDEx Sync should update target networks that are identified as unmodified, out-of-date copies of the specified source networks.
In an update, the target network keeps its UUID, but its contents are replaced by those of the source network. The provenance history is handled in the same manner as in a default, non-update copy event. The updated network can still be accessed by that UUID, and any new request will obtain the updated content.
Using NDEx Sync to update networks is only appropriate for situations in which the target network is intended as a cache of the source, where users want to obtain the latest version of the source content and where they do not expect the content of the network to be consistent over time.
By default, updates will NOT be performed if the target network has readOnly == true. The updateReadOnlyNetwork configuration parameter in a copy plan overrides this behavior. This handles the case in which NDEx Sync is used to maintain a local copy of a remote resource and where the local copy is intended as a read-only reference.
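The copy and update rules described above can be sketched as a small decision function. This is an illustration under stated assumptions, not NDEx Sync's actual code: the TargetCopy class, the plan_action function, and the action names are hypothetical, and how a copy is judged "modified" is taken as an input. The text does not say what happens when updateTargetNetwork is set but every stale copy is read-only, so the sketch reports that case as "blocked" rather than guessing.

```python
from dataclasses import dataclass


@dataclass
class TargetCopy:
    """A target-account network that provenance identifies as a copy of the source."""
    out_of_date: bool
    modified: bool
    read_only: bool = False


def plan_action(copies, update_target_network=False, update_read_only_network=False):
    """Return (action, networks) where action is 'skip', 'update', 'blocked' or 'copy'."""
    # A current, unmodified copy already exists, so nothing needs to happen.
    if any(not c.out_of_date and not c.modified for c in copies):
        return "skip", []
    if update_target_network:
        # Only unmodified, out-of-date copies are candidates for an update.
        stale = [c for c in copies if c.out_of_date and not c.modified]
        updatable = [c for c in stale if update_read_only_network or not c.read_only]
        if updatable:
            return "update", updatable
        if stale:
            # Assumption: the follow-up behavior here is not documented.
            return "blocked", stale
    # Default: no usable copy exists, so create a new one.
    return "copy", []
```

With both flags left at their defaults the function only ever returns "skip" or "copy", which mirrors the conservative default of never touching an existing target network.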
The criteria for "out-of-date" are as follows:
Calculate latestSourceDate as the later of the source network's modification date and the end date of its last provenance history event.
Calculate earliestTargetDate as the earlier of the target network's modification date and the end date of its last provenance history event.
If latestSourceDate > earliestTargetDate, the target is out-of-date.
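The three steps above translate directly into a comparison of four timestamps. The sketch below is illustrative only: the function and parameter names are hypothetical, and it assumes both networks have a modification date and at least one provenance event, since the text does not describe the case of a missing provenance history.

```python
from datetime import datetime


def is_out_of_date(source_modified, source_last_event_end,
                   target_modified, target_last_event_end):
    """Apply the out-of-date criteria to four datetime values."""
    latest_source_date = max(source_modified, source_last_event_end)
    earliest_target_date = min(target_modified, target_last_event_end)
    return latest_source_date > earliest_target_date


# The source was edited on 1 October, after the target was copied on 20 September.
stale = is_out_of_date(
    source_modified=datetime(2017, 10, 1),
    source_last_event_end=datetime(2017, 9, 15),
    target_modified=datetime(2017, 9, 20),
    target_last_event_end=datetime(2017, 9, 20),
)
print(stale)  # True
```

Taking the later date on the source side and the earlier date on the target side makes the test err toward reporting a target as out-of-date.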
The lastModificationDate field of a network is updated when 1) any network element changes, including properties and presentation properties, or 2) an intrinsic special "profile" property (name, description) changes.
The lastModificationDate is not updated by changes to provenance history, permissions, read-only status, or visibility.
All network elements, including properties and presentation properties, are copied. The provenance history is also copied, then modified to record the copy event.
The following elements are not copied: permissions, visibility, UUID, modification time, creation time and readOnly status.
NDEx Sync 'copy plans' specify:
An account and credentials for the source NDEx.
An account and credentials for the target NDEx.
The criteria to select networks on the source NDEx, which can be one of: 1) a query to find networks matching search text, 2) a query to find networks administered by an account AND matching search text or 3) a list of network UUIDs.
The updateTargetNetwork parameter
The updateReadOnlyNetwork parameter
NDEx Sync can only update networks in the target account if the account specified by the username in the target element of the copy plan has Administration privileges for the networks to be updated.
Source networks are identified based on their title, description, or content matching a query string. The user account for the source must have read access to each source network.
In the example copy plan below, networks matching "cal*" are copied from the public NDEx to the user2 account on a private NDEx server (myPrivateNDExServer.com).
queryString: search text to find networks.
queryLimit: the maximum number of networks to copy. This is useful largely as a brake on runaway copying: if the queryString matched some unanticipated, enormous number of networks, the script would still be limited.
{
  "planType" : "QueryCopyPlan",
  "source" : {
    "route" : "http://www.ndexbio.org/v2",
    "username" : "user1",
    "password" : "pwd00123"
  },
  "target" : {
    "route" : "http://myPrivateNDExServer.com/v2",
    "username" : "user2",
    "password" : "pwd980098"
  },
  "queryString" : "cal*",
  "queryLimit" : "10",
  "updateTargetNetwork" : "false",
  "updateReadOnlyNetwork" : "false"
}
queryAccountName: source networks are limited to those administered by the specified account name.
To copy all the networks for a given account, set the queryString to "*".
In the copy plan example below, all networks (up to 10) from the user3 account are copied from the public NDEx to the user2 account on a private NDEx server (myPrivateNDExServer.com).
{
  "planType" : "QueryCopyPlan",
  "source" : {
    "route" : "http://www.ndexbio.org/v2",
    "username" : "user1",
    "password" : "pwd00123"
  },
  "target" : {
    "route" : "http://myPrivateNDExServer.com/v2",
    "username" : "user2",
    "password" : "pwd980098"
  },
  "queryString" : "*",
  "queryLimit" : "10",
  "queryAccountName" : "user3",
  "updateTargetNetwork" : "false",
  "updateReadOnlyNetwork" : "false"
}
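Query copy plans like the two above can also be generated programmatically, which helps when many plans share the same source and target. The following is a sketch only: build_query_copy_plan is a hypothetical helper, and it simply mirrors the keys used in the examples, including their convention of quoting numbers and booleans as strings.

```python
import json


def build_query_copy_plan(source, target, query_string, query_limit=10,
                          query_account_name=None,
                          update_target_network=False,
                          update_read_only_network=False):
    """Build a QueryCopyPlan dict shaped like the examples in this document."""
    plan = {
        "planType": "QueryCopyPlan",
        "source": source,
        "target": target,
        "queryString": query_string,
        # The documented examples store the limit and flags as strings.
        "queryLimit": str(query_limit),
    }
    if query_account_name is not None:
        plan["queryAccountName"] = query_account_name
    plan["updateTargetNetwork"] = str(update_target_network).lower()
    plan["updateReadOnlyNetwork"] = str(update_read_only_network).lower()
    return plan


# Placeholder credentials taken from the examples in this document.
source = {"route": "http://www.ndexbio.org/v2",
          "username": "user1", "password": "pwd00123"}
target = {"route": "http://myPrivateNDExServer.com/v2",
          "username": "user2", "password": "pwd980098"}

plan = build_query_copy_plan(source, target, "*", query_account_name="user3")
print(json.dumps(plan, indent=2))
```

Writing the printed JSON to a file in the copy plan directory would make it available to the next run of the script.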
idList: list of UUIDs to identify source networks.
The user account for the source must have read access to each source network.
In this example, the network 5bca3218-28ca-11e4-9032-90b11c72aefa is copied from the public NDEx to the user2 account on a private NDEx server (myPrivateNDExServer.com).
{
  "planType" : "IdCopyPlan",
  "source" : {
    "route" : "http://www.ndexbio.org/v2",
    "username" : "user1",
    "password" : "pwd00123"
  },
  "target" : {
    "route" : "http://myPrivateNDExServer.com/v2",
    "username" : "user2",
    "password" : "pwd980098"
  },
  "idList" : [ "5bca3218-28ca-11e4-9032-90b11c72aefa" ],
  "updateTargetNetwork" : "false",
  "updateReadOnlyNetwork" : "false"
}
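A copy plan can be sanity-checked before it is placed in the plan directory. The sketch below is not NDEx Sync's own validation: validate_copy_plan is a hypothetical helper, and the set of required keys is inferred from the examples in this document rather than from a published schema. It also checks that each route ends in /v2, following the note that copy plans should use the v2 endpoints.

```python
import uuid


def validate_copy_plan(plan):
    """Return a list of problems found in a copy plan dict (empty if none)."""
    problems = []
    for side in ("source", "target"):
        endpoint = plan.get(side)
        if not isinstance(endpoint, dict):
            problems.append(f"missing '{side}' object")
            continue
        for key in ("route", "username", "password"):
            if not endpoint.get(key):
                problems.append(f"{side}.{key} is missing or empty")
        route = endpoint.get("route") or ""
        if route and not route.rstrip("/").endswith("/v2"):
            problems.append(f"{side}.route does not end in /v2: {route}")

    plan_type = plan.get("planType")
    if plan_type == "QueryCopyPlan":
        if not plan.get("queryString"):
            problems.append("QueryCopyPlan needs a queryString")
    elif plan_type == "IdCopyPlan":
        id_list = plan.get("idList") or []
        if not id_list:
            problems.append("IdCopyPlan needs a non-empty idList")
        for network_id in id_list:
            try:
                uuid.UUID(str(network_id))
            except ValueError:
                problems.append(f"idList entry is not a UUID: {network_id}")
    else:
        problems.append(f"unknown planType: {plan_type!r}")
    return problems
```

Passing the parsed IdCopyPlan example above to validate_copy_plan yields an empty list, while a plan whose route still points at an older endpoint is reported with the offending side named.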