The data staging script Clause Samples
The data staging script. To help scientific communities integrate the data staging service with their existing tools, a sample Python script was developed that demonstrates how to drive remote data transfers through the available APIs. The script consists of two modules: selector, which chooses the data sets to be staged, and transfer, which performs the actual transfer using the preferred protocol. Either module can be replaced to support a different transfer protocol or a different selection method. The current implementation, which is based on EPOS requirements but can easily be extended, uses an iRODS micro-service to select the data sets and the Globus Online API to transfer them. The script takes the following input parameters: the path of the archive (-p), the selection criteria for the data sets (-year, -network, -station, -channel), the username (-u), the source and destination sites (--ss, --ds), and the destination directory into which the data is staged (-dd). Once the script has been started, the progress of the transfer can be monitored through the Globus Online web interface (see Figure 19).
27 The PRACE Security Team, in collaboration with the Globus team, is currently working to address this point.
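The two-module design described above can be illustrated with a minimal Python sketch. It is not the project's script: the command-line flags are taken from the description, but every function and class name (build_parser, StagingRequest, stage, demo_selector, demo_transfer) is hypothetical, and the selector and transfer shown are in-memory stand-ins that make no iRODS or Globus calls.

```python
import argparse
import posixpath
from dataclasses import dataclass
from typing import Callable, Dict, List


def build_parser() -> argparse.ArgumentParser:
    """Command-line interface using the flags named in the description."""
    parser = argparse.ArgumentParser(description="Data staging sketch")
    parser.add_argument("-p", dest="path", required=True, help="path of the archive")
    for name in ("year", "network", "station", "channel"):
        parser.add_argument(f"-{name}", dest=name, help=f"selection criterion: {name}")
    parser.add_argument("-u", dest="user", required=True, help="username")
    parser.add_argument("--ss", dest="source_site", required=True, help="source site")
    parser.add_argument("--ds", dest="dest_site", required=True, help="destination site")
    parser.add_argument("-dd", dest="dest_dir", required=True, help="destination directory")
    return parser


@dataclass
class StagingRequest:
    """Hypothetical container for the parsed parameters."""
    path: str
    criteria: Dict[str, str]
    user: str
    source_site: str
    dest_site: str
    dest_dir: str


def request_from_args(args: argparse.Namespace) -> StagingRequest:
    # Keep only the selection criteria that were actually supplied.
    keys = ("year", "network", "station", "channel")
    criteria = {k: getattr(args, k) for k in keys if getattr(args, k) is not None}
    return StagingRequest(args.path, criteria, args.user,
                          args.source_site, args.dest_site, args.dest_dir)


# The two replaceable modules are modelled as plain callables.
Selector = Callable[[StagingRequest], List[str]]
Transfer = Callable[[StagingRequest, List[str]], List[str]]


def stage(request: StagingRequest, selector: Selector, transfer: Transfer) -> List[str]:
    """Select the data sets, then hand them to the transfer module."""
    datasets = selector(request)
    if not datasets:
        raise ValueError("no data sets match the selection criteria")
    return transfer(request, datasets)


def demo_selector(request: StagingRequest) -> List[str]:
    # Stand-in for the iRODS-based selection: filters a made-up catalogue.
    catalogue = [
        {"year": "2012", "network": "NL", "file": "NL.2012.mseed"},
        {"year": "2013", "network": "NL", "file": "NL.2013.mseed"},
    ]
    return [posixpath.join(request.path, entry["file"]) for entry in catalogue
            if all(entry.get(k) == v for k, v in request.criteria.items())]


def demo_transfer(request: StagingRequest, datasets: List[str]) -> List[str]:
    # Stand-in for the Globus-based transfer: only describes what would be copied.
    return [
        f"{request.source_site}:{src} -> {request.dest_site}:"
        f"{posixpath.join(request.dest_dir, posixpath.basename(src))}"
        for src in datasets
    ]


if __name__ == "__main__":
    req = request_from_args(build_parser().parse_args())
    for line in stage(req, demo_selector, demo_transfer):
        print(line)
```

Because stage only depends on the two callables, a different selection method or transfer protocol could be supplied without touching the argument handling, which is the property the description attributes to the selector and transfer modules.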
