GridFTP DSI Component Introduction

As reported in the previous sections, massive data transfers require a GridFTP interface in front of iRODS, so the EUDAT project initially adopted a Java-based GridFTP server called ▇▇▇▇▇▇▇. However, the last release of this component, v.0.9.0, showed significant performance degradation due to a bug in the Jargon library itself. Moreover, the behaviour of ▇▇▇▇▇▇▇ was not consistent with the event-triggering model of iRODS [40]. Since these bugs could not be fixed in the short term, we decided to replace ▇▇▇▇▇▇▇ by extending the Globus GridFTP server, which can be extended relatively easily through its Data Storage Interface (DSI) [41].

The new module consists of C functions that interact with iRODS through the iRODS C API. The main supported operations are “get”, “put”, “delete” and “list”. The GridFTP server loads the module at start-up through a specific command line option, so no changes are required in the typical GridFTP server configuration. This simplifies software support for the module, because it is decoupled from future changes in the server.

From the security point of view, authentication is based on X.509 certificates (GSI authentication), and a single-sign-on mechanism has been implemented between the GridFTP server and the iRODS server. If the GridFTP server runs as a privileged user, the usual Globus security is in place and the GridFTP user names must be mapped explicitly to iRODS ones. If it runs as an unprivileged user, the configuration can be simplified, because the iRODS server trusts the GridFTP server, relying only on its host certificate. From the performance point of view, the transfer speed of the GridFTP server with the iRODS DSI module is a great improvement over that of ▇▇▇▇▇▇▇ (see I.2).
[40] ▇▇▇▇▇▇▇ creates an empty file and then “re-opens” it to write the contents, triggering the object creation event inside iRODS twice. This event causes the execution of a certain number of rules, which should not be run twice for each object.
[41] ▇▇▇▇://▇▇▇▇▇▇.▇▇▇.▇▇/toolkit/docs/5.2/5.2.4/gridftp/developer/#idp5046480
The following diagram reflects the interaction process between iRODS and the GridFTP server with the iRODS DSI module. The diagram also shows the required configuration files: gridmapfile and irodsresourcemap.conf.

The GridFTP DSI supports the following iRODS functionalities:
• Connection to an iRODS space (reading information from the .irodsEnv file)
• Data-object creation and writing (managing iRODS resources)
• Data-object checksum calculation
• Collection creation
• Data-object opening and reading
• Data-object removal
• Collection removal
• Queries to the iRODS database (ICAT)
• Collection listing

These functionalities are necessary and sufficient to handle all the operations made available by the main GridFTP clients, such as:
• PUT: store files and folders into iRODS:
  ‒ single file
  ‒ multiple files
  ‒ folder, recursively
  ‒ managing iRODS resources
  ‒ with checksum calculation
• GET: get data-objects and collections from iRODS:
  ‒ single file
  ‒ multiple files
  ‒ folder, recursively
  ‒ with checksum calculation
• DELETE: remove data-objects and collections from iRODS:
  ‒ single file
  ‒ multiple files
  ‒ folder, recursively
• LIST: list data-objects and collections stored in iRODS:
  ‒ data-object size
  ‒ data-object modification time

The GridFTP DSI module is implemented using the iRODS 3.2 C API, starting from a DSI stub that can be generated directly through the Globus Toolkit. When the GridFTP server receives a request (e.g. data transfer, directory creation, file browsing, etc.), it forwards the request to the underlying DSI module, which carries it out by interacting with the iRODS instance via its API functions. The GridFTP server is kept informed about any progress and/or problems.
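The request flow described above (the server forwards a request, the module selects the matching iRODS operation, and the outcome is reported back) can be illustrated with a minimal C sketch. Every name in it (dsi_request_t, irods_op_t, dsi_select_op, dsi_handle_request, dsi_record_error) is hypothetical and chosen for illustration only; none of them belongs to the Globus DSI interface or to the iRODS C API, and the point where a real module would call iRODS is left as a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request kinds a GridFTP client can trigger. */
typedef enum {
    REQ_PUT, REQ_GET, REQ_DELETE, REQ_LIST, REQ_MKDIR, REQ_COUNT
} dsi_request_t;

/* Hypothetical iRODS-side operations, mirroring the functionality list. */
typedef enum {
    OP_WRITE_OBJECT, OP_READ_OBJECT, OP_REMOVE,
    OP_LIST_COLLECTION, OP_CREATE_COLLECTION, OP_UNSUPPORTED
} irods_op_t;

/* Callback through which the module reports the outcome to the server. */
typedef void (*dsi_report_fn)(dsi_request_t req, int error_code,
                              void *user_data);

/* Static table: one iRODS operation per supported request kind. */
static const irods_op_t dispatch_table[REQ_COUNT] = {
    [REQ_PUT]    = OP_WRITE_OBJECT,
    [REQ_GET]    = OP_READ_OBJECT,
    [REQ_DELETE] = OP_REMOVE,
    [REQ_LIST]   = OP_LIST_COLLECTION,
    [REQ_MKDIR]  = OP_CREATE_COLLECTION
};

/* Select the operation for a request; unknown requests are unsupported. */
irods_op_t dsi_select_op(dsi_request_t req)
{
    if ((int)req < 0 || (int)req >= (int)REQ_COUNT) {
        return OP_UNSUPPORTED;
    }
    return dispatch_table[req];
}

/* Handle one request and always report the result back to the caller.
 * Returns 0 on success and -1 on failure (illustrative error codes). */
int dsi_handle_request(dsi_request_t req, const char *path,
                       dsi_report_fn report, void *user_data)
{
    assert(report != NULL);
    irods_op_t op = dsi_select_op(req);
    int error_code = (op == OP_UNSUPPORTED || path == NULL) ? -1 : 0;
    /* A real module would call the iRODS C API for 'op' here. */
    report(req, error_code, user_data);
    return error_code;
}

/* Example callback: stores the reported error code in *user_data. */
void dsi_record_error(dsi_request_t req, int error_code, void *user_data)
{
    (void)req;
    *(int *)user_data = error_code;
}
```

In this sketch the report callback is invoked on both success and failure, which mirrors the statement that the server is kept informed about problems as well as progress. A table-driven dispatch is only one possible design; the actual module may be organised differently.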
To handle the connection with the iRODS instance properly, the DSI uses the variables written in the .irodsEnv file. By default, the iRODS environment file is located at “~/.irods/.irodsEnv” in the home directory of the user who launched the GridFTP server. If necessary, the default location can be changed to point to a different file using the irodsEnvFile environment variable. Integrating a GridFTP server with iRODS also implies some changes to the configuration of the security layer, so that the two systems can interact with each other. When the GridFTP server receives a connection, the inetd process (UNIX System V) forks the process and, for security reasons, changes the process owner from root to a non-privileged user. This approach introduces a big restriction because – for...