Central Archive Model Sample Clauses
Central Archive Model. In this model, there is a 'top-level' EUDAT metadata service which duplicates some or all of the metadata from the communities, kept up to date via programmatic interfaces. A single central archive is unlikely to scale well if all of the metadata is duplicated, given the number of information objects and their heterogeneity. Even if only a subset of the metadata for each object (for instance, only the fields corresponding to Dublin Core) is held centrally, the number of data objects and the number of concurrent accesses could become an issue.

The central service would also need to update its information programmatically by querying the community servers. Where communities use OAI-PMH this is not too onerous, since after the initial population only change information is returned. For legacy data sets this is likewise only a one-off query, since the information should be at least pseudo-static. However, for active communities which do not provide an OAI-PMH interface or similar, this query could be quite costly, since the whole data set would be returned.

This could be addressed by storing centrally, for instance, only thematic information, searchable keywords, and URLs to the community metadata servers; a search would then return results indicating which communities hold information of interest, similar to a Google keyword search, and it would be the responsibility of the user to query that community directly. While this does address the scalability of the central service, it puts additional load on the community metadata services, which would then be used not only within the community but by external users as well, and it may require additional access rights to the community service.

Users wishing to use EUDAT as a science 'drop-box' will be required to supply some minimum level of metadata, which would be stored in the central catalogue, although the physical storage may be at an EUDAT partner site.
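The incremental harvesting behaviour described above can be sketched as follows. This is a minimal illustration, not EUDAT's implementation: the community endpoint URL and the datestamp are hypothetical, and the response XML is a truncated example of the Dublin Core subset a central catalogue might hold.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

def list_records_url(base_url, last_harvest=None):
    """Build an OAI-PMH ListRecords request URL. When `last_harvest`
    is given, the `from` argument asks the community server to return
    only records changed since that date (incremental harvesting)."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if last_harvest:
        params["from"] = last_harvest  # change information only
    return base_url + "?" + urlencode(params)

# Hypothetical community endpoint: after the initial population, the
# central service asks only for changes since its stored datestamp.
url = list_records_url("https://community.example.org/oai", "2024-01-01")

# Parsing a truncated, illustrative ListRecords response to extract
# the Dublin Core fields duplicated centrally.
sample = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
 <ListRecords>
  <record><metadata>
   <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
              xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example data set</dc:title>
   </oai_dc:dc>
  </metadata></record>
 </ListRecords>
</OAI-PMH>"""

root = ET.fromstring(sample)
titles = [t.text for t in
          root.iter("{http://purl.org/dc/elements/1.1/}title")]
```

Because only change information crosses the wire after the first harvest, the cost of keeping the central copy current stays proportional to the community's update rate rather than its total holdings.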
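The keyword-index alternative, where the central service holds only searchable keywords and the URLs of the community metadata servers, can be sketched as below. All community names, URLs, and keywords here are illustrative assumptions, not actual EUDAT communities.

```python
# The central catalogue holds only keywords plus a pointer to each
# community's own metadata service (entries are illustrative).
community_index = {
    "https://climate.example.org/metadata": {"climate", "ocean", "temperature"},
    "https://genomics.example.org/metadata": {"genome", "sequencing"},
}

def communities_for(query_terms):
    """Return the community metadata servers whose keyword sets match
    any query term. As in the model above, the central service only
    points at communities; the user then queries them directly."""
    terms = {t.lower() for t in query_terms}
    return sorted(url for url, keywords in community_index.items()
                  if terms & keywords)

hits = communities_for(["ocean"])
```

The central index stays small regardless of how many objects each community holds, which is the scalability gain; the trade-off, as noted above, is that the follow-up queries land on the community servers themselves.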
