WP2: Data Management

Tutorial for the PM9 Release, RAL, 31st January 2002. Gavin McCance, University of Glasgow. Covers the Grid Data Mirroring Package (GDMP), a basic replica management tool, and Spitfire, a basic meta-data management prototype.


  1. WP2: Data Management • Tutorial for PM9 Release • RAL, 31st January 2002 • Gavin McCance, University of Glasgow

  2. PM9 Release • Grid Data Mirroring Package (GDMP)* • Basic replica management tool • How-to… • Spitfire • Basic meta-data management prototype • (* Previously called ‘Grid Data Management Pilot’)

  3. GDMP • Useful documentation and reference • WP2 web page: http://grid-data-management.web.cern.ch/grid-data-management • GDMP page: http://cmsdoc.cern.ch/cms/grid • GDMP 2.0 manual • ‘GDMP User Instructions for the Testbed’

  4. GDMP • Version 2.0 (not the 2.0alpha) • Client-server system for replicating files from one grid site to another • Subscription mechanism allows for automatic replication of files • Interfaced to the Grid Replica Catalogue (currently the Globus MDS Replica Catalogue)

  5. GDMP • Any file type can be transferred • Replication mechanisms assume read-only files, i.e. no update synchronisation • Particular plug-in for Objectivity • Handles update of local database

  6. GDMP: Requirements • Tested on Linux RH6.1 and RH6.2 • Globus Toolkit 2.0 Alpha 9 • i.e. the EDG PM9 special release • GridFTP (NOT gsi-wuftp!) • g++ from gcc-2.91.66 or gcc-2.95.2 • RPM v3 or higher • Or the usual GNU make collection

  7. THE GDMP EDG PM9 RPM* (* one of these is not an acronym) • Recommended for the UK testbed • DataGrid WP6 site • (Or get the original RPM from the GDMP site) • The manual gives RPM, SRPM, and tarball installation instructions • All paths are relative to GDMP_INSTALL_DIR • = /opt/edg in the testbed release • Or the path given to ./configure --prefix

  8. Configuration • Full details in the manual • Edit /opt/edg/etc/gdmp.conf. Set: • GDMP_INSTALL_DIR • GDMP_LOCAL_HOST & PORT • GLOBUS_LOCATION • If used, the OBJECTIVITY settings: binaries, boot file path, root directory

  9. RepCat Configuration • http://www.globus.org/datagrid/deliverables/replicaGettingStarted.pdf • GDMP_REP_CAT_URL=ldap://host2/rc=replica-catalogue,… • GDMP_REP_CAT_MANAGER_DN=cn=RCManager, dc=host2, dc=cern, dc=ch • GDMP_REP_CAT_MANAGER_PWD=secret
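
As a rough illustration of the settings listed on the two configuration slides, the sketch below checks a configuration text for missing keys. It assumes that gdmp.conf uses plain KEY=VALUE lines with '#' comments, which the slides suggest but do not confirm; the helper names `parse_conf` and `missing_keys` and the sample excerpt are hypothetical. The replica catalogue URL is left out of the sample on purpose, because the slide elides it.

```python
# Minimal sketch: check that the settings named on the slides are present.
# ASSUMPTION: gdmp.conf is made of KEY=VALUE lines with '#' comments.
# The GDMP manual is the authority on the real syntax.

REQUIRED_KEYS = (
    "GDMP_INSTALL_DIR",
    "GLOBUS_LOCATION",
    "GDMP_REP_CAT_URL",
    "GDMP_REP_CAT_MANAGER_DN",
    "GDMP_REP_CAT_MANAGER_PWD",
)


def parse_conf(text):
    """Return a dict of settings, splitting each line on its first '='."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            # Values such as an LDAP DN contain further '=' signs; keep them.
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings, required=REQUIRED_KEYS):
    """List the required keys that are absent or empty, in declaration order."""
    return [key for key in required if not settings.get(key)]


# Hypothetical excerpt using only values shown on the slides.
SAMPLE = """\
# excerpt for illustration only
GDMP_INSTALL_DIR=/opt/edg
GDMP_REP_CAT_MANAGER_DN=cn=RCManager, dc=host2, dc=cern, dc=ch
GDMP_REP_CAT_MANAGER_PWD=secret
"""

settings = parse_conf(SAMPLE)
print(missing_keys(settings))
```

Splitting on the first '=' only is what keeps the manager DN intact, since the DN itself is a list of key=value pairs.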

  10. Inetd Configuration • As root: • configure_gdmp <install-dir> <userid> <port> • Updates /etc/services, /etc/inetd • Requests are served as ‘gdmp_server’ using: • GDMP_INSTALL_DIR/utils/gdmp_server_start • See user manual Section 3.4 and Appendix A

  11. Server cert • GDMP requires a CA-signed server certificate to identify itself • The default certificate supplied is one from CERN • Not really secure, since anyone can download the GDMP RPMs • Get a new one from your CA if GDMP is being used for production

  12. GDMP client usage (Site A, Site B) • A) su gdmp (or whatever user) • Currently client applications should run as the same user as the server (given in /etc/inetd) • A) grid-proxy-init • B) Add gdmp server cert DN to mapfile! • A) setenv GDMP_CONFIG_FILE /opt/edg/etc/gdmp.conf • A) gdmp_ping hostb.ac.uk:2000 • “The GDMP server on hostb.ac.uk:2000 is listening”
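
The client steps on site A could be wrapped in a small script along the following lines. This is only a sketch: the function names are hypothetical, it assumes a valid proxy already exists (grid-proxy-init has been run), and it defaults to a dry run so that it works on a machine without GDMP installed. Only the command name, the host:port argument and the GDMP_CONFIG_FILE variable come from the slide.

```python
import os
import subprocess


def ping_command(host, port):
    """Argument list for gdmp_ping, in the host:port form shown on the slide."""
    return ["gdmp_ping", f"{host}:{port}"]


def client_env(config_file="/opt/edg/etc/gdmp.conf", base=None):
    """Copy of the environment with GDMP_CONFIG_FILE set (the setenv step)."""
    env = dict(os.environ if base is None else base)
    env["GDMP_CONFIG_FILE"] = config_file
    return env


def ping(host, port, dry_run=True):
    """Return the command line (dry run) or the real command's output."""
    cmd = ping_command(host, port)
    if dry_run:
        return " ".join(cmd)
    # Real run: needs GDMP installed and a valid grid proxy.
    result = subprocess.run(cmd, env=client_env(), capture_output=True, text=True)
    return result.stdout


print(ping("hostb.ac.uk", 2000))
```

Passing the environment explicitly, rather than modifying os.environ, keeps the setting local to the GDMP call.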

  13. …GDMP usage (Site A, Site B) • A,B) Start GDMP services (inetd) • B) Registers itself with site A • gdmp_host_subscribe hosta.ac.uk:2000 • A) New files: register them • gdmp_register_local_file -d /pool/files/ • This updates the local GDMP internal catalogue (on A)

  14. …GDMP usage (Site A, Site B) • A) Tell the world (well, all subscribed sites) • gdmp_publish_catalogue • Will update the import catalogue on all subscribed sites, e.g. the import catalogue on site B • By default, it will also publish the GDMP internal catalogue to the Globus Replica Catalogue

  15. …GDMP usage (Site A, Site B) • B) Get the new files from site A (and from any other sites to which B may be subscribed) • gdmp_replicate_get • Any new files on A will be transferred from site A to site B • Files are put in GDMP_FLATFILE_ROOT_DIR, as specified in gdmp.conf • By default, the Globus Replica Catalogue is updated
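
The subscribe, register, publish and get cycle described on the three usage slides can be pictured with a toy in-memory model. The sketch below is not GDMP code: the `Site` class, its attributes and the file names are hypothetical, and the network transfer and the Globus Replica Catalogue update are left out. The method names simply echo the commands gdmp_host_subscribe, gdmp_register_local_file, gdmp_publish_catalogue and gdmp_replicate_get.

```python
class Site:
    """Toy stand-in for one grid site running a GDMP server."""

    def __init__(self, name):
        self.name = name
        self.files = set()          # files held locally
        self.unpublished = set()    # registered but not yet announced
        self.import_catalogue = {}  # file name -> site that announced it
        self.subscribers = []       # sites that subscribed to this one

    def host_subscribe(self, producer):
        """Register this site with a producer site."""
        producer.subscribers.append(self)

    def register_local_file(self, *names):
        """Record new local files in the internal catalogue."""
        new = set(names) - self.files
        self.files |= new
        self.unpublished |= new

    def publish_catalogue(self):
        """Announce newly registered files to every subscribed site."""
        for site in self.subscribers:
            for name in self.unpublished:
                site.import_catalogue[name] = self
        self.unpublished.clear()

    def replicate_get(self):
        """Fetch everything listed in the import catalogue."""
        fetched = sorted(self.import_catalogue)
        self.files.update(fetched)  # stands in for the real file transfer
        self.import_catalogue.clear()
        return fetched


a, b = Site("A"), Site("B")
b.host_subscribe(a)
a.register_local_file("run1.dat", "run2.dat")
before_publish = b.replicate_get()   # nothing announced yet
a.publish_catalogue()
fetched = b.replicate_get()
print(before_publish, fetched)  # [] ['run1.dat', 'run2.dat']
```

The model makes one point from the slides explicit: registering files on A has no effect on B until A publishes, and publishing moves no data until B asks for it.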

  16. Staging Support • Support for staging to and from MSS • The GDMP server at B is notified if there is staging to be done at A, and drops the connection • When staging is complete, B is notified by A and can re-request the transfer • GDMP: section 7
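
The staging hand-shake can be sketched as a small simulation. Everything here is illustrative: the `Producer` and `Consumer` classes, the status strings and the file name are hypothetical, and the real protocol is described in the GDMP documentation referenced on the slide.

```python
class Producer:
    """Toy site A: some files are on disk, others only in mass storage."""

    def __init__(self, on_disk, on_tape):
        self.on_disk = set(on_disk)
        self.on_tape = set(on_tape)
        self.waiting = []  # (consumer, file name) pairs awaiting staging

    def request(self, consumer, name):
        if name in self.on_disk:
            return "TRANSFERRED"
        if name in self.on_tape:
            # Staging is needed: remember who asked; the consumer disconnects.
            self.waiting.append((consumer, name))
            return "STAGING"
        return "UNKNOWN"

    def finish_staging(self):
        """Staging done: files are now on disk, so notify the waiting consumers."""
        pending, self.waiting = self.waiting, []
        for consumer, name in pending:
            self.on_disk.add(name)
            consumer.staged(self, name)


class Consumer:
    """Toy site B: records the outcome of each transfer attempt."""

    def __init__(self):
        self.log = []

    def fetch(self, producer, name):
        status = producer.request(self, name)
        self.log.append((name, status))
        return status

    def staged(self, producer, name):
        # Notification from the producer: re-request the transfer.
        self.fetch(producer, name)


site_a = Producer(on_disk=[], on_tape=["run3.dat"])
site_b = Consumer()
first = site_b.fetch(site_a, "run3.dat")
site_a.finish_staging()
print(first, site_b.log)
```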

  17. Automation • The transfer waits until site B runs gdmp_replicate_get • However, when the import catalogue is updated on B, a script is called: GDMP_NOTIFICTION_FOR_PUBLISH_CATALOGUE • An example would be to run gdmp_replicate_get so that the transfer happens automatically
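
A notification script of the kind suggested here might look like the sketch below. It assumes the script is invoked without arguments when the import catalogue changes, which the slide does not state; the function name `on_publish` is hypothetical. The command runner is injected so the logic can be tried without GDMP installed.

```python
import subprocess

# The only command taken from the slide.
REPLICATE_CMD = ["gdmp_replicate_get"]


def on_publish(run):
    """Start the transfer and return the exit status reported by `run`.

    `run` is any callable that takes an argument list and returns a status.
    """
    return run(REPLICATE_CMD)


# Demonstration with a fake runner that only records what it was asked to do.
calls = []


def fake_run(cmd):
    calls.append(list(cmd))
    return 0


status = on_publish(fake_run)
print(status, calls)  # 0 [['gdmp_replicate_get']]

# In a real hook script the last lines would instead be something like:
#     import sys
#     sys.exit(on_publish(subprocess.call))
```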

  18. RepCat C++ API • Described in Appendix D • WP2 is working with Globus on a new distributed Replica Catalogue model • GIGGLE framework • Will attempt to keep existing APIs as much as possible!

  19. Meta-data • Spitfire is a basic prototype • Purpose is to allow secure access to any SQL database over the grid • Secure access via HTTP(S) • Standard access (i.e. you don’t need to know what the backend DB is)

  20. Meta-data • Current implementation is via XSQL templates • http://hep-proj-spitfire.cern.ch/hep-proj-spitfire • Server-side XSQL templates are ‘filled in’ by attributes from an HTTP GET or POST • Example…

  21. Meta-data • Template metatrig.xsql on server: • “select LFN from FileMetaData where TRIGGER=@trig and RUNNO>=@runmin and RUNNO<=@runmax” • An HTTP(S) request (e.g. from a browser form) • http://meta1.atlas.rl.ac.uk/metatrig.xsql?trig=low1-a25&runmin=1100&runmax=1500 • Will return an XML- or HTML-encoded list of matching Logical File Names • Good if you have a specific problem now!
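
To make the template idea concrete, the sketch below builds the request URL from the slide and then emulates the server-side step locally. The emulation is an assumption-laden stand-in: it uses an in-memory SQLite table instead of Spitfire and Oracle XSQL, replaces the @trig-style attributes with SQLite named parameters, and the rows and helper names are invented for illustration.

```python
import sqlite3
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://meta1.atlas.rl.ac.uk/metatrig.xsql"  # from the slide

# Local stand-in for the metatrig.xsql template; TRIGGER is quoted because
# it is an SQL keyword.
QUERY = (
    'select LFN from FileMetaData '
    'where "TRIGGER" = :trig and RUNNO >= :runmin and RUNNO <= :runmax '
    'order by LFN'
)


def build_request(trig, runmin, runmax):
    """URL a client would request to fill in the template."""
    query = urlencode({"trig": trig, "runmin": runmin, "runmax": runmax})
    return f"{BASE_URL}?{query}"


def run_template(db, url):
    """Emulate the server: take the request attributes and bind them."""
    attrs = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}
    attrs["runmin"] = int(attrs["runmin"])
    attrs["runmax"] = int(attrs["runmax"])
    return [row[0] for row in db.execute(QUERY, attrs)]


# Hypothetical sample data.
db = sqlite3.connect(":memory:")
db.execute('create table FileMetaData (LFN text, "TRIGGER" text, RUNNO integer)')
db.executemany(
    "insert into FileMetaData values (?, ?, ?)",
    [
        ("file-1200", "low1-a25", 1200),
        ("file-1600", "low1-a25", 1600),
        ("file-1300", "high2", 1300),
    ],
)

url = build_request("low1-a25", 1100, 1500)
print(url)
print(run_template(db, url))  # ['file-1200']
```

Binding the attributes as parameters, rather than pasting them into the SQL string, is a choice made for this sketch; it says nothing about how the XSQL servlet performs the substitution.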

  22. Meta-data • Must maintain templates • Dependence on Oracle XSQL code • No client-side APIs defined yet • It’s being rewritten for the next release • Initially for the new replica catalogue • Plus proper authorisation • Plus meta-data distribution • Plus a client-side API
