
Grid Data Management, Vered Kunik - Israeli Grid NA3 Team

Grid Data Management. Vered Kunik - Israeli Grid NA3 Team. Israeli Grid Workshop, Tel Aviv, Israel, Feb 2006. EGEE is a project funded by the European Union under contract IST-2003-508833. Outline: Introduction; Grid Data Management Services; File catalogues; Data Management commands.


Presentation Transcript


  1. Grid Data Management
     Vered Kunik - Israeli Grid NA3 Team
     Israeli Grid Workshop, Tel Aviv, Israel, Feb 2006
     EGEE is a project funded by the European Union under contract IST-2003-508833

  2. Outline
     • Introduction
     • Grid Data Management Services
     • File catalogues
     • Data Management commands
     • Hands on
     IV Workshop INFN Grid, Bari, 25-27.10.2004 - 2

  3. Introduction
     • The Input / Output Sandbox is used for transferring relatively small files (< 20 MB)
     • “Large” files are stored on permanent resources called Storage Elements (SEs)
     • SEs are present at almost every site, together with the computing resources
     • Users and applications produce and require data
     • Data may be stored in Grid files
     Users and applications need to handle files on the Grid
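The sandbox rule of thumb above can be sketched as a small size check. This is only an illustration: the 20 MB threshold comes from the slide, while the function name, the return labels, and the reading of 1 MB as 10**6 bytes are assumptions made for the example.

```python
# Threshold from the slide; treating 1 MB as 10**6 bytes is an assumption.
SANDBOX_LIMIT_BYTES = 20 * 10**6


def choose_transfer(size_bytes: int) -> str:
    """Return "sandbox" for small files and "storage-element" for large ones."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < SANDBOX_LIMIT_BYTES:
        return "sandbox"
    return "storage-element"


if __name__ == "__main__":
    # A few sample sizes on either side of the threshold.
    for size in (4_096, 19_999_999, 20_000_000, 5 * 10**9):
        print(size, "->", choose_transfer(size))
```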

  4. Grid Data Management Services
     Grid Data Management Services should enable users to:
     • Move files in and out of the Grid
     • Replicate files on different SEs
     • Locate files on various SEs
     Data Management means movement and replication of files across/on Grid elements

  5. Grid Data Management Services – cont’d
     • With high-level data management tools, the transport layer details (protocols), the storage location, and the internal structure of the SEs are transparent to the user
     • Data transfer is done by a number of protocols (gsiftp, rfio, file, etc.)
     • A central file catalogue is used
     The SE is a “black box”

  6. Files & replicas: name conventions
     • Logical File Name (LFN)
       • An alias created by the user to refer to some file
       • An LFN is of the form: lfn:/grid/<MyVO>/<MyDirs>/<MyFile>
       • Example: lfn:/grid/gilda/importantResults/Test1240.dat
     • Globally Unique Identifier (GUID)
       • A file can always be identified by its GUID (based on UUID)
       • A GUID is of the form: guid:<unique_string>
       • All replicas of a file share the same GUID
       • Example: guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
     Both LFNs and GUIDs refer to files (not replicas)
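The two naming forms above can be told apart mechanically. The following is a minimal sketch, not part of any Grid middleware: `parse_name` and `LFN_RE` are hypothetical names, and the regular expression assumes that VO names contain no slash.

```python
import re
import uuid

# Pattern follows the slide's form lfn:/grid/<MyVO>/<MyDirs>/<MyFile>.
# Assumption: the VO name contains no "/" character.
LFN_RE = re.compile(r"^lfn:/grid/(?P<vo>[^/]+)/(?P<path>.+)$")


def parse_name(name: str) -> dict:
    """Classify a string as an LFN or a GUID and split out its parts."""
    match = LFN_RE.match(name)
    if match:
        return {"kind": "lfn", "vo": match.group("vo"), "path": match.group("path")}
    if name.startswith("guid:"):
        # uuid.UUID raises ValueError if the string is not a well-formed UUID.
        return {"kind": "guid", "uuid": str(uuid.UUID(name[len("guid:"):]))}
    raise ValueError(f"not an LFN or GUID: {name!r}")


if __name__ == "__main__":
    print(parse_name("lfn:/grid/gilda/importantResults/Test1240.dat"))
    print(parse_name("guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
```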

  7. Files & replicas – cont’d
     • Storage URL (SURL)
       • Also known as Physical File Name (PFN) or Storage File Name (SFN)
       • Used by the RMS to find where the replica is physically stored
       • A SURL is of the form: sfn://<SE_hostname>/<VO_path>/<file_name>
       • Example: sfn://tbed1.cern.ch/flatfiles/SE00/gilda/project1/testSUTL.dat
     • Transport URL (TURL)
       • Temporary locator of a physical replica, including the access protocol understood by an SE
       • A TURL is of the form: <protocol>://<SE_hostname>/<VO_path>/<filename>
       • Example: gsiftp://tbed1.cern.ch/gilda/project1/testTURL.dat
     SURLs and TURLs provide information about the physical location of the replica
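Since SURLs and TURLs share the shape scheme://host/path, a standard URL parser can separate the pieces. The sketch below uses only Python's `urllib.parse`. `split_locator` is a hypothetical helper, and treating `sfn` and `srm` as the SURL schemes (with anything else counted as a TURL protocol) is an assumption drawn from the examples in these slides.

```python
from urllib.parse import urlsplit

# Assumption: "sfn" and "srm" mark SURLs; any other scheme is a TURL protocol.
SURL_SCHEMES = {"sfn", "srm"}


def split_locator(url: str) -> dict:
    """Split a SURL or TURL into scheme, SE hostname, and path."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"not a SURL/TURL with a hostname: {url!r}")
    kind = "surl" if parts.scheme in SURL_SCHEMES else "turl"
    return {
        "kind": kind,
        "scheme": parts.scheme,
        "host": parts.hostname,
        "path": parts.path,
    }


if __name__ == "__main__":
    print(split_locator("sfn://tbed1.cern.ch/flatfiles/SE00/gilda/project1/testSUTL.dat"))
    print(split_locator("gsiftp://tbed1.cern.ch/gilda/project1/testTURL.dat"))
```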

  8. File Catalogs
     So…
     • How do I keep track of all my files on the Grid?
     • Even if I remember all the LFNs of my files, what about someone else's files?
     • How does the Grid keep track of the LFN/GUID/SURL associations?
     Well… for that we have a FILE CATALOG

  9. File Catalogs – cont’d
     [Diagram: Logical File Names 1 to n all map to a single GUID, which in turn maps to Physical File SURLs 1 to n]

  10. File Catalogs – cont’d
      • The LFN acts as the main key in the database. It has:
        • Symbolic links to it (additional LFNs)
        • A unique identifier (GUID)
        • System metadata
        • Information on replicas
      [Diagram: an example catalogue entry with LFN /grid/dteam/dir1/dir2/file1.root, GUID Xxxxxx-xxxx-xxx-xxx-, symlink /grid/dteam/mydir/mylink, system metadata (“size” => 10234, “cksum_type” => “MD5”, “cksum” => “yy-yy-yy”), user-defined metadata, and replicas such as srm://host.example.com/foo/bar on host.example.com]
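The catalogue entry described above (LFN as key, with symlinks, a GUID, system metadata, and replica information) can be modelled as a small in-memory structure. This is a toy sketch for illustration only. `CatalogEntry` and `ToyCatalog` are hypothetical names and do not reflect the schema or API of any real file catalogue.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One logical file: primary LFN, GUID, aliases, metadata, replicas."""
    lfn: str
    guid: str
    symlinks: list = field(default_factory=list)         # additional LFNs
    system_metadata: dict = field(default_factory=dict)  # e.g. size, checksum
    replicas: list = field(default_factory=list)         # SURLs


class ToyCatalog:
    """In-memory stand-in for a file catalogue (illustrative only)."""

    def __init__(self):
        self._by_lfn = {}

    def register(self, entry: CatalogEntry) -> None:
        self._by_lfn[entry.lfn] = entry

    def add_symlink(self, lfn: str, link: str) -> None:
        entry = self._by_lfn[lfn]
        entry.symlinks.append(link)
        self._by_lfn[link] = entry  # the alias resolves to the same entry

    def replicas(self, lfn: str) -> list:
        return list(self._by_lfn[lfn].replicas)


if __name__ == "__main__":
    catalog = ToyCatalog()
    catalog.register(CatalogEntry(
        lfn="/grid/dteam/dir1/dir2/file1.root",
        guid="guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
        system_metadata={"size": 10234, "cksum_type": "MD5"},
        replicas=["srm://host.example.com/foo/bar"],
    ))
    catalog.add_symlink("/grid/dteam/dir1/dir2/file1.root", "/grid/dteam/mydir/mylink")
    print(catalog.replicas("/grid/dteam/mydir/mylink"))
```

Because the symlink points at the same entry object, looking up replicas through the alias or through the primary LFN gives the same answer, which mirrors the one-GUID-many-names picture on the previous slide.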

  11. Data Management JDL attributes
      • InputData: the LFNs / GUIDs needed by the job as input to the process
      • DataAccessProtocol: the list of protocols that the application is able to “speak” for accessing the files listed in InputData
      • OutputSE: the location of an SE where the output data will be stored
      * These attributes are optional (they will be demonstrated during the hands-on session)
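To make the three attributes concrete, the sketch below renders them as JDL text. The attribute names come from the slide. The list-in-braces and trailing-semicolon layout follows the usual JDL style as recalled from the LCG user documentation, and should be checked against the guide for the middleware version in use. The helper names and the host `se.example.org` are hypothetical.

```python
def jdl_list(values):
    """Render a Python list as a JDL list literal, e.g. {"a", "b"}."""
    return "{" + ", ".join(f'"{v}"' for v in values) + "}"


def data_attributes(input_data, protocols, output_se):
    """Return the three data management attributes as JDL lines."""
    lines = [
        f"InputData = {jdl_list(input_data)};",
        f"DataAccessProtocol = {jdl_list(protocols)};",
        f'OutputSE = "{output_se}";',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(data_attributes(
        input_data=["lfn:/grid/gilda/importantResults/Test1240.dat"],
        protocols=["gsiftp", "rfio"],
        output_se="se.example.org",  # hypothetical SE hostname
    ))
```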

  12. Data Management commands
      • lcg-cp: copies a Grid file to a local destination
      • lcg-cr: copies a file to an SE and registers the file in the LRC
      • lcg-del: deletes one file (either one replica or all replicas)
      • lcg-lg: gets the GUID for a given LFN or SURL

  13. Data Management commands – cont’d
      • lcg-rep: copies a file from SE to SE and registers it in the LRC
      • lcg-aa: adds an alias in the RMC for a given GUID
      • lcg-la: lists the aliases for a given LFN, GUID or SURL
      • lcg-gt: gets the TURL for a given SURL and transfer protocol

  14. Data Management commands – cont’d
      • lcg-lr: lists the replicas for a given LFN, GUID or SURL
      • lcg-ra: removes an alias in the RMC for a given GUID
      • lcg-rf: registers an SE file in the LRC (optionally in the RMC)
      • lcg-uf: un-registers a file residing on an SE from the LRC
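A typical round trip with the lcg-* commands (upload and register, list replicas, copy back) might be scripted as below. This is a sketch under assumptions: the `--vo`, `-d` and `-l` options are recalled from the LCG-2 user documentation and should be checked against `man lcg-cr` on the installed middleware. The wrapper function names and the host `se.example.org` are hypothetical. The demo only prints the command lines; `run` is provided for use on a machine where the tools are actually installed.

```python
import shlex
import subprocess


def lcg_cr(vo, local_path, lfn, dest_se):
    """argv for copying a local file to an SE and registering it under an LFN."""
    return ["lcg-cr", "--vo", vo, "-d", dest_se, "-l", lfn, f"file:{local_path}"]


def lcg_lr(vo, name):
    """argv for listing the replicas of an LFN, GUID or SURL."""
    return ["lcg-lr", "--vo", vo, name]


def lcg_cp(vo, name, local_path):
    """argv for copying a Grid file to a local destination."""
    return ["lcg-cp", "--vo", vo, name, f"file:{local_path}"]


def run(argv):
    """Run one command and return its stdout; raises if the command fails."""
    done = subprocess.run(argv, check=True, capture_output=True, text=True)
    return done.stdout


if __name__ == "__main__":
    lfn = "lfn:/grid/gilda/importantResults/Test1240.dat"
    for argv in (
        lcg_cr("gilda", "/tmp/Test1240.dat", lfn, "se.example.org"),
        lcg_lr("gilda", lfn),
        lcg_cp("gilda", lfn, "/tmp/Test1240.copy.dat"),
    ):
        print(shlex.join(argv))  # print only; nothing is executed here
```

Building the argument list separately from running it keeps the commands easy to inspect and avoids shell quoting problems.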

  15. Data Management commands – cont’d
      • lfc-ls: lists file/directory entries in a directory
      • lfc-mkdir: creates a directory
      • lfc-rename: renames a file/directory
      • lfc-rm: removes a file/directory
      • lfc-chmod: changes the access mode of a file/directory
      • lfc-chown: changes the owner and group of a file/directory
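The lfc-* commands operate on catalogue paths rather than on `lfn:` names. A minimal sketch of preparing such calls follows. It assumes that the catalogue server is selected through the `LFC_HOST` environment variable and that paths are given under `/grid/<VO>/` without the `lfn:` prefix, both as recalled from the LCG user documentation. The helper names and the host `lfc.example.org` are hypothetical, and nothing is executed.

```python
import os


def lfc_env(lfc_host, base_env=None):
    """Copy an environment and point it at a catalogue server via LFC_HOST."""
    env = dict(os.environ if base_env is None else base_env)
    env["LFC_HOST"] = lfc_host
    return env


def lfc_path(vo, *parts):
    """Build a catalogue path of the form /grid/<VO>/<dirs...>."""
    return "/".join(["/grid", vo, *parts])


def lfc_mkdir(vo, *parts):
    """argv for creating a directory in the catalogue namespace."""
    return ["lfc-mkdir", lfc_path(vo, *parts)]


def lfc_ls(vo, *parts):
    """argv for listing a directory in the catalogue namespace."""
    return ["lfc-ls", lfc_path(vo, *parts)]


if __name__ == "__main__":
    env = lfc_env("lfc.example.org")  # hypothetical catalogue host
    print("LFC_HOST =", env["LFC_HOST"])
    print(lfc_mkdir("gilda", "importantResults"))
    print(lfc_ls("gilda", "importantResults"))
```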

  16. Data Management tutorial
