
Storage




  1. Storage Alessandra Forti Group seminar 16th May 2006

  2. Introduction • Different applications, different data, different environments, different solutions. • ROOT, PAW ≠ Athena, BetaMiniApp or DZero applications • AOD ≠ user ntuples • Tier2 ≠ department • Home directories ≠ user space on data servers • We don’t need to stick to one solution for everything • So the basic questions to answer are: • What data will we have on our storage? • How much space do we need? • What do we want from our storage, i.e. redundancy, performance? • Can we obtain this in different ways? • How can we access this space?

  3. Solutions • Accessibility • SRM • GFAL, lcg-utils • AFS/NFS • Classic storage • RAID (Redundant Array of Inexpensive Disks) on local machines • Grid storage • dCache • xrootd • /grid (HTTP-based file system)

  4. Accessibility • SRM: the grid middleware component whose function is to provide dynamic space allocation and file management on shared distributed storage systems • Manage space • Negotiate and assign space to users and manage the lifetime of spaces • Manage files on behalf of the user • Pin files in storage until they are released • Manage the lifetime of files • Manage file sharing • Policies on what should reside on storage and what to evict • Bring files in from remote locations • Manage multi-file requests • Queue file requests, pre-stage

  5. Accessibility • GFAL is a library that can be linked to applications to access data on a grid system. • It supports the SRM APIs and the majority of grid protocols (see the sketch below). • lcg-utils can also talk to SRM to copy, replicate and list data. • lcg-utils are the way to copy data onto a grid system and register the copy in the file catalogs.
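
To make the POSIX-style access concrete, here is a minimal sketch of reading a grid file through GFAL. It assumes the classic GFAL C API (gfal_open, gfal_read, gfal_close) and header name, and the SURL is made up for illustration; check the LCG middleware documentation for the exact signatures.

```cpp
// Minimal sketch: POSIX-like read of a grid file through GFAL.
// Assumed: the classic GFAL C API (gfal_open/gfal_read/gfal_close)
// and the header name; link against the gfal library shipped with LCG.
#include <fcntl.h>
#include <cstdio>
#include "gfal_api.h"  // assumed header name

int main() {
    // Hypothetical SURL; a real one points at a site's SRM-managed storage.
    const char* surl = "srm://se.example.ac.uk/dpm/example.ac.uk/home/atlas/aod.root";
    int fd = gfal_open(surl, O_RDONLY, 0);
    if (fd < 0) { std::perror("gfal_open"); return 1; }

    char buf[4096];
    int n = gfal_read(fd, buf, sizeof(buf));  // read the first 4 kB
    std::printf("read %d bytes\n", n);
    gfal_close(fd);
    return 0;
}
```

The point of the library is exactly this: the application keeps its familiar open/read/close pattern while GFAL negotiates SRM and the transfer protocol underneath.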

  6. Accessibility • AFS/NFS (briefly) are shared file systems that can help share small amounts of data. • AFS on a WAN would be really good if used for software distribution, and I think ATLAS is supporting it. • NFS cannot be used outside the local site and it doesn’t scale very well with a large number (a few hundred) of clients writing at the same time. Reading is fine.

  7. Classic storage • Classic storage consists of one or more data servers, normally with RAIDed disks, accessible by local machines, normally via NFS. • Sometimes accessible (mostly at bigger labs) by remote machines via transfer protocols like scp or ftp, but not by applications for direct data reading. • There are no file catalogs attached. • Files are not replicated anywhere else, hence the need for local redundancy. • The file name space is local and normally offered by NFS.

  8. RAID • There are different RAID levels depending on the purpose. • Most used: RAID 0, RAID 1, RAID 5 • RAID 0: clusters 2 or more disks; data are written in blocks (striped) across the disks; there is no redundancy. • Enhanced read/write performance but no reliability: if one disk dies all data are lost. • Good for temporary data, a web cache for example. • RAID 1: mirrors two or more disks • Exponentially enhanced reliability: the array is lost only if every mirror fails (see the worked example below). • Linearly enhanced read performance (data striping for reading but not for writing). • Partitions can be mirrored instead of disks. • Good for servers: home dirs, web servers, Computing Element, dCache head node, sw servers.
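
As a back-of-the-envelope check on those reliability claims, the sketch below compares the probability of losing a RAID 0 and a RAID 1 array built from the same disks. The per-disk failure probability p = 0.05 is an invented figure, used only for illustration.

```cpp
// Toy reliability arithmetic for RAID 0 vs RAID 1 with n disks.
// RAID 0 loses data if ANY disk fails; RAID 1 only if ALL mirrors fail.
#include <cmath>
#include <cstdio>

int main() {
    const double p = 0.05;  // assumed per-disk failure probability (illustrative)
    const int n = 2;        // number of disks in the array

    double raid0_loss = 1.0 - std::pow(1.0 - p, n);  // any failure kills RAID 0
    double raid1_loss = std::pow(p, n);              // all mirrors must fail

    std::printf("RAID 0 loss probability: %.4f\n", raid0_loss);  // ~0.0975
    std::printf("RAID 1 loss probability: %.4f\n", raid1_loss);  // 0.0025
}
```

Striping roughly doubles the risk for two disks, while mirroring squares the per-disk probability, which is what "exponentially enhanced reliability" means here.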

  9. RAID • RAID 2, 3, 4: data are striped across the disks at bit, byte and block level respectively. • They have dedicated parity disks for reliability. • Parity is a way of tracking changes using single bits or blocks of bits. Parity alone only detects an error; RAID can reconstruct data because the controller knows which disk has failed (see the parity sketch below). • They are not very popular: the dedicated parity disk is a bottleneck, since it must be involved in every write. • They require a minimum of 3 disks. • RAID 5: like RAID 4 (block-level striping) but the parity is distributed across the disks. • Enhanced reliability: parity and data blocks are distributed. • If one disk dies it can be rebuilt; if two die the whole array is lost. • In theory an unlimited number of disks; in practice it is better to limit them. • Poorer write performance due to the way parity must be kept consistent with each write. • RAID 5 is what is normally used on data servers, where reads are more frequent than writes.
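
To show why a single failed disk can be rebuilt, here is a minimal sketch of RAID-style XOR parity: the parity block is the XOR of the data blocks, so any one missing block equals the XOR of the survivors. The block size and contents are made up for illustration.

```cpp
// RAID-style XOR parity: parity = d0 ^ d1 ^ ... ^ d(n-1), byte by byte.
// Any single lost block is the XOR of the surviving blocks and the parity.
#include <array>
#include <cstdio>

constexpr int kBlock = 8;  // illustrative tiny block size
using Block = std::array<unsigned char, kBlock>;

Block xor_blocks(const Block& a, const Block& b) {
    Block out{};
    for (int i = 0; i < kBlock; ++i)
        out[i] = static_cast<unsigned char>(a[i] ^ b[i]);
    return out;
}

int main() {
    Block d0 = {'s', 't', 'o', 'r', 'a', 'g', 'e', '!'};
    Block d1 = {'r', 'a', 'i', 'd', '4', 'd', 'a', 't'};
    Block parity = xor_blocks(d0, d1);  // written to the parity disk

    // The disk holding d1 dies: rebuild it from d0 and the parity block.
    Block rebuilt = xor_blocks(d0, parity);  // d0 ^ (d0 ^ d1) == d1
    std::printf("rebuilt: %.8s\n", reinterpret_cast<const char*>(rebuilt.data()));
}
```

RAID 5 does exactly this, but scatters the parity blocks across all the disks so that no single disk becomes the write bottleneck.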

  10. Grid Storage • Grid storage consists of any device that has space: data servers, worker nodes, tapes… • It is accessible to local CPUs via a number of different protocols, depending on what storage management software the site administrator has installed. • It is accessible from anywhere in the world to copy data in and out using grid utilities. • It has all the supported VO file catalogs attached. • Files can easily be replicated at other sites • No real need for local redundancy • The file name space has to span multiple machines. • In Manchester we have 400 TB of distributed disks on the worker nodes. • dCache, xrootd and other solutions are ways to exploit them.

  11. dCache • dCache was developed by Fermilab and DESY to deal with their tape storage systems and the staging of data onto disk, but it has evolved into a more general storage management tool. • Advantages • It is SRM integrated, so it has most of the space management features. • Combines the disks of several hundred nodes under a single file name space. • Load balancing. • Data are only removed if space is running short (no threshold). • Takes care that at least 'n' but not more than 'm' copies of a single dataset exist within one dCache instance. • Takes care that this rule still holds if nodes go down (scheduled or even unexpected).

  12. dCache (2) • Disadvantages • It is not POSIX compliant: files cannot be accessed as on a normal unix file system, applications go through dCache's own protocols instead (see the dcap sketch below). • The supported protocols are reimplemented in dCache's own code. • It is written in Java. • The sources are not available. • The file name space is implemented through a database in the middle. • Support is, for various reasons, inadequate. • Unfortunately, up to now it was the only solution available for a system like Manchester's. • Other viable solutions could be xrootd and StoRM.
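
For illustration, here is a hedged sketch of how an application typically reads a dCache file through the dcap protocol with libdcap, which mimics the POSIX calls. The header name, the exact signatures and the door URL are all assumptions; consult the dCache documentation before relying on them.

```cpp
// Hedged sketch: reading a file from dCache via the dcap protocol.
// Assumed: libdcap's POSIX-like calls (dc_open/dc_read/dc_close),
// the header name, and the door URL format.
#include <fcntl.h>
#include <cstdio>
#include "dcap.h"  // assumed libdcap header

int main() {
    // Hypothetical dcap door of a dCache instance.
    const char* url = "dcap://door.example.ac.uk:22125/pnfs/example.ac.uk/data/ntuple.root";
    int fd = dc_open(url, O_RDONLY);
    if (fd < 0) { std::perror("dc_open"); return 1; }

    char buf[4096];
    int n = dc_read(fd, buf, sizeof(buf));  // read the first 4 kB
    std::printf("read %d bytes via dcap\n", n);
    dc_close(fd);
    return 0;
}
```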

  13. Xrootd • XROOTD: a file server which provides high-performance file-based access (see the access sketch below). It was developed by BaBar/SLAC as an extension of rootd and is now distributed as part of standard ROOT. • It is now being adopted by two LHC experiments (ALICE and CMS). • Advantages: • Data are located within the xrootd process: there is no need for a database to catalog the files on the system. • It supports load balancing: xrootd determines which server is best for a client's request to open a file. • It is fault tolerant: missing data can be restored from other disks. • Authorization plugin: resolves "trusted/untrusted" users for write access. • Disadvantages: • It is not integrated with SRM, so none of the space management is there. • lcg-utils and GFAL cannot talk to xrootd (yet).
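
Since xrootd speaks ROOT's native protocol, access from an analysis job is a one-liner. The ROOT macro below opens a file straight from an xrootd server with TFile::Open; the redirector host and file path are made up for illustration.

```cpp
// ROOT macro: open a file served by xrootd and list its contents.
// TFile::Open understands root:// URLs; host and path are illustrative.
#include <cstdio>
#include "TFile.h"

void read_xrootd() {
    TFile* f = TFile::Open("root://xrootd.example.ac.uk//store/user/ntuple.root");
    if (!f || f->IsZombie()) {
        std::printf("could not open remote file\n");
        return;
    }
    f->ls();      // list the objects stored in the file
    f->Close();
}
```

Run it from the ROOT prompt with .x read_xrootd.C; the same call works unchanged on a local file, which is the appeal of the file-based access model.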

  14. Discussion
