Storage Alessandra Forti Group seminar 16th May 2006
Introduction • Different applications, different data, different environments, different solutions. • ROOT, PAW ≠ Athena, BetaMiniApp or DZero applications • AOD ≠ user ntuples • Tier2 ≠ department • Home directories ≠ user space on data servers • We don’t need to stick to one solution for everything • So the fundamental questions to answer are: • What data will we have on our storage? • How much space do we need? • What do we want from our storage, i.e. redundancy, performance? • Can we obtain this in different ways? • How can we access this space?
Solutions • Accessibility • SRM • GFAL, lcg-utils • AFS/NFS • Classic storage • RAID (Redundant Array of Inexpensive Disks) on local machines • Grid Storage • dCache • xrootd • /grid (HTTP-based file system)
Accessibility • SRM: the grid middleware component whose function is to provide dynamic space allocation and file management on shared distributed storage systems • Manage space • Negotiate and assign space to users and manage the lifetime of spaces • Manage files on behalf of the user • Pin files in storage until they are released • Manage the lifetime of files • Manage file sharing • Policies on what should reside in storage and what to evict • Bring files in from remote locations • Manage multi-file requests • Queue file requests, pre-stage
Accessibility • GFAL is a library that can be linked to applications to access data on a grid system. • It supports the SRM APIs and the majority of grid protocols. • lcg-utils can also talk to SRM to copy, replicate and list data. • lcg-utils are the standard way to copy data onto a grid system and register the copy in the file catalogs (see the sketch below).
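To make the "linked to the applications" point concrete, here is a minimal sketch of reading a grid file through GFAL's POSIX-like C calls. It assumes the gfal_open/gfal_read/gfal_close interface of the GFAL 1.x gfal_api.h header; the LFN is a made-up placeholder, not a real catalog entry.

```cpp
// Minimal sketch: reading a grid file through GFAL's POSIX-like API.
// Assumes the GFAL 1.x calls gfal_open/gfal_read/gfal_close (gfal_api.h);
// the LFN below is a made-up placeholder.
extern "C" {
#include <gfal_api.h>
}
#include <fcntl.h>
#include <iostream>

int main() {
    char buf[4096];
    // Open by logical file name; GFAL resolves the replica and the protocol.
    int fd = gfal_open("lfn:/grid/atlas/user/example.root", O_RDONLY, 0);
    if (fd < 0) {
        std::cerr << "gfal_open failed\n";
        return 1;
    }
    int n = gfal_read(fd, buf, sizeof(buf));  // read the first 4 kB
    std::cout << "read " << n << " bytes\n";
    gfal_close(fd);
    return 0;
}
```

On the command line, the corresponding copy-and-register step is lcg-cr, with lcg-rep to replicate a file to another storage element and lcg-lr to list the replicas of a logical file.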
Accessibility • AFS/NFS (briefly) are shared file systems that help share small amounts of data. • AFS on a WAN would be really good for software distribution, and I think ATLAS is supporting it. • NFS cannot be used outside the local site and it doesn’t scale very well with a large number (a few hundred) of clients writing at the same time. Reading is fine.
Classic storage • Classic storage consists of one or more data servers, normally with RAIDed disks, accessible by local machines, normally via NFS. • Sometimes accessible (mostly at bigger labs) by remote machines via transfer protocols like scp, ftp or others, but not by applications for direct data reading. • There are no file catalogs attached. • Files are not replicated anywhere else • Hence the need for local redundancy • The file name space is local and normally provided by NFS
RAID • There are different RAID levels depending on the purpose. • Most used: RAID 0, RAID 1, RAID 5 • RAID 0: clusters two or more disks; data are written in blocks (striped) across the disks; there is no redundancy. • Enhanced read/write performance, but no reliability: if one disk dies all data are lost • Good for temporary data, a web cache for example. • RAID 1: mirrors two or more disks • Exponentially enhanced reliability • Linearly enhanced read performance (data are striped for reading but not for writing) • Partitions can be mirrored instead of whole disks • Good for servers: home directories, web servers, Computing Element, dCache head node, software servers
RAID • RAID 2, 3, 4: data are striped across the disks at bit, byte and block level respectively • They have a dedicated parity disk for reliability • Parity tracks the data using single bits or blocks of bits (an XOR across the stripe; see the worked sketch below). Parity alone is not enough for error recovery: the failed disk must be identified before reconstruction. • They are not very popular: the dedicated parity disk is a write bottleneck, and if it dies the array loses its redundancy. • They require a minimum of 3 disks • RAID 5: like RAID 4 (block-level striping), but the parity is distributed across the disks • Enhanced reliability: parity and data blocks are both distributed. • If one disk dies it can be rebuilt; if two die the whole array is lost. • In theory an unlimited number of disks; in practice it is better to limit them. • Poorer write performance, due to the way parity must be kept consistent with each write. • RAID 5 is what is normally used on data servers, where reads are more frequent than writes.
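Since parity is the whole trick behind RAID 4/5, a tiny worked example: the parity block is the XOR of the data blocks in a stripe, so when one disk is known to have failed its block can be recomputed by XOR-ing the parity with the surviving blocks.

```cpp
// Worked sketch of RAID-style parity: one-byte "blocks" for brevity.
#include <cstdint>
#include <iostream>

int main() {
    // Three data blocks striped across three disks.
    uint8_t d0 = 0xA5, d1 = 0x3C, d2 = 0x0F;

    // The parity block (on a fourth disk) is the XOR of the data blocks.
    uint8_t parity = d0 ^ d1 ^ d2;

    // The disk holding d1 dies: rebuild it from the parity and the
    // surviving blocks, since XOR-ing the known blocks back out leaves
    // exactly the missing one.
    uint8_t rebuilt = parity ^ d0 ^ d2;

    std::cout << std::boolalpha << (rebuilt == d1) << "\n";  // prints true
    return 0;
}
```

This is also why a second simultaneous failure loses the whole array: with two blocks missing, one XOR equation cannot recover two unknowns.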
Grid Storage • Grid storage consists of any device that has space: data servers, worker nodes, tapes… • It is accessible to local CPUs via a number of different protocols, depending on what storage management software the site administrator has installed. • It is accessible from anywhere in the world, to copy data in and out using grid utilities. • It has all the supported VO file catalogs attached • Files can easily be replicated at other sites • No real need for local redundancy • The file name space has to span multiple machines. • In Manchester we have 400 TB of disk distributed across the worker nodes. • dCache, xrootd and other solutions are ways to exploit it.
dcache • dCache has been developed by Fermilab and DESY to deal with their tape storage systems and the staging of data on disk, but it has evolved into a more general storage management tool. • Advantages • It is SRM-integrated, so it has most of the space management features. • Combines the disks of several hundred nodes under a single file name space. • Load balancing. • Data are only removed if space is running short (no threshold) • Takes care that at least 'n' but not more than 'm' copies of a single dataset exist within one dCache instance. • Takes care that this rule still holds if nodes go down (scheduled or even unexpected)
dcache(3) • Disadvantages • It is not POSIX compliant: files cannot be accessed as on a normal unix file system (see the dcap sketch below) • Supported protocols are reimplemented within dCache • It is written in Java • Sources are not available • The file name space is implemented with a database in the middle • Support is, for various reasons, inadequate • Unfortunately, up to now it was the only solution available for a system like Manchester’s • Other viable solutions could be xrootd and StoRM
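As an illustration of the "not POSIX compliant" point: applications usually reach dCache data directly through its own dcap protocol, via the libdcap client library, whose calls mirror POSIX open/read/close. A minimal sketch, assuming the dc_open/dc_read/dc_close interface of dcap.h; the door host and pnfs path are made-up placeholders.

```cpp
// Minimal sketch: direct read from dCache via the dcap client library.
// Assumes libdcap's POSIX-like dc_open/dc_read/dc_close (dcap.h);
// the door URL below is a made-up placeholder.
extern "C" {
#include <dcap.h>
}
#include <fcntl.h>
#include <iostream>

int main() {
    char buf[4096];
    int fd = dc_open("dcap://door.example.org/pnfs/example.org/data/file.root",
                     O_RDONLY);
    if (fd < 0) {
        std::cerr << "dc_open failed\n";
        return 1;
    }
    int n = dc_read(fd, buf, sizeof(buf));  // read the first 4 kB
    std::cout << "read " << n << " bytes\n";
    dc_close(fd);
    return 0;
}
```

libdcap also offers a preload mechanism so that unmodified applications can keep using ordinary open/read, which mitigates, but does not remove, the POSIX limitation.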
Xrootd(1) • XROOTD: a file server which provides high-performance file-based access. It was developed by BaBar/SLAC as an extension of rootd, and is now distributed as part of standard ROOT. • It is now being adopted by two LHC experiments (ALICE and CMS) • Advantages: • Data are located by the xrootd process itself: there is no need for a database to catalog the files on the system • It supports load balancing • xrootd determines which server is best for a client’s request to open a file • It is fault tolerant • Missing data can be restored from other disks • Authorization plugin • Resolves "trusted/untrusted" users for write access • Disadvantages • It is not integrated with SRM, so all the space management isn’t there • lcg-utils and GFAL cannot talk to xrootd (yet) • A minimal client-side sketch follows below.
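Since xrootd ships with ROOT, the client side is just a URL. A minimal sketch of opening a remote file from a ROOT-linked program, assuming a ROOT build with the xrootd client (TXNetFile) enabled; server and path are made-up placeholders.

```cpp
// Minimal sketch: reading a file served by xrootd from ROOT.
// root:// URLs are routed to ROOT's xrootd client (TXNetFile);
// server and path below are made-up placeholders.
#include <TFile.h>
#include <iostream>

int main() {
    TFile *f = TFile::Open("root://xrootd.example.org//data/user/ntuple.root");
    if (!f || f->IsZombie()) {
        std::cerr << "open failed\n";
        return 1;
    }
    std::cout << "file size: " << f->GetSize() << " bytes\n";
    f->Close();
    return 0;
}
```

The same URL works interactively from the ROOT prompt, which is what makes xrootd convenient for ntuple-level analysis.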
Discussion