
Federating Grid and Cloud Storage in EUDAT

International Symposium on Grids and Clouds 2014, 23-28 March 2014. Shaun de Witt, STFC; Maciej Brzeźniak, PSNC; Martin Hellmich, CERN.



Presentation Transcript


  1. Federating Grid and Cloud Storage in EUDAT International Symposium on Grids and Clouds 2014, 23-28 March 2014 Shaun de Witt, STFC; Maciej Brzeźniak, PSNC; Martin Hellmich, CERN

  2. Agenda Introduction … … … Test results Future work 3rd EUDAT Technical meeting in Bologna 7th February 2013

  3. Introduction
  • We present and analyze the results of Grid and Cloud Storage integration
  • In EUDAT we used:
  • iRODS as the Grid Storage federation mechanism
  • OpenStack Swift as a scalable object storage solution
  • Scope:
  • Proof of concept
  • Pilot OpenStack Swift installation at PSNC
  • Production iRODS servers at PSNC (Poznan) and EPCC (Edinburgh)

  4. EUDAT project introduction
  • Partners: data centres & communities
  • Pan-European data storage & management infrastructure
  • Long-term data preservation:
  • Storage safety, availability – replication, integrity control
  • Data accessibility – visibility, possibility to refer over years

  5. EUDAT challenges [Figure: various storage systems federated under iRODS]
  • Federate heterogeneous data management systems:
  • dCache, AFS, DMF, GPFS, SAM-FS
  • File systems, HSMs, file servers
  • Object Storage systems(!)
  while ensuring:
  • Performance, scalability
  • Data safety, durability, HA, fail-over
  • Unique access, federation transparency
  • Flexibility (rule engine)
  • Implement the core services:
  • safe and long-term storage: B2SAFE
  • efficient analysis: B2STAGE
  • easy deposit & sharing: B2SHARE
  • data & meta-data exploration: B2FIND

  6. EUDAT CDI domain of registered data:

  7. Grid – Cloud storage integration
  • Need to integrate Grids and Cloud/Object storage
  • Grids get another cost-effective, scalable backend
  • Many institutions and initiatives are testing & using object storage in production
  • Most Cloud Storage uses the Object Storage concept
  • Object Storage solutions have limited support for federation, which is well addressed in Grids
  • In EUDAT we integrated:
  • an object storage system – OpenStack Swift
  • iRODS servers and federations

  8. Context: the Object Storage concept
  • The concept enables building low-cost, scalable, efficient storage:
  • Within a data centre
  • DR / distributed configurations
  • Reliability thanks to redundancy of components:
  • Many cost-efficient storage servers with disk drives (12-60 HDD/SSD)
  • Typical (cheap) network: 1/10 Gbit Ethernet
  • Limitations of traditional approaches:
  • High investment cost and maintenance
  • Vendor lock-in, closed architecture, limited scalability
  • Slower adoption of new technologies than in the commodity market

  9. Context: Object Storage importance
  • Many institutions and initiatives (DCs, NRENs, companies, R&D projects) are testing & using object storage in production, including:
  • Open source / private cloud:
  • OpenStack Swift
  • Ceph / RADOS
  • Sheepdog, Scality…
  • Commercial:
  • Amazon S3, RackSpace Cloud Files…
  • MS Azure Object Storage…
  • Most promising open source: OpenStack Swift & Ceph

  10. Object Storage: Architectures [Figure: two architecture diagrams. Ceph: user apps and hosts/VMs access RADOS through LibRados, RBD, RadosGW and CephFS; the cluster consists of MONs (MON.1…MON.n), OSDs (OSD.1…OSD.n) and MDSs (MDS.1…MDS.n). OpenStack Swift: clients upload/download through a load balancer to proxy nodes, which store data on storage nodes.]

  11. Object Storage concepts: OpenStack Swift Ring and Ceph's map
  • No meta-data lookups, no meta-data DB – data placement/location is computed!
  • Swift Ring: represents the space of all possible computed hash values, divided into equivalent parts (partitions); partitions are spread across storage nodes
  • Ceph CRUSH map: a list of storage devices, a failure domain hierarchy (e.g. device, host, rack, row, room) and rules for traversing the hierarchy when storing data
  Source: The Riak Project; http://www.sebastien-han.fr/blog/2012/12/07/ceph-2-speed-storage-with-crush/
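The Ring idea above can be sketched in a few lines of Python: the hash space is split into 2^PART_POWER partitions, an object's partition is computed purely from its name, and a static partition-to-node assignment replaces any metadata lookup. This is an illustrative sketch, not Swift's actual ring-builder code; the node names and the simplified replica placement are made up for the example.

```python
import hashlib

PART_POWER = 8                    # 2**8 = 256 partitions (real rings use far more)
PART_SHIFT = 32 - PART_POWER      # keep the top PART_POWER bits of a 32-bit hash

NODES = ["storage-1", "storage-2", "storage-3", "storage-4", "storage-5"]
REPLICAS = 3

def partition(account, container, obj):
    """Compute the partition for an object purely from its name (no DB lookup)."""
    key = f"/{account}/{container}/{obj}".encode()
    h = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return h >> PART_SHIFT

def placement(part):
    """Map a partition to REPLICAS distinct nodes (simplified static assignment)."""
    return [NODES[(part + i) % len(NODES)] for i in range(REPLICAS)]

part = partition("AUTH_vph", "scans", "patient-42.nii")
print(part, placement(part))
```

Because the placement is a deterministic function of the object name, every proxy computes the same location independently, which is what makes the architecture scale without a central metadata database.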

  12. Object Storage concepts: no DB lookups! (content identical to slide 11)

  13. Grid – Cloud storage integration
  • Most cloud/object storage solutions expose:
  • an S3 interface
  • other native interfaces: OSS: Swift; Ceph: RADOS
  • S3 (by Amazon) is the de facto standard in cloud storage:
  • Many petabytes, global systems
  • Vendors use it (e.g. Dropbox) or provide it
  • Large take-up
  • Similar concepts:
  • CDMI: Cloud Data Management Interface – SNIA standard, not many implementations: http://www.snia.org/cdmi
  • Nimbus.IO: https://nimbus.io
  • MS Azure Blob Storage: http://www.windowsazure.com/en-us/manage/services/storage/
  • RackSpace Cloud Files: www.rackspace.com/cloud/files/
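The S3 interface listed above is a plain REST API: an object upload is an HTTP PUT to `/<bucket>/<object>` on the endpoint, with authentication carried in headers. A minimal stdlib-only sketch of the request shape follows; `swift.example.org` is a hypothetical Swift proxy running the S3 middleware, the request is only constructed (not sent), and a real client would add an AWS signature in the Authorization header:

```python
import urllib.request

ENDPOINT = "https://swift.example.org"   # hypothetical Swift proxy with S3 support
bucket, key = "eudat-test", "data/file.dat"
body = b"hello object storage"

# S3 path-style addressing: PUT /<bucket>/<object> creates or overwrites an object.
req = urllib.request.Request(
    url=f"{ENDPOINT}/{bucket}/{key}",
    data=body,
    method="PUT",
    headers={
        "Content-Type": "application/octet-stream",
        # A real client adds an Authorization header with an AWS signature here.
    },
)
print(req.get_method(), req.full_url)
```

Because the same request shape works against Amazon S3, a Swift proxy with the S3 middleware, or Ceph's RadosGW, a single S3 driver can federate all of them, which is exactly what the iRODS-S3 driver exploits.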

  14. S3 and S3-like interfaces in commercial systems
  • S3 re-sellers:
  • Lots of services
  • Including Dropbox
  • Services similar to the S3 concept:
  • Nimbus.IO: https://nimbus.io
  • MS Azure Blob Storage: http://www.windowsazure.com/en-us/manage/services/storage/
  • RackSpace Cloud Files: www.rackspace.com/cloud/files/
  • S3 implementations 'in the hardware':
  • Xyratex
  • Amplidata

  15. Why build PRIVATE S3-like storage?
  • Features / benefits:
  • Reliable storage on top of commodity hardware
  • The user is still able to control the data
  • Easy scalability, possible to grow the system
  • Adding resources and redistributing data is possible in a non-disruptive way
  • Open source software solutions and standards are available:
  • e.g. OpenStack Swift: OpenStack native API and S3 API
  • Other S3-enabled storage: e.g. RADOS
  • CDMI: Cloud Data Management Interface

  16. Why federate iRODS with S3/OpenStack?
  • We were asked to consider cloud storage (EUDAT 1st year review report)
  • Some communities have data stored in OpenStack:
  • VPH is building a reliable storage cloud on top of OpenStack Swift within the pMedicine project (together with PSNC)
  • These data should be available to EUDAT:
  • Data staging: Cloud -> EUDAT -> PRACE HPC and back
  • Data replication: Cloud -> EUDAT -> other back-end storage
  • We could apply the rule engine to data in the cloud, assign PIDs

  17. VPH case analysis [Figure: data flow diagram. S3/OSS clients ingest data through the OSS and S3 APIs into a storage system behind an iRODS server with an S3 driver; EUDAT's iRODS federation (iRODS servers with S3 and other storage drivers) performs replication and data staging to an HPC system; PIDs are assigned via registration with EUDAT's PID Service; iRODS clients handle data ingestion and access.]

  18. Our 7.2 project
  • Purpose:
  • To examine the existing iRODS-S3 driver
  • (possibly) to improve it / provide another one
  • Steps/status:
  • 1st stage:
  • Play with what is there – done for OpenStack/S3 + iRODS
  • Examine functionality
  • Evaluate scalability – found some issues already
  • Follow-up:
  • Try to improve the existing S3 driver (functionality, performance)
  • Implement a native OpenStack driver?
  • Get in touch with iRODS developers

  19. iRODS-OpenStack tests [Figure: iRODS server(s) connected to OpenStack Swift via the S3/OpenStack APIs and to Amazon S3 via the S3 API]
  TEST SETUP:
  • iRODS server:
  • Cloud as a compound resource
  • Disk cache in front of it
  • OpenStack Swift:
  • 3 proxies, 1 with S3
  • 5 storage nodes
  • Extensive functionality and performance tests
  • Amazon S3:
  • Only limited functionality tests

  20. iRODS-OpenStack test
  TEST RESULTS:
  • S3 vs native OSS overhead:
  • Upload: ~0%
  • Download: ~8%
  • iRODS overhead:
  • Upload: ~19%
  • Download:
  • From compound S3: ~0%
  • Cached: SPEEDUP: 230% (cache resources faster than S3)

  21. iRODS-OpenStack test

  22. Conclusions and future plans
  • Conclusions:
  • Performance-wise, iRODS does not bring much overhead for files <2 GB
  • Problems arise for files >2 GB – no support for multipart upload in the iRODS-S3 driver, which prevents iRODS from storing files >2 GB in clouds
  • Some functional limits (e.g. the imv problem)
  • Using iRODS to federate S3 clouds at large scale would require improving the existing driver or developing a new one
  • Future plans:
  • Test the integration with VPH's cloud using the existing driver
  • Ask SAF for support for the driver development
  • Get in touch with iRODS developers to assure the sustainability of our work
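The multipart-upload gap noted above is what caps the driver at files under 2 GB: S3 stores large objects by uploading fixed-size parts, each with its own MD5 checksum, and the resulting composite ETag is the MD5 of the concatenated part digests followed by `-<part count>`. A stdlib-only sketch of that chunking and ETag computation (part size shrunk for illustration; real S3 requires parts of at least 5 MB except the last):

```python
import hashlib

def split_parts(data, part_size):
    """Split a payload into fixed-size parts, as a multipart upload would."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]

def multipart_etag(parts):
    """S3-style composite ETag: MD5 of the concatenated part MD5 digests, plus '-<n>'."""
    digests = b"".join(hashlib.md5(p).digest() for p in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"

data = b"x" * 1000            # stand-in for a large file
parts = split_parts(data, 256)
print(len(parts), multipart_etag(parts))   # 4 parts of 256/256/256/232 bytes
```

A driver with this support could stream a multi-gigabyte iRODS file to the cloud part by part instead of as one oversized PUT, which is the improvement the conclusions call for.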

  23. Object storage on top of iRODS?
  • Problems:
  • Data organisation mapping: filesystem vs objects; big files vs fragments
  • Identity mapping? S3 keys/accounts vs X.509?
  • Out of scope of EUDAT? A lot of work needed
  [Figure: S3/OSS and iRODS clients accessing EUDAT's iRODS federation through the S3 and iRODS APIs; iRODS servers with S3 and other storage drivers in front of back-end storage systems.]
