
An iRODS-based Distributed Data Management System for CyberSKA

Cameron Kiddle, Arne Grimstrup, Russ Taylor – University of Calgary; Venkat Mahadevan, Erik Rosolowsky – University of British Columbia Okanagan; Olivier Eymere – IBM Canada.


Presentation Transcript


  1. An iRODS-based Distributed Data Management System for CyberSKA • Cameron Kiddle, Arne Grimstrup, Russ Taylor – University of Calgary • Venkat Mahadevan, Erik Rosolowsky – University of British Columbia Okanagan • Olivier Eymere – IBM Canada

  2. CyberSKA: Creating the cyberinfrastructure to support what will be the largest radio telescope ever built: the Square Kilometre Array (SKA) • Artist's impression of the core of the SKA • Image credit: SPDO / Swinburne Astronomy Productions

  3. CyberSKA • Initiative to develop a scalable and distributed cyberinfrastructure platform to meet the evolving science needs of current and future radio telescopes such as the SKA • Led by the University of Calgary, currently with several partner institutions from North America • Canadian funding for CyberSKA provided by CANARIE, as part of their Network Enabled Platforms (NEP) program, and Cybera • Starting by establishing cyberinfrastructure to support current large-scale astrophysical data needs generated by GALFACTS, PALFA and other high data volume SKA Pathfinder projects

  4. High Level Architecture

  5. Data Management System • Based on iRODS (Integrated Rule-Oriented Data System) • Abstracts data location from user • Supports data replication and cross-site backups • Efficient transfer of data using multiple TCP streams • Rule Engine to automate various tasks
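The "multiple TCP streams" bullet can be illustrated with a minimal sketch of how a file might be divided into byte ranges, one per stream. This is not iRODS's actual partitioning logic, which the slides do not describe; the function name `split_into_ranges` and the even-split policy are assumptions made for illustration only.

```python
def split_into_ranges(file_size: int, num_streams: int) -> list[tuple[int, int]]:
    """Split [0, file_size) into contiguous (start, end) byte ranges, end exclusive.

    Hypothetical helper: each range would be sent over its own TCP stream.
    """
    if file_size < 0 or num_streams < 1:
        raise ValueError("file_size must be >= 0 and num_streams >= 1")

    # Never open more streams than there are bytes; always keep at least one.
    streams = min(num_streams, file_size) or 1

    # Spread the remainder over the first `extra` ranges so sizes differ by at most 1.
    base, extra = divmod(file_size, streams)

    ranges = []
    start = 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


if __name__ == "__main__":
    for start, end in split_into_ranges(10, 4):
        print(f"stream covers bytes [{start}, {end})")
```

Because the ranges are contiguous and non-overlapping, a receiver could write each one at its own offset and reassemble the file regardless of the order in which the streams finish.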

  6. Storage Sites: UBC WestGrid, UofC*, UBCO*, McGill (*currently deployed)

  7. User Interface • User interface provided via the CyberSKA Portal, which is built on the Elgg open source social networking platform • Distributed data management system interfaces directly into the Elgg File Module • User interface no different than if a local Elgg filestore was being used

  8. Authentication/Authorization • User authenticates with portal via login/password • Third party application authenticates with portal via OAuth (Portal exposes a RESTful API) • Access permissions for files maintained by portal (in Elgg database) • Permissions can be set to private, logged in, contacts, public, contact list, group • Portal authenticates with Data Management Service on behalf of user / application via OAuth (Data Management Service exposes a RESTful API) – user/application redirected to Data Management Service
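The six permission levels listed above could be evaluated along the following lines. This is a hedged sketch, not Elgg's access-control code: the `User` and `FileRecord` classes, the `can_read` function, and the reading of "contact list" as a named subset of users are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class User:
    name: str
    contacts: set[str] = field(default_factory=set)  # names of this user's contacts
    groups: set[str] = field(default_factory=set)    # groups this user belongs to


@dataclass
class FileRecord:
    owner: User
    # One of: "private", "logged_in", "contacts", "public", "contact_list", "group"
    access: str
    # For "contact_list": permitted user names; for "group": permitted group names.
    allowed: set[str] = field(default_factory=set)


def can_read(record: FileRecord, requester: Optional[User]) -> bool:
    """Return True if `requester` (None means anonymous) may read the file."""
    if record.access == "public":
        return True
    if requester is None:
        return False  # every remaining level requires a login
    if requester.name == record.owner.name:
        return True   # assumption: owners can always read their own files
    if record.access == "private":
        return False
    if record.access == "logged_in":
        return True
    if record.access == "contacts":
        return requester.name in record.owner.contacts
    if record.access == "contact_list":
        return requester.name in record.allowed
    if record.access == "group":
        return bool(record.allowed & requester.groups)
    raise ValueError(f"unknown access level: {record.access!r}")


if __name__ == "__main__":
    alice = User("alice", contacts={"bob"})
    bob = User("bob", groups={"galfacts"})
    shared = FileRecord(owner=alice, access="contacts")
    print(can_read(shared, bob))   # bob is one of alice's contacts
    print(can_read(shared, None))  # anonymous request
```

Keeping the check in the portal matches the slide's statement that permissions live in the Elgg database; the data management service then only needs to trust requests the portal has already authorized.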

  9. Metadata • Title, description, tags, file permissions, iRODS file handle for files of all types stored in portal (Elgg) database • Extra metadata for certain file types (currently just FITS files) stored in separate metadata database associated with the data management system • Spatially enabled PostgreSQL/PgSphere database • Various metadata from FITS file extracted upon ingestion • Metadata schema based on IVOA Resource Metadata recommendations • Data Query Service enables spatial, temporal and spectral queries
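As an illustration of the kind of spatial query the Data Query Service supports, the sketch below performs a cone search (all records within a given angular radius of a sky position) over in-memory records. In the deployed system this filtering would presumably run inside the spatially enabled PostgreSQL database; the record layout (`name`, `ra`, `dec` in degrees) and both function names are assumptions, and the file names are made up.

```python
import math


def angular_separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees between two sky positions (degrees in)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula: numerically stable for small separations.
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(records: list[dict], ra: float, dec: float, radius_deg: float) -> list[dict]:
    """Return the records whose (ra, dec) lies within radius_deg of the centre."""
    return [
        r for r in records
        if angular_separation_deg(r["ra"], r["dec"], ra, dec) <= radius_deg
    ]


if __name__ == "__main__":
    # Hypothetical metadata rows, as might be extracted from FITS headers on ingestion.
    catalogue = [
        {"name": "a.fits", "ra": 10.0, "dec": 20.0},
        {"name": "b.fits", "ra": 10.5, "dec": 20.0},
        {"name": "c.fits", "ra": 200.0, "dec": -30.0},
    ]
    for row in cone_search(catalogue, ra=10.0, dec=20.0, radius_deg=1.0):
        print(row["name"])
```

The haversine form handles the wrap-around at right ascension 0/360 degrees without special cases, which a naive comparison of coordinate differences would get wrong.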

  10. File Upload / Download • Simple file upload/download: uses the normal Elgg upload/download mechanisms; single file at a time; suitable for smaller uploads/downloads • Advanced upload/download tools for more efficient, reliable transfers of larger files • JUpload for uploads: Java applet; supports HTTP and FTP; allows multi-file upload • CADC (Canadian Astronomy Data Centre) Download Manager for downloads: Java Web Start based; supports HTTP

  11. Status and Next Steps • Status: • Prototype environment currently deployed at two sites (UofC, UBCO) • Next Steps: • Complete testing • Integrate with production portal • Set up storage nodes at other participating sites • Add support for Measurement Sets in metadata database • Add IVOA compliant interface

  12. Contact Information Portal: http://www.cyberska.org/ E-mail: info@cyberska.org
