Lifecycle Support for Scientific Investigations

Lifecycle Support for Scientific Investigations: Integrating Data, Computing and Workflows with DEEDS. SMARTI National Bridge Inventory Dataset. Ann Christine Catlin, Research Computing, Purdue University. Bridging Big Data Workshop, October 14, 2019.

Presentation Transcript


  1. Lifecycle Support for Scientific Investigations: Integrating Data, Computing and Workflows with DEEDS. SMARTI National Bridge Inventory Dataset. Ann Christine Catlin, Research Computing, Purdue University. Bridging Big Data Workshop, October 14, 2019.

  2. Infrastructure to support scientific workflows: data organization & collection; data preservation; metadata & standards; data sharing; software/algorithm preservation; computing services (including HPC); connecting data to computation; workflow tracking; data provenance; results traceability; reproducibility; applications for analytics & exploration; publication for discovery & reuse.
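
To make "connecting data to computation", "data provenance" and "results traceability" from the list above concrete, here is a minimal Python sketch. It does not describe how DEEDS itself works: the function names, the record fields, and the sample bridge rows are all hypothetical, chosen only for illustration. The idea is that each tool run stores content hashes of its input and output together with the tool name and parameters, so a result can later be traced back to the exact data and settings that produced it.

```python
import hashlib
import json


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies these exact bytes."""
    return hashlib.sha256(data).hexdigest()


def run_with_provenance(tool_name, tool, input_bytes, params):
    """Run `tool` on the input and return (output, provenance record).

    The record layout is an assumption made for this sketch.
    """
    output = tool(input_bytes, **params)
    record = {
        "tool": tool_name,
        "params": params,
        "input_sha256": fingerprint(input_bytes),
        "output_sha256": fingerprint(output),
    }
    return output, record


def count_rows(data: bytes, skip_header: bool = True) -> bytes:
    """Hypothetical 'tool': count the data rows in a CSV payload."""
    rows = data.decode("utf-8").splitlines()
    if skip_header:
        rows = rows[1:]
    return str(len(rows)).encode("utf-8")


# Made-up sample rows; the column names are not taken from the real dataset.
sample = b"bridge_id,year_built\nB001,1975\nB002,1988\n"
result, provenance = run_with_provenance(
    "count_rows", count_rows, sample, {"skip_header": True}
)
print(result.decode("utf-8"))
print(json.dumps(provenance, indent=2))
```

Re-running the same tool on the same bytes with the same parameters yields an identical record, which is one simple way to check that a result is reproducible.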

  3. Advanced infrastructure & research funding: NSF is looking for advanced infrastructures that integrate data + computing. Since 2011, data management plans (DMPs) have been required for all proposals. NSF is now calling for infrastructure solutions that support both data + computing, support reproducibility, are adaptable to new technologies, and are effective across science domains. (Slide image labels: DEEDS Datasets; EU Open Data Portal.)

  4. Digital Environment for Enabling Data-driven Science

  5. Cases: Structuring & organization of the dataset

  6. Files: Repository classification by type, format, use

  7. DataTables: Data representation & exploration

  8. DataTables: Data collection, metadata & operations

  9. Tools: Execution & computing workflows

  10. Tools: Definition & integration

  11. Analytics: R-based statistical analysis & graphics

  12. Analytics: R-based statistical analysis & graphics

  13. Now publicly available at https://deedshub.org. Future work: sensor data acquisition; cloud-based data storage framework; access and use of data from external repositories; complex computing workflows (e.g., transcriptome sequencing); continuous processing for streaming data; advanced visualization packages (e.g., visual molecular dynamics); analytics data tables merge & pivot; R-based statistical toolkit for analytics; interactive dashboard for exploration and use of data & tools in published datasets. Ask about collaborating on research projects & proposals: acc@purdue.edu

  14. Acknowledgements. DEEDS R&D: Lead: Chandima Hewa Nadungodage; Guneshi Wickramaarachchi; Sumudinie Fernando; Steven Clark. Computer Science Graduate Students: Andres Bejarano; Paramesh Desigavinayagam; Keehwan Park. DEEDS Research Partners: Solar PV Research Team: Prof. Ashraful Alam, Tahir Patel, Reza Asadpour, Xingshu Sun. Ecotoxicology Research Team: Prof. Marisol Sepulveda, Tyler Hoskins, Michael Iacchetta, Chloe De Perre, Robert Flynn, Grace Coogan, Edgar Perez. Computational Chemistry Research Team: Prof. Joseph Francisco, Ross Hoehn, Jie Zhong. Nutrition Science Research Team: Prof. Connie Weaver, Kalina Hodges, Sisi Cao. SMARTI Research Team: Prof. Robin Gandhi, Prof. Chungwook Sim, Prof. Daniel Linzell, Prof. Yashar Eftekhar Azam, Akshay Kale. Forest Biodiversity Research Team: Prof. Jingjing Liang. Earthquake Engineering Research Team: Prof. Santiago Pujol, Jonathan Monical.