
EarthCube Layered Architecture Concept Award


Presentation Transcript


  1. EarthCube Layered Architecture Concept Award: Interoperability Mechanisms

  2. Layered Architecture Concept Award
     • Reagan Moore (UNC-CH/DICE): Collaboration environments
     • Ilkay Altintas (UCSD): Workflows
     • David Arctur (OGC): Web services
     • Lawrence Band (UNC-CH/IE): Eco-hydrology modeling
     • Liping Di (GMU): Geospatial knowledge building
     • Janet Fredericks (WHOI): Data quality
     • Jeff Horsburgh (Utah State University): CUAHSI / DataONE
     • Yong Liu (UIUC / NCSA): Workflows / Cyberintegrator
     • Chris MacDermaid (Colorado State Univ.): Physics model frameworks
     • Brian Miles (UNC-CH/IE): Eco-hydrology workflows
     • Michael Schoffner (RENCI): Web service integration
     • Antoine de Torcy (UNC-CH/DICE): Workflow integration
     • Weiguo Han (GMU): GeoBrain

  3. Loosely Coupled – Layered Architecture
     [Diagram: three layers connected through the EarthCube infrastructure]
     • Layer 1: Research Environment (applications, workflows), governed by policies
     • Layer 2: Collaboration Environment (data grids, portals), governed by policies
     • EarthCube Infrastructure: web services and brokers / messaging / structured object manipulation, connected to the neighboring layers through protocols
     • Layer 3: Community Resources

  4. Interoperability Mechanisms
     • For interactions with a collaboration environment
     • Register a remote file into the collaboration space (mechanism: soft link)
        • Collection of links
     • Operations on a remote file (mechanism: POSIX I/O extensions)
        • THREDDS, OPeNDAP, NetCDF, HDF5, FITS, …
     • Access (get, put) a remote file (mechanism: micro-services)
        • Web service invoking remote protocol
     • Asynchronously post and read messages (mechanism: queuing)
        • Message queue forwarding (AMQP)
     • Operations on aggregations of files (mechanism: POSIX I/O extensions)
        • Operations associated with a collection
     • Operations on aggregations of procedures (mechanism: rule exchange)
        • Workflows and structured information exchange
     • Policy enforcement (mechanism: policy exchange)
        • Policy-encoded objects

  5. Use Case Collaborations
     • Register a remote file into the collaboration space
        • DataNet Federation Consortium data grid (DFC)
     • Operations on a remote file
        • THREDDS, OPeNDAP, NetCDF, HDF5, storage drivers for OOI
     • Access (get, put) a remote file
        • DataONE, CUAHSI, (OGC, Data Conservancy)
     • Asynchronously post and read messages
        • SEAD - VIVO
     • Operations on aggregations of files
        • OOI time series archive
     • Operations on aggregations of procedures
        • Kepler (Gulf of Mexico hypoxia), NCSA Cyberintegrator (Texas drought)
     • Policy enforcement
        • Research Data Alliance policy sharing

  6. Use Cases
     • Demonstrate reproducible science. A use case could include the registration, storage, sharing, and re-execution of a workflow. The hypoxia use case from the Cross-Domain and Brokering Concept groups could be used as an example.
     • Automate data retrieval. A use case could demonstrate remote access to a data collection, retrieval of desired data sets, transformation, and use in an analysis workflow. An eco-hydrology example that automates access to digital elevation maps and land use coverage is being built.
     • Integrate community resources with collaboration environments. An example would be use of the DAB protocol to identify relevant data sets and cache local copies of them for local analysis.
     • Integrate multiple community resources. A use case could demonstrate the invocation of multiple workflow systems within the same analysis. An example is the integration of the Cyberintegrator workflow with collaboration environments to support drought prediction.

  7. Eco-Hydrology RHESSys workflow to develop a nested watershed parameter file (worldfile) containing a nested ecogeomorphic object framework and full initial system state.
     [Diagram: data sources and derived products feeding RHESSys]
     • Steps and data sources: Choose gauge or outlet (HIS); Extract drainage area (NHDPlus); Digital Elevation Model (DEM); Streams (NHD); Roads (DOT); Land Use NLCD (EPA); Leaf Area Index (Landsat TM); Phenology (MODIS); Soil Data (USDA)
     • Derived products: Slope; Aspect; Nested watershed structure (Basin, Hillslope, Patch, Strata); Stream network; Soil and vegetation parameter files; Flowtable; Worldfile
     • Target model: RHESSys

  8. iRODS Rule for RHESSys
     Modular workflow composed by chaining basic transformations:
     • Define input variables
     • Call functions to apply each transformation step
     • Store results in shared collection

         main {
           getExtentForGageReachcode(*gageReachcode, *extentInNHD_Vect_Coords);
           convertExtentToNHD_DEM(*extentInNHD_Vect_Coords, *extentInNHD_DEM_Coords);
           extractTileFromNHD_DEM(trimr(*extentInNHD_DEM_Coords, "\n"));
           importDEMTileIntoNewGRASSLocationAsUTM(*extentInNHD_Vect_Coords, *newLocPhysPath, *newLocObjPath);
           delineateWatershedForNHDGage(*nhdStreamGageID, *newLocPhysPath, *newLocObjPath);
         }

  9. extractTileFromNHD_DEM rule

         # Note: *nhdDEMObjPath and *rescName are not defined in this excerpt;
         # they are presumably set elsewhere in the rule file.
         extractTileFromNHD_DEM(*extentCoords) {
           # Split path to object into collection and name
           msiSplitPath(*nhdDEMObjPath, *nhdDEMObjColl, *nhdDEMObjName);
           writeLine("serverLog", *nhdDEMObjColl);
           writeLine("serverLog", *nhdDEMObjName);

           # Build query to discover physical path
           msiAddSelectFieldToGenQuery("DATA_PATH", "null", *genQInp);
           msiAddConditionToGenQuery("DATA_NAME", "=", *nhdDEMObjName, *genQInp);
           msiAddConditionToGenQuery("COLL_NAME", "=", *nhdDEMObjColl, *genQInp);
           msiAddConditionToGenQuery("DATA_RESC_NAME", "=", *rescName, *genQInp);

           # Run query
           msiExecGenQuery(*genQInp, *genQOut);

           # Extract path from query result
           foreach (*genQOut) {
             msiGetValByKey(*genQOut, "DATA_PATH", *filePath);
           }
           writeLine("serverLog", *filePath);

           # Determine physical path of input directory
           msiSplitPath(*filePath, *inFileDir, *headerFileIgnore);

           # Generate physical path of output file
           msiSplitPath(*inFileDir, *inFileParentDir, *rasterDatasetName);
           *tileFileName = "SUBSET-"++*rasterDatasetName++".img";
           *tileFilePath = *inFileParentDir++"/"++*tileFileName;

           # Generate iRODS path of output
           msiSplitPath(*nhdDEMObjColl, *nhdDEMObjCollParent, *junk);
           *tileObjPath = *nhdDEMObjCollParent++"/"++*tileFileName;

           # Extract the tile with gdal_translate on the remote host
           *args = "-of HFA -projwin "++*extentCoords++" "++"'*inFileDir'"++" "++"'*tileFilePath'";
           writeLine("serverLog", *args);
           msiExecCmd("gdal_translate", *args, "iren.renci.org", "null", "null", *cmd_out);
           writeLine("serverLog", *cmd_out);

           # Register tile file with iRODS
           msiPhyPathReg(*tileObjPath, *rescName, *tileFilePath, "null", *status);
         }
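     The path bookkeeping in this rule can be easier to follow outside the rule language. The following is a minimal Python sketch, not the authors' implementation: it mirrors only how the rule derives the output tile name, the physical tile path, the iRODS object path, and the gdal_translate arguments. The function names `split_path` and `build_tile_command` and all example paths are hypothetical; the `-of HFA` and `-projwin` options are the ones used in the rule above.

```python
import posixpath


def split_path(path):
    """Split a path into (parent, last component), analogous to msiSplitPath."""
    return posixpath.split(path.rstrip("/"))


def build_tile_command(header_file_path, dem_obj_path, extent_coords):
    """Derive the tile paths and gdal_translate argument list.

    header_file_path: physical path of a file inside the raster dataset directory
    dem_obj_path:     logical (iRODS) path of the same object
    extent_coords:    "ulx uly lrx lry" string passed to -projwin
    """
    # Physical side: dataset directory, its parent, and the dataset name
    in_file_dir, _header = split_path(header_file_path)
    in_file_parent_dir, raster_dataset_name = split_path(in_file_dir)
    tile_file_name = "SUBSET-" + raster_dataset_name + ".img"
    tile_file_path = in_file_parent_dir + "/" + tile_file_name

    # Logical side: the tile is placed next to the dataset collection
    dem_obj_coll, _name = split_path(dem_obj_path)
    dem_obj_coll_parent, _junk = split_path(dem_obj_coll)
    tile_obj_path = dem_obj_coll_parent + "/" + tile_file_name

    # An argument list avoids the manual quoting needed in the rule string
    argv = ["gdal_translate", "-of", "HFA", "-projwin",
            *extent_coords.split(), in_file_dir, tile_file_path]
    return argv, tile_file_path, tile_obj_path
```

     Returning an argument list rather than a single string is a design choice for the sketch: it could be handed to `subprocess.run` without shell quoting, whereas the rule has to wrap each path in single quotes by hand.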

  10. Event-Driven Real-Time Drought Analysis/Prediction Workflow
     [Diagram: components and interactions]
     • Data Grid – Collaboration Environment: invoke, monitor, store output
     • NCSA Cyberintegrator
     • RAPID (river routing model): http://rapid.ncsa.illinois.edu:8080/rapid/
     • NASA NLDAS-2
     • Other data sources
     • Visualization

  11. Management of Workflows
     • Workflow components
        • File containing input parameters, input file names, output file names
        • Input files
        • File containing workflow language
        • Output files
     • Each invocation of the workflow generates a versioned instance
        • Compare results across input file versions
        • Share workflows
        • Re-execute workflows
     • The system automatically associates input parameters with each workflow invocation and with the resulting output files

  12. Workflow Management
     • eCWkflow.mss: workflow file
     • /earthCube/eCWkflow: directory holding all input and output files associated with the workflow file (mounted collection that is linked to the workflow file)
        • eCWkflow.mpf, eCWkflow2.mpf: input parameter files, each listing parameters and input and output file names
        • eCWkflow.run, eCWkflow2.run: automatically generated run files, one for executing each input file
     • /earthCube/eCWkflow/eCWkflow.runDir0: directory holding all output files generated for an invocation of eCWkflow.run; the version number is incremented on each invocation
        • Outfile: output file created for eCWkflow.mpf
     • /earthCube/eCWkflow/eCWkflow2.runDir0
        • Newfile: output file created for eCWkflow2.mpf

  13. Workflow Re-execution & Sharing
     [Diagram: the workflow file eCWkflow.mss is linked (imcoll) to several mounted collections; each has its own run file, parameter file, and versioned output directories]
     • /earthCube/eCWkflow
        • eCWkflow.run, eCWkflow.mpf
        • /earthCube/eCWkflow/eCWkflow.runDir0: Outfile
        • /earthCube/eCWkflow/eCWkflow.runDir1: Outfile
     • /hydrology/myWkflow
        • myWkflow.run, myWkflow.mpf
        • /hydrology/myWkflow/myWkflow.runDir0: Outfile
        • /hydrology/myWkflow/myWkflow.runDir1: Outfile
     • …
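     The versioned output directories follow a simple naming pattern (`<name>.runDir0`, `<name>.runDir1`, and so on). The Python sketch below only illustrates that pattern as it appears on the slides; in the data grid the directories are generated automatically, and the helper name `next_run_dir` is hypothetical.

```python
import re


def next_run_dir(run_name, existing_entries):
    """Return the next versioned output directory name for a run file.

    run_name:         base name of the run file, e.g. "eCWkflow"
    existing_entries: names already present in the workflow collection
    """
    # Match only this run file's directories: "eCWkflow2.runDir0" must not
    # be counted as a version of "eCWkflow".
    pattern = re.compile(re.escape(run_name) + r"\.runDir(\d+)")
    versions = []
    for entry in existing_entries:
        match = pattern.fullmatch(entry)
        if match:
            versions.append(int(match.group(1)))
    next_version = max(versions) + 1 if versions else 0
    return f"{run_name}.runDir{next_version}"
```

     For example, with `eCWkflow.runDir0` and `eCWkflow2.runDir0` already present, a second invocation of `eCWkflow.run` would map to `eCWkflow.runDir1`, which matches the layout shown on the slide.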

  14. DFC + DataONE Interoperability
     • Goal: support interoperability between a DFC data grid and DataONE
     • Task: retrieve a file from DataONE, load it into a DFC collaboration environment, and add metadata

  15. How It Works
     • Query DataONE Coordinating Nodes with a SOLR query
     • Create an iRODS collection with the same name as the query
     • Get the list of identifiers for metadata files from the search
     • Download the metadata file for each identifier
     • Store the metadata file in the DFC data grid
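     The steps above can be sketched in Python without touching the network. This is an illustration under stated assumptions, not the demo's code: the Coordinating Node base URL, the `query/solr` and `object` endpoint paths, the `identifier` field name, and the collection path `/dfc/home/demo` are all assumptions chosen for the example, and the response layout is the generic Solr JSON shape (`response.docs`).

```python
import json
from urllib.parse import quote, urlencode

# Assumed Coordinating Node base URL; substitute the one for your deployment.
CN_BASE = "https://cn.dataone.org/cn/v2"


def build_solr_query_url(term, rows=10, base=CN_BASE):
    """Step 1: build a Solr query URL that returns only identifiers as JSON."""
    params = {"q": term, "fl": "identifier", "rows": rows, "wt": "json"}
    return f"{base}/query/solr/?{urlencode(params)}"


def extract_identifiers(solr_json_text):
    """Step 3: pull the identifier list out of a Solr JSON response."""
    body = json.loads(solr_json_text)
    return [doc["identifier"] for doc in body["response"]["docs"]]


def plan_downloads(term, identifiers, grid_home="/dfc/home/demo", base=CN_BASE):
    """Steps 2, 4, 5: name the collection after the query and pair each
    download URL with its destination path in the data grid."""
    collection = f"{grid_home}/{term}"
    plan = []
    for pid in identifiers:
        safe_pid = quote(pid, safe="")  # identifiers may contain ':' and '/'
        plan.append((f"{base}/object/{safe_pid}", f"{collection}/{safe_pid}"))
    return collection, plan
```

     Separating URL construction and response parsing from the actual transfer keeps the sketch testable offline; fetching each URL and writing the result into the collection would be done with whichever HTTP and data grid clients the deployment provides.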

  16. What the Demo Shows
     [Diagram: the data grid interacting with the Mercury Web portal through REST APIs]
     • Step 1: Query: “rain”
     • Step 2: Matching identifier list
     • Step 3: Get metadata file for each identifier
     • Step 4: File goes into collection “rain”

  17. Acknowledgments
     • EarthCube Layered Architecture: NSF EAGER 1239678
     • DataNet Federation Consortium: NSF OCI-0940841
     • iRODS Policy-based data management: NSF SDCI 1032732
