Towards Service Oriented Geoscience
SEE Grid and APAC Grid
Dr Robert Woodcock, Executive Manager, e-Science
Outline
• Industry drivers
• Inefficiencies in “geoscience” modelling workflow
• The Solid Earth and Environment Grid
• The APAC (Geoscience) Grid
• Putting it all together: pmd*CRC Modelling Workflow for industry problems
• Results and what might the future hold?
Australian National Research Priorities
• Frontier Technologies for Building and Transforming Australian Industries: stimulating the growth of world-class Australian industries using innovative technologies developed from cutting-edge research
• Priority Goal 4: Smart information use. Improved data management for existing and new business applications and creative applications for digital technologies
• ICT applications are providing huge opportunities to deliver new systems, products and business solutions, and to make more efficient use of infrastructure
• The ability of organisations to operate virtually and collaborate across huge distances in Australia and internationally hinges on our capabilities in this area
Key points from case studies and support letters
• Show the diversity of use cases for the same data type throughout the mining value chain
• Show a strong business case for interoperability for management of your data in the external world
• Show an even stronger business case for interoperability for internal data management
• Show why standards need to be developed by groups working together as part of a community
• Highlight the emerging issue that responsibility for data quality is becoming a legislative matter
Key Driver: Input to the Minerals Exploration Action Agenda – July 2003
• Industry input:
  • highlighted problems in gaining access to pre-competitive geoscience information
  • described existing information as commonly incomplete and fragmented across eight government agencies, each with its own information management systems and structures
  • noted that the disparate systems lead to inefficiencies, causing higher costs, reduced effectiveness and increased risk for the industry and its service providers
Source: http://www.industry.gov.au/assets/documents/itrinternet/minerals_aa_finalreport_July2003.pdf
Modelling Workflow
• Define the geological problem
• Build the model
• Run the model
• View and interpret results
• Iterate to achieve understanding
• Report and feed into knowledge base
…Must be repeatable, robust and timely
• What is the role of:
  • Competency contrasts?
  • Permeability?
  • Pore fluid pressure & flow fields?
[Figure: block model of dilation showing the impact of Fault set “A” dip variation; panels labelled very weak, weak, mod. strong and strong; tensile failure indicated]
Inefficiencies in the Workflow
• Information is scattered across:
  • Organisations – company, geological survey, etc.
  • Resources – different hardware and software platforms
  • Geography – geological surveys in each state and territory (region) in Australia
• Cost of data integration is high, in some situations exceeding all other costs
• Computational resources:
  • Different architectures suit different numerical codes better
  • Are often available but outside your organisation's direct control
  • Are set up in different ways
• Cost of adapting an investigator's specific toolkit to use multiple sites is often prohibitive
Can these issues be removed?
The SEE Grid Community
• Working together (loosely) to develop a toolkit for interoperability for the Solid Earth and Environmental Sciences
• Together… because our information and services need to be shared more easily to achieve our goals
• Loosely… because ultimately we are separated by political and economic boundaries
• Toolkit… because our world is dynamic and we need tools that can be reconfigured and chained together quickly to answer our questions
• …in this context we must reduce the barriers to becoming a part of the community
Pre-competitive geoscience data: the trouble is…
[Diagram labels: Client; Proprietary Software; Versions of Software; Data Structures]
Slide courtesy of Stuart Girvan
Our aim…
[Diagram labels: Client; XML; GML/XMML]
Slide courtesy of Stuart Girvan
[Architecture diagram, top to bottom]
CLIENT APPLICATIONS: GA Reports Application; WebMap Composer
Common Interface Binding – GML/XMML
DATA ACCESS SERVICES (translation to standards here): DOIR Web Feature Service (WFS); GA Web Feature Service (WFS); PIRSA Web Feature Service (WFS); Geoserver (Open Source)
DATA SOURCES (little or no change required here): GA Geochemistry Feature Data Source; DOIR Geochemistry Feature Data Source; PIRSA Geochemistry Feature Data Source; Oracle; PostGIS (Open Source); PostGIS (Open Source)
[Architecture diagram, top to bottom]
CLIENTS: pmd*CRC Model Tools; GA Reports Application; WebMap Composer; FracSIS; ?
Common Interface Binding – GML/XMML
DATA SERVICES: NRM WFS; DOIR WFS; GA WFS; MRT WFS; VICDPI WFS; NSWDPI WFS; PIRSA WFS; NTGS WFS
DATA SOURCES
The Solid Earth and Environment Grid
Information - Implementation and Examples
Common Interface Binding - Details
• Two parts:
  • Service interface standard – how you communicate with the service, sending requests and receiving results
  • Information standards – how information is encoded in a community-agreed form
• We use and develop Open Geospatial Consortium (OGC) standards, the Exploration and Mining Markup Language (XMML) and its successor, GeoSciML
Open Geospatial Consortium Web Feature Service (WFS)
[Diagram: an application (web based or desktop) exchanges requests and responses with a Web Feature Service over the HTTP protocol; the service draws on config files and a data source]
• GetCapabilities request (XML/KVP) → GetCapabilities response (XML)
• DescribeFeatureType request (XML/KVP) → DescribeFeatureType response (GML Schema)
• GetFeature request (XML/KVP) → GetFeature response (GML)
• The response is in Geography Markup Language (GML) or, more usefully, a GML Application Schema
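The three request types above can be sketched as key-value-pair (KVP) URLs. This is a minimal illustration only: the endpoint `https://example.org/wfs` and the feature type name `xmml:Borehole` are placeholders that do not come from the slides, and WFS version 1.1.0 is assumed.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the base URL of a real WFS.
BASE_URL = "https://example.org/wfs"


def wfs_url(base_url, request, **params):
    """Build a KVP-encoded WFS request URL (WFS 1.1.0 assumed)."""
    query = {"service": "WFS", "version": "1.1.0", "request": request}
    query.update(params)
    return f"{base_url}?{urlencode(query)}"


# 1. Ask the service what it offers.
capabilities_url = wfs_url(BASE_URL, "GetCapabilities")
# 2. Ask for the schema of one feature type (placeholder type name).
describe_url = wfs_url(BASE_URL, "DescribeFeatureType", typeName="xmml:Borehole")
# 3. Ask for the features themselves, limited to ten.
features_url = wfs_url(BASE_URL, "GetFeature", typeName="xmml:Borehole", maxFeatures=10)

for url in (capabilities_url, describe_url, features_url):
    print(url)
```

Any HTTP client can then fetch these URLs; the responses are XML, an XML Schema document and GML respectively.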
Features – Geoscience Community (XMML & GeoSciML)
• Borehole: collar location; shape; collar diameter; length; operator; logs; related observations; …
• Fault: shape; surface trace; displacement; age; …
• Basin?: formations; shape (time dependent); resource estimate; …
• Ore-body: commodity; deposit type; host formation; shape; resource estimate; …
• Observation: location; subject/specimen/station; property/theme; method; operator; date/time; result (+ type/reference system/scale/classification); …
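As a rough illustration of what a feature type amounts to, two of the features listed above can be sketched as plain data classes. The field names, types and units below are assumptions made for the example; they are not taken from the XMML or GeoSciML schemas.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Observation:
    """Illustrative subset of the Observation properties (names assumed)."""
    location: Tuple[float, float]  # (latitude, longitude), assumed convention
    property_name: str             # property/theme being observed
    method: str
    operator: str
    date_time: str                 # ISO 8601 string, assumed
    result: float
    result_unit: str


@dataclass
class Borehole:
    """Illustrative subset of the Borehole properties (names assumed)."""
    collar_location: Tuple[float, float]
    length_m: float
    operator: str
    collar_diameter_mm: Optional[float] = None
    related_observations: List[Observation] = field(default_factory=list)


assay = Observation((-31.95, 115.86), "Au", "fire assay", "Lab A",
                    "2005-06-01T00:00:00", 0.8, "ppm")
hole = Borehole((-31.95, 115.86), 250.0, "Company X",
                related_observations=[assay])
print(hole.operator, len(hole.related_observations))
```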
Data source to community schemas
• Community schemas provide the common or shared model
• All data providers have their own local data model
• All data providers must map data from their local source (database) to the community schema, irrespective of technology implementation
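The mapping step can be sketched as a per-provider lookup table from local column names to shared property names. Everything named here (providers, columns and community property names) is hypothetical and chosen only to show the idea.

```python
# Hypothetical local column names for two providers, each mapped to
# hypothetical community-schema property names.
FIELD_MAPS = {
    "provider_a": {"HOLE_ID": "name", "LAT": "latitude",
                   "LON": "longitude", "DEPTH_M": "length_m"},
    "provider_b": {"bh_name": "name", "y": "latitude",
                   "x": "longitude", "total_depth": "length_m"},
}


def to_community(provider, row):
    """Rename one local record's fields to the community property names."""
    mapping = FIELD_MAPS[provider]
    return {community: row[local] for local, community in mapping.items()}


row_a = {"HOLE_ID": "DH-001", "LAT": -31.95, "LON": 115.86, "DEPTH_M": 250.0}
row_b = {"bh_name": "DH-001", "y": -31.95, "x": 115.86, "total_depth": 250.0}

# Both local layouts yield the same community-level record.
print(to_community("provider_a", row_a) == to_community("provider_b", row_b))
```

In practice the mapping also has to handle units, vocabularies and structure, not only field names; the lookup table shows only the simplest case.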
Why XML?
• Extensibility
• Self describing
• Ability to be (remotely) validated against a schema
• XML Schema provides “loose tolerances”
• All programming languages have tools to deal with XML
• But…
  • Problematic for large data sets…
  • though nobody said you can’t use binary as well (even over WFS)
Community agreement is what matters
How would you use an interoperable service?
A user makes a request and gets back GML-based data, which can be:
• rendered into a map layer AND queried by a user, or
• formatted into a report, or
• read and used by any enabled application
Slides courtesy of Stuart Girvan – Geoscience Australia
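The "formatted into a report" path can be sketched with Python's standard XML parser. The sample document below is invented for the example: the `ex` namespace and the `Sample`, `element` and `value` names are placeholders, not part of XMML or GeoSciML.

```python
import xml.etree.ElementTree as ET

# Invented GML-style response; the "ex" application namespace is a placeholder.
SAMPLE_GML = """<wfs:FeatureCollection xmlns:wfs="http://www.opengis.net/wfs"
    xmlns:gml="http://www.opengis.net/gml" xmlns:ex="http://example.org/ex">
  <gml:featureMember>
    <ex:Sample gml:id="s1"><ex:element>Au</ex:element><ex:value>0.8</ex:value></ex:Sample>
  </gml:featureMember>
  <gml:featureMember>
    <ex:Sample gml:id="s2"><ex:element>Cu</ex:element><ex:value>12.5</ex:value></ex:Sample>
  </gml:featureMember>
</wfs:FeatureCollection>"""

NS = {"gml": "http://www.opengis.net/gml", "ex": "http://example.org/ex"}
GML_ID = "{http://www.opengis.net/gml}id"  # namespaced attribute key


def report_rows(gml_text):
    """Extract (id, element, value) tuples from each feature member."""
    root = ET.fromstring(gml_text)
    rows = []
    for sample in root.findall("gml:featureMember/ex:Sample", NS):
        rows.append((
            sample.get(GML_ID),
            sample.findtext("ex:element", namespaces=NS),
            float(sample.findtext("ex:value", namespaces=NS)),
        ))
    return rows


rows = report_rows(SAMPLE_GML)
for sample_id, element, value in rows:
    print(f"{sample_id:<6}{element:<4}{value:>8.2f}")
```

Because every provider returns the same agreed structure, the same parsing code can build one report from several services.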
Web Map Interface (courtesy of Social Change Online)
[Screenshot labels: Bounding Box; Known Layers]
Tabular Reports by Source (courtesy of Geoscience Australia)
Why use simulation and modelling?
• Mineral exploration has considerable uncertainty
• We use simulation and modelling to analyse an ensemble of possible geological structures and histories that could have produced the observations seen today
• The result is reduced uncertainty and some quantification of risk
• This same approach applies to many fields (hazards, environment, …), which is why we formed the SEE Grid community
Our toolkit…
Our toolkit contains a variety of codes (usually more than one of each type) for:
• Mechanics
• Chemistry
• Transport
• Thermal
• Fluid flow
Some of these can be coupled together:
• Reactive Transport – Chemistry + Transport + Thermal + Fluid flow
Some scenarios only require a subset…
It becomes very computationally intensive when using many… AND we run many scenarios
Grid Computing provides a solution
[Figure: Darcy flow and streamlines]
[Architecture diagram, top to bottom]
Client Applications: Mantle Convection Modelling Workflow; Drill Core Analysis Workflow; Reactive Transport Workflow; Tsunami Workflow
Community Agreed Service Interfaces and Information Models
Gateway Services: APAC Web Feature Service (WFS); Industry Web Feature Service (WFS); Geological Survey Web Feature Service (WFS)
Industry Data and Knowledge Grid; Government Geological Surveys Data and Knowledge Grid; APAC Data and Compute Grid Facilities
Grid Technology Layers
[Diagram layers: pmd*CRC; SEE Grid; APAC Grid]
The Grid Application… Service Interactions
[Diagram labels]
User Workflow (Client): Login; Edit Problem Description; Run Simulation; Job Monitor; Archive Search; Local Repository
Community Infrastructure: Authentication; Resource Registry; Job Management Service; Data Management Service
Information: Geology W.A; Geochem W.A; Geochem N.S.W; Geology S.A
Computation: Escript Service; FastfloRT Service; HPC Repository
Physical Resource; Physical Resource
Traditional Mechanical Modelling Workflow
• Models (mesh + data files) are individually and laboriously constructed
• The manual process is error prone
• A “powerful” desktop computes several models at a time
• Throughput is limited to roughly 2 models per week
• Results are manually visualised one at a time
• Screenshots are manually taken and made into “movies”
• Very little, if any, standardised data archiving is done. The originating conditions of the experiments can be confused or lost, making them unrepeatable in the long term
Slide courtesy of Robert Cheung and Warren Potma
New Refined Workflow
• Parameterised geometry creation
  • Parameterised template or wizard-driven model geometry/mesh creation
• Boundary condition & model properties parameter sweep utilities
  • automatically create a “family” of model and data files based on varying a set of parameters
• Inversion algorithms
  • determine input parameters of future iterations automatically, based on the user's ranking of previous results
• Automated generation of visualisations
• Automated movie generation
• 3D time-varying volume visualisation
• Automated archiving
• Multi-site data storage via Storage Resource Broker
Slide courtesy of Robert Cheung and Warren Potma
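The parameter sweep idea can be sketched as a Cartesian product over the values chosen for each varied parameter. This is an illustration only, not the pmd*CRC utilities themselves; the template keys and parameter values are made up.

```python
from itertools import product


def parameter_sweep(template, sweeps):
    """Yield one model description per combination of swept values."""
    names = list(sweeps)
    for values in product(*(sweeps[name] for name in names)):
        model = dict(template)            # copy so the template is untouched
        model.update(zip(names, values))  # override the swept parameters
        yield model


# Hypothetical template and sweep ranges.
template = {"mesh": "fault_block", "fault_dip_deg": 60, "pore_pressure_mpa": 20}
sweeps = {"fault_dip_deg": [45, 60, 75], "pore_pressure_mpa": [10, 20]}

family = list(parameter_sweep(template, sweeps))
print(len(family))  # 3 dips x 2 pressures = 6 models
```

Each dictionary in `family` would then be written out as one set of model and data files and submitted as a separate job.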
Results to Date
• For one investigator, on one investigation:
  • 500 models in 4 months (100x more!)
  • Inversion/parameter sweep algorithms – semi-automated model creation; faster, fewer errors
  • Automated post-processing/visualisation – all views × all timescales × all models await the investigator automatically
  • Automated archiving – metadata searchable, more accurate store of experimental conditions, delivered to your store!
Results
• Major inefficiencies have been removed by integrating the pmd*CRC geoscience modelling workflow with the:
  • Solid Earth and Environment Grid, and
  • APAC (Geoscience) Grid
• Industry response to the approach is supportive, as evidenced by SEE Grid Roadshow survey results and pmd*CRC applications
Thank You
Contact CSIRO
Phone: 1300 363 400, +61 3 9545 2176
Email: enquiries@csiro.au
Web: www.csiro.au
Name: Dr Robert Woodcock
Title: Executive Manager, e-Science
Phone: +61 8 6436 8780
Email: Robert.Woodcock@csiro.au
Web: www.csiro.au, www.seegrid.csiro.au