Third Provenance Challenge: University of Texas at El Paso Team’s Presentation
Team: Paulo Pinheiro da Silva, Nicholas Del Rio, Leonardo Salayandia
Presenter: James Michaelis (RPI)
http://trust.utep.edu
Overview
• UTEP Approach: Process and Provenance Separation
• Process: Workflow-Driven Ontologies (WDO) and Semantic Abstract Workflow (SAW)
• PC3 WDO and SAWs
• Provenance: Proof Markup Language (PML)
• PC3 PML
• Capturing PC3 PML
• Answering PC3 Questions
• Conclusions
UTEP Approach
• Unlike OPM, which treats process and provenance knowledge together, UTEP uses Inference Web technology, which explicitly separates process knowledge from provenance knowledge
• Inference Web's provenance work was originally developed for theorem provers rather than scientific workflows
• Inference Web has since been extended to support scientific workflows
• The separation between process and provenance has been preserved, and is considered beneficial because many provenance scenarios come without process knowledge
• Process knowledge: Workflow-Driven Ontology (WDO) and Semantic Abstract Workflow (SAW)
• Provenance knowledge: Proof Markup Language (PML)
WDOs and SAWs
• WDOs are OWL-based ontologies that represent process-related concepts, each classified as either Data or Method
• WDO concepts can be created, or reused from other domain ontologies, as needed while specifying a process
• SAWs are built from instances of WDO concepts connected through isInputTo and isOutputOf relations (and their inverses)
• WDO-It! is a graphical editor for WDOs and SAWs
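The SAW structure described above can be sketched as a small graph of Data and Method instances. This is only an illustrative sketch in Python, not WDO-It!'s actual data model: the `SAW` class, its method names, and the instance names (`DerbyDB`, `CompactDatabase`, `CompactedDerbyDB`) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SAW:
    """Hypothetical in-memory SAW: Data and Method instances plus typed edges."""
    data: set = field(default_factory=set)
    methods: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (data, relation, method) triples

    def is_input_to(self, data, method):
        """Record that a Data instance isInputTo a Method instance."""
        self.data.add(data)
        self.methods.add(method)
        self.edges.add((data, "isInputTo", method))

    def is_output_of(self, data, method):
        """Record that a Data instance isOutputOf a Method instance."""
        self.data.add(data)
        self.methods.add(method)
        self.edges.add((data, "isOutputOf", method))

    def inputs_of(self, method):
        return sorted(d for d, rel, m in self.edges
                      if rel == "isInputTo" and m == method)

    def outputs_of(self, method):
        return sorted(d for d, rel, m in self.edges
                      if rel == "isOutputOf" and m == method)


# Illustrative instance names, not taken from the PC3 SAW itself.
saw = SAW()
saw.is_input_to("DerbyDB", "CompactDatabase")
saw.is_output_of("CompactedDerbyDB", "CompactDatabase")
print(saw.inputs_of("CompactDatabase"), saw.outputs_of("CompactDatabase"))
```

Storing the two relations as labeled edges keeps the inverse relations implicit: they can be read off the same triples rather than stored twice.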
PC3 Semantic Abstract Workflow
[Figure: PC3 SAW diagram. Legend: WDO Data instances; WDO Method instances; Data isOutputOf Method; Data isInputTo Method; PML-P Agent instances (Data comes from or goes to a PML-P Agent). The SAW supports abstraction at multiple levels of detail.]
Proof Markup Language (PML)
• PML is an OWL-based ontology composed of three modules:
  • PML-J (justifications): used to build information manipulation traces (or justifications) for a given response (or result)
  • PML-P (provenance): used to annotate PML-J documents with metadata about sources, methods (called inference rules), and agents
  • PML-T (trust): used to annotate PML-J with trust and belief metadata about agents and conclusions
PC3 PML Encoding

<rdf:RDF>
  <NodeSet rdf:about="http://iw.utep.edu/pml/compactedDerbyDB_.owl#answer">
    <hasConclusion>
      <pmlp:Information>
        <pmlp:hasURL rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">
          http://iw.cs.utep.edu/pc3/databases/J062941_LoadDB_022030949845896586
        </pmlp:hasURL>
        <pmlp:hasFormat rdf:resource="http://iw.utep.edu/registry/FMT/derbyDB.owl#derbyDB"/>
      </pmlp:Information>
    </hasConclusion>
    <isConsequentOf>
      <InferenceStep>
        <hasInferenceEngine rdf:resource="http://iw.utep.edu/registry/IE/PC3-PSLoadExecutable.owl#PC3"/>
        <hasInferenceRule rdf:resource="http://iw.utep.edu/registry/RUL/compactDB.owl#compactDB"/>
        <hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</hasIndex>
        <hasAntecedentList>
          <NodeSetList>
            <ds:first rdf:resource="http://iw.utep.edu/pml/derbyDB_3.owl#answer"/>
          </NodeSetList>
        </hasAntecedentList>
      </InferenceStep>
    </isConsequentOf>
  </NodeSet>
</rdf:RDF>

OPM concepts labeled on the slide: OPM:Artifact, OPM:WasGeneratedBy, OPM:Process, OPM:WasControlledBy
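To make the encoding above easier to read, the following sketch parses a trimmed copy of the NodeSet with Python's standard `xml.etree.ElementTree` and pulls out the conclusion, the inference engine, the inference rule, and the antecedents. The slide omits the `xmlns` declarations, so the PML and `ds` namespace URIs used here are assumptions; the `summarize` function is likewise a hypothetical helper, not part of any Inference Web tool.

```python
import xml.etree.ElementTree as ET

# The rdf URI is the standard one; the pmlj, pmlp and ds URIs are ASSUMED,
# because the slide's snippet does not show its namespace declarations.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "pmlj": "http://inference-web.org/2.0/pml-justification.owl#",
    "pmlp": "http://inference-web.org/2.0/pml-provenance.owl#",
    "ds": "http://inference-web.org/2.0/ds.owl#",
}
RDF = "{%s}" % NS["rdf"]

# Trimmed copy of the slide's NodeSet with the assumed namespaces filled in.
PML = """<rdf:RDF xmlns:rdf="{rdf}" xmlns="{pmlj}" xmlns:pmlp="{pmlp}" xmlns:ds="{ds}">
  <NodeSet rdf:about="http://iw.utep.edu/pml/compactedDerbyDB_.owl#answer">
    <hasConclusion><pmlp:Information>
      <pmlp:hasURL>
        http://iw.cs.utep.edu/pc3/databases/J062941_LoadDB_022030949845896586
      </pmlp:hasURL>
    </pmlp:Information></hasConclusion>
    <isConsequentOf><InferenceStep>
      <hasInferenceEngine rdf:resource="http://iw.utep.edu/registry/IE/PC3-PSLoadExecutable.owl#PC3"/>
      <hasInferenceRule rdf:resource="http://iw.utep.edu/registry/RUL/compactDB.owl#compactDB"/>
      <hasAntecedentList><NodeSetList>
        <ds:first rdf:resource="http://iw.utep.edu/pml/derbyDB_3.owl#answer"/>
      </NodeSetList></hasAntecedentList>
    </InferenceStep></isConsequentOf>
  </NodeSet>
</rdf:RDF>""".format(**NS)


def summarize(xml_text):
    """Return the main provenance facts stated by the first NodeSet."""
    node = ET.fromstring(xml_text).find("pmlj:NodeSet", NS)
    step = node.find("pmlj:isConsequentOf/pmlj:InferenceStep", NS)
    url = node.findtext(
        "pmlj:hasConclusion/pmlp:Information/pmlp:hasURL", namespaces=NS)
    return {
        "node_set": node.get(RDF + "about"),
        "conclusion_url": url.strip(),
        "engine": step.find("pmlj:hasInferenceEngine", NS).get(RDF + "resource"),
        "rule": step.find("pmlj:hasInferenceRule", NS).get(RDF + "resource"),
        "antecedents": [
            first.get(RDF + "resource")
            for first in step.findall(
                "pmlj:hasAntecedentList/pmlj:NodeSetList/ds:first", NS)
        ],
    }


info = summarize(PML)
for key, value in info.items():
    print(key, "->", value)
```

Read this way, the NodeSet states that the compacted Derby database was concluded by applying the compactDB rule, run by the PC3 engine, to the conclusion of the derbyDB_3 node set.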
PML Capture
• From a given SAW, WDO-It! has two options to generate code capable of capturing provenance:
  • Generate PML wrappers: used for run-time capture of provenance
  • Generate PML data annotators: used for post-execution generation of provenance
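The idea of run-time capture through a wrapper can be sketched with a Python decorator. This is not the code that WDO-It! generates; `pml_wrapper`, `TRACE`, and `compact_database` are hypothetical names, and the records are plain dictionaries rather than PML documents.

```python
import functools

# Captured provenance records, in execution order (stand-in for PML node sets).
TRACE = []


def pml_wrapper(rule, engine):
    """Wrap a workflow step so each call records a justification-like entry."""
    def decorate(func):
        @functools.wraps(func)
        def wrapped(*inputs):
            output = func(*inputs)
            TRACE.append({
                "conclusion": output,         # what the step produced
                "rule": rule,                 # the method that was applied
                "engine": engine,             # the agent that ran the method
                "antecedents": list(inputs),  # what the step consumed
            })
            return output
        return wrapped
    return decorate


@pml_wrapper(rule="compactDB", engine="PC3")
def compact_database(db_name):
    # Placeholder body: a real step would compact the database.
    return "compacted_" + db_name


result = compact_database("derbyDB_3")
print(result, TRACE[-1])
```

A data annotator would differ mainly in timing: instead of intercepting the call, it would build the same kind of record after execution from the inputs and outputs left behind.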
Answering PC3 Questions: What processing steps were used?
• SPARQL can be used to query the PML provenance graph.
• The slide's example shows how a SPARQL query can use the PML graph to determine which processing steps were used to generate a given artifact.
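The logic behind such a query can be sketched as a backward walk over the justification graph: start at the artifact's node set, follow antecedents, and collect the inference rule at each step. The code below is a plain-Python traversal standing in for the SPARQL query, not the query itself. Only the `compactedDerbyDB` entry mirrors the PML encoding shown earlier; the `loadCSV` rule and `csvFile` node are hypothetical.

```python
# node set -> (inference rule that produced it, antecedent node sets)
GRAPH = {
    "compactedDerbyDB": ("compactDB", ["derbyDB_3"]),  # from the PC3 PML encoding
    "derbyDB_3": ("loadCSV", ["csvFile"]),             # hypothetical step
    "csvFile": (None, []),                             # hypothetical source, no rule
}


def processing_steps(graph, artifact):
    """Collect the rules used to derive `artifact`, nearest step first."""
    steps, seen, stack = [], set(), [artifact]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against shared antecedents being visited twice
        seen.add(node)
        rule, antecedents = graph.get(node, (None, []))
        if rule is not None and rule not in steps:
            steps.append(rule)
        stack.extend(antecedents)
    return steps


steps = processing_steps(GRAPH, "compactedDerbyDB")
print(steps)
```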
Conclusion
• The full encoding of the WDO, SAWs, and PML for PC3 was completed in 36 hours
• UTEP’s approach relies on tools to:
  • Understand and speed up the encoding of process knowledge (as WDOs and SAWs)
  • Use process knowledge to create PML wrappers and/or PML data annotators
  • Visualize and browse provenance
  • Use provenance for explanations, trust computation, data discovery, etc.
Acknowledgements
• UTEP would like to thank James Michaelis for his effort to understand our work and represent our team at the 3rd Provenance Challenge
• UTEP would like to thank the 3rd Provenance Challenge organizers, and Paul Groth in particular, for creating an opportunity for our team to be represented at the event