Paul Groth, Simon Miles, Luc Moreau


Presentation Transcript


  1. Paul Groth, Simon Miles, Luc Moreau UK e-Science All Hands Meeting 2005

  2. Outline • Process Documentation for Provenance • Power of the P-Structure • P-assertion Recording Protocol • PReServ’s Functionality • Performance • Pitch

  3. Provenance • The Provenance Question • Lots of definitions… • Boil it down to a question. • What is the process that led to a particular result? • How do we answer this question? • Search through documentation.

  4. Documentation • Process Documentation • encompasses all other documentation • SOA-based model of process • Actors communicate via message passing • Actors make ASSERTIONS to document process. Termed p-assertions. • How to organise these p-assertions?

  5. P-Structure

  6. P-Structure View

  7. Benefits • Domain independent queries • That are provenance specific • P-structure is a shared logical organisation of p-assertions • Does not prescribe how p-assertions are exactly stored in an implementation.
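The idea of a domain-independent query over a shared logical organisation can be illustrated with a minimal sketch. The slides do not show the actual p-structure, so everything here is an assumption made for illustration: the class names `PAssertion` and `PStructureSketch`, the fields `interactionKey`, `asserter` and `content`, and the choice to group p-assertions by interaction. The point is only that the query uses the grouping and never inspects the domain-specific content.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical p-assertion: names and fields are illustrative, not PReServ's API.
final class PAssertion {
    final String interactionKey; // assumed identifier of one message exchange
    final String asserter;       // the actor making the assertion
    final String content;        // domain-specific payload, treated as opaque

    PAssertion(String interactionKey, String asserter, String content) {
        this.interactionKey = interactionKey;
        this.asserter = asserter;
        this.content = content;
    }
}

// Hypothetical logical organisation: p-assertions grouped by interaction.
// How the groups are physically stored is left open, as the slide says.
final class PStructureSketch {
    private final Map<String, List<PAssertion>> byInteraction = new LinkedHashMap<>();

    void add(PAssertion p) {
        List<PAssertion> group = byInteraction.get(p.interactionKey);
        if (group == null) {
            group = new ArrayList<>();
            byInteraction.put(p.interactionKey, group);
        }
        group.add(p);
    }

    // Domain-independent query: relies only on the structure, not on content.
    List<PAssertion> aboutInteraction(String interactionKey) {
        List<PAssertion> group = byInteraction.get(interactionKey);
        return group == null ? new ArrayList<PAssertion>() : new ArrayList<>(group);
    }
}
```

The same `aboutInteraction` call works whether the content comes from a bioinformatics workflow or any other domain, which is the benefit the slide describes.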

  8. PReP • Introduces the Provenance Store • A separate entity for maintaining process documentation • PReP specifies how an actor can communicate with the Provenance Store. • PReP has a number of nice properties. • Statelessness • Idempotence • Termination
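The slide names statelessness and idempotence without defining them. The sketch below shows one common reading of these terms, not PReP's actual message format: each record request carries everything needed to process it, and re-sending the same request leaves the store unchanged. The class `IdempotentStoreSketch` and the parameters `interactionKey`, `asserter` and `localId` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: not the PReServ implementation or the PReP wire format.
final class IdempotentStoreSketch {
    // Key is the (interactionKey, asserter, localId) triple of a p-assertion.
    private final Map<List<String>, String> stored = new HashMap<>();

    // Stateless in the sense that each call is self-describing: no session
    // or earlier call is needed to interpret it.
    // Returns true if the p-assertion was newly stored, false for a repeat.
    boolean record(String interactionKey, String asserter, String localId, String content) {
        List<String> id = Arrays.asList(interactionKey, asserter, localId);
        if (stored.containsKey(id)) {
            return false; // repeat delivery: the first copy is kept, nothing changes
        }
        stored.put(id, content);
        return true;
    }

    int size() {
        return stored.size();
    }
}
```

With this behaviour a client that never received an acknowledgement can safely resend the same message, since a repeat does not create a second copy.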

  9. An Implementation • What is PReServ? • A Web Services implementation of a Provenance Store • Implements • PReP for recording • XQuery for querying • Provides libraries and wrappers for making applications provenance aware.

  10. PReServ Implementation Diagram • [Diagram not reproduced. Labels: WS Client, Web Service, Axis Handler, PS Client Side Library, Query Actor WS, Provenance Store, Backend Store Interface, Backend Stores (In-Memory Store, Database Store, …). Legend: WS Calls, Java Calls]

  11. Implementation cont. • Caching mechanism to improve performance • Berkeley Java Database 2.0 • No setup required • Completely Transactional • [Diagram not reproduced. Labels: SOAP Msg, Dispatcher, Store Plug In, Query Plug In, …, Backend Store Interface, Java Object, Database, Memory, …]
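The diagram labels on this slide (Dispatcher, Store Plug In, Query Plug In, Backend Store Interface) suggest a plug-in architecture over a swappable backend. A minimal sketch of that shape follows. All interface and class names, the string-based messages, and the `key=value` body format are assumptions made for illustration; the real system exchanges SOAP messages and answers queries with XQuery, for which a simple key lookup stands in here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical backend abstraction: an in-memory or database store could sit behind it.
interface BackendStore {
    void put(String key, String value);
    String get(String key);
}

final class InMemoryStore implements BackendStore {
    private final Map<String, String> data = new HashMap<>();
    public void put(String key, String value) { data.put(key, value); }
    public String get(String key) { return data.get(key); }
}

// Hypothetical plug-in contract: handle one message body against the backend.
interface PlugIn {
    String handle(String body, BackendStore store);
}

// Stores a body of the assumed form "key=value".
final class StorePlugIn implements PlugIn {
    public String handle(String body, BackendStore store) {
        int eq = body.indexOf('=');
        if (eq < 0) {
            return "error: expected key=value";
        }
        store.put(body.substring(0, eq), body.substring(eq + 1));
        return "ack";
    }
}

// Looks up a key; a stand-in for real query evaluation.
final class QueryPlugIn implements PlugIn {
    public String handle(String body, BackendStore store) {
        String value = store.get(body);
        return value == null ? "not found" : value;
    }
}

// Routes each incoming message to the plug-in registered for its action.
final class Dispatcher {
    private final Map<String, PlugIn> plugIns = new HashMap<>();
    private final BackendStore store;

    Dispatcher(BackendStore store) {
        this.store = store;
    }

    void register(String action, PlugIn plugIn) {
        plugIns.put(action, plugIn);
    }

    String dispatch(String action, String body) {
        PlugIn plugIn = plugIns.get(action);
        if (plugIn == null) {
            return "error: no plug-in for " + action;
        }
        return plugIn.handle(body, store);
    }
}
```

Adding functionality in this design means registering another `PlugIn`, and switching storage means passing a different `BackendStore` to the `Dispatcher`; neither change touches the routing code.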

  12. Requirements • Apache Tomcat 5.0 • Apache Ant 1.6.2 • Java 1.5 (1.4 supported with some help) • Pure Java, tested on • Windows • Mac OS X • Debian Linux

  13. Evaluation Deployment • Protein Compressibility Experiment • HPDC’05 • Workflow runs under VMWare • deployment consistency • ease of development • Workflow is executed on one machine • PReServ runs on another machine • Version 0.1.5 of PReServ

  14. Record Performance

  15. Query Performance

  16. Applications

  17. Conclusion • The p-structure allows for domain-independent, provenance-specific queries using XQuery. • Both recording and query times are linear. • PReServ has an extensible architecture allowing for further functionality to be easily added.

  18. Download! • Try it out! • Download PReServ 0.2: • The AHM release • Released under the open-source MIT License • www.pasoa.org • Click software • Contact us, we will try to help you make your application provenance-aware.

  19. Configuration • Red Hat Linux 9.1 on VMWare on Windows XP • Pentium 4, 2.8 GHz, 1.5 GB RAM • PReServ on another machine • Database backend Berkeley JDB • 100 Mb local Ethernet
