netKarma Spiral 2 Year-end Project Review
Indiana University
Beth Plale (PI), School of Informatics and Computing, and Chris Small, GMOC
Staff: Robert Ping
Students: Devarshi Ghoshal
08/27/2010
Project Summary
• The project collects and represents provenance for experiments conducted on the GENI platform. The provenance of an experiment is the relevant information collected during execution at the experiment, control, and measurement planes. The provenance results will be available in a GENI Provenance Registry (netKarma) but can also be used to augment other collection mechanisms, for instance at the instrument level.
• netKarma captures and stores process and data provenance.
• Collection is based on the generation of discrete provenance activities during the experiment lifecycle.
• Activities are aggregated to form complex data and process provenance graphs.
• netKarma is based on Karma, a provenance collection and representation service that has been used to collect provenance in diverse applications, including a satellite imagery pipeline (NASA funded), Linked Environments for Atmospheric Discovery (LEAD, NSF funded), and the Life Science Grid (Lilly Corp. funded).
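The aggregation of discrete activities into a provenance graph can be illustrated with a minimal sketch. The `Activity` shape, the "used"/"generated" kinds, and the function names are assumptions made for illustration; they are not Karma's actual schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One discrete provenance activity (hypothetical shape, not Karma's schema)."""
    process: str  # name of the process that ran
    kind: str     # "used" (process consumed data) or "generated" (process produced data)
    data: str     # identifier of the data product


def build_graph(activities):
    """Aggregate activities into an adjacency map of a provenance graph.

    Edges follow the direction of data flow: data -> process for "used",
    process -> data for "generated".
    """
    graph = defaultdict(set)
    for act in activities:
        if act.kind == "used":
            graph[act.data].add(act.process)
        elif act.kind == "generated":
            graph[act.process].add(act.data)
        else:
            raise ValueError(f"unknown activity kind: {act.kind}")
    return graph


def lineage(graph, target):
    """Return every node from which `target` is reachable (its provenance)."""
    # Invert the edges so we can walk backwards from the target.
    parents = defaultdict(set)
    for src, dsts in graph.items():
        for dst in dsts:
            parents[dst].add(src)
    seen, stack = set(), [target]
    while stack:
        node = stack.pop()
        for parent in parents[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

Calling `lineage(graph, "some-output")` on the aggregated graph returns the processes and data products that contributed to that output, which is the kind of question a provenance registry is meant to answer.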
Milestone Status
QSR Status
Accomplishments 1: Advancing GENI Spiral 2 Goals
• Continuous Experimentation: We plan to make Karma available as a persistent service/repository. As such, it will be positioned to continuously collect provenance from experiment sources and allow queries on the data through a web interface.
• Integration: We have integrated netKarma with GUSH; netKarma collects provenance from the GUSH log file. This was demonstrated at GEC7. We have a plan in place for how to instrument Raven and what provenance information to collect, developed through conversations with the Raven team.
• Instrumentation and Measurement: We are engaged in the I&M working group and have worked with the GMOC team to identify operational data to collect on control frameworks and their status and configuration. We have begun discussing with LAMP and On-Time Measure ways in which measurement traces can be tied to the provenance record and vice versa.
Accomplishments 1: Advancing GENI Spiral 2 Goals
Integration with Raven
Raven provides a package repository used by experimenters to distribute software to GENI nodes (currently PlanetLab). NetKarma is working on capturing information from Raven, including:
• Reference to the software location stored in the Raven repository
• Status of the software distribution
• Version number
• Whether the package was distributed properly in the last run
NetKarma provides a mechanism for collecting provenance information from Raven. This could allow experimenters to:
• Reproduce experiments using links to the software used in previous runs
• Publish a reference to an experiment together with references to the exact software used
• Understand the exact conditions of an experiment through a record of the software used to generate its data
• Integrate I&M data collection with the provenance information available in experimental tools such as Raven
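As a rough illustration, the four items to be captured from Raven could be held in a record like the one below. The class name, field names, and the comparison helper are hypothetical; the slides do not specify how netKarma represents this information.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RavenPackageRecord:
    """Hypothetical provenance record for one Raven-distributed package."""
    repository_ref: str                       # reference to the software location in the Raven repository
    distribution_status: str                  # status of the software distribution
    version: str                              # version number of the package
    distributed_ok_last_run: Optional[bool]   # None when the outcome of the last run is unknown


def same_software(a: RavenPackageRecord, b: RavenPackageRecord) -> bool:
    """True when two runs refer to the same package location and version."""
    return (a.repository_ref, a.version) == (b.repository_ref, b.version)
```

Comparing records across runs with `same_software` is one way an experimenter might check that a rerun used the exact software of an earlier run.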
Accomplishments 1: Advancing GENI Spiral 2 Goals
Raven Data Collection
Our approach for GUSH is to mine the GUSH log file using an adaptor that identifies information that can be used as part of the provenance record. Unlike GUSH, Raven does not store this information in log files in the slice; it resides in the Raven repository. We discussed this with Scott Baker of the Raven team, and he has agreed to modify the Raven tool to write this information into the slice log files. Scott has also agreed to let us instrument the Raven tool.
Accomplishments 1: Advancing GENI Spiral 2 Goals
GUSH is a client tool for executing experiments in a PlanetLab slice.
• Provenance adaptor: an interface that uses experiment logs and a set of rules to derive provenance events, which are sent to the Karma service for storage and derivation of provenance.
• The adaptor is a generic log processing unit for GENI component log files. It comprises two sub-units: a Log Parser and a Notification Generator.
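The two-stage adaptor described above can be sketched as follows. The log line format, the rule set, and the notification fields are all assumptions made for illustration; they do not reflect the real GUSH log format or the Karma notification format, and delivery to the Karma service is left out.

```python
import re

# Hypothetical rules: each pairs a regular expression over a log line
# with the provenance event type it produces.
RULES = [
    (re.compile(r"^(?P<ts>\S+) (?P<host>\S+) started (?P<cmd>.+)$"),
     "process_started"),
    (re.compile(r"^(?P<ts>\S+) (?P<host>\S+) finished (?P<cmd>.+) exit=(?P<code>\d+)$"),
     "process_finished"),
]


def parse_log(lines):
    """Log Parser: yield (event_type, fields) for each line matching a rule."""
    for line in lines:
        line = line.strip()
        for pattern, event_type in RULES:
            match = pattern.match(line)
            if match:
                yield event_type, match.groupdict()
                break  # first matching rule wins; unmatched lines are skipped


def generate_notifications(parsed, experiment_id):
    """Notification Generator: wrap parsed events as messages for the provenance service."""
    return [
        {"experiment": experiment_id, "event": event_type, **fields}
        for event_type, fields in parsed
    ]
```

Keeping the rules in a separate table means a new log source could be supported by swapping the rule set while the parser and generator stay unchanged, which fits the description of the adaptor as a generic unit.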
Accomplishments 2: Other Project Accomplishments
• Built the first version of the Adaptor collection mechanism
• Working on integrating support for RabbitMQ, which implements AMQP, into the architecture
• Expand relational database schema to handle layers of provenance (corresponding to layers of GENI infrastructure)
• Extend existing architecture: add support for RabbitMQ, an AMQP standard-compliant, open source publish/subscribe system
Issues
• A key piece of provenance is the identity of the human experimenter. For the PlanetLab framework, this information currently resides in PlanetLab.
• There is currently a lack of measurement traces at the experiment and slice levels of granularity. Trace data at this level enhances understanding for experimentalists and could be easily tied to the provenance record (or vice versa).
Plans
• What are your plans for the remainder of Spiral 2?
• Provenance collection from Raven
• Identifying the experimenter
• Tying experiment provenance to measurement traces
• Continuously operational provenance server
• The GPO is starting to formulate goals for Spiral 3. What are your thoughts regarding potential Spiral 3 work?
• The GENI platform is a rich experimental environment. Information collection, such as through provenance, can extend the usefulness of the platform as an experimental platform and as a learning tool.
• Uniform information collection should occur and be an integral part of the mesoscale efforts.
• Measurement traces are valuable, and can be made more valuable when augmented with provenance collected about experimental, operational, and control activities.
• Measurement traces should be augmented with provenance information.
• We see ourselves contributing to both initiatives over the next 2 years.