Integrating OGSA-DAI to computational Grid workflows

Tamas Kukla, Tamas Kiss, Gabor Terstyanszky (University of Westminster, UK); Peter Kacsuk (MTA SZTAKI, Hungary)


Presentation Transcript


  1. Integrating OGSA-DAI to computational Grid workflows
     Tamas Kukla, Tamas Kiss, Gabor Terstyanszky (University of Westminster, UK)
     Peter Kacsuk (MTA SZTAKI, Hungary)

  2. Motivation I. What advantages does the integration of databases into workflow solutions provide?
     • Several successful Grid workflow systems (e.g. Taverna, Triana, Kepler, P-GRADE)
       • Composition, orchestration and execution of computationally intensive processes
     • Limited data handling capabilities of Grid workflow solutions
       • Restricted mainly to file-based data
       • No or very limited database access

  3. Motivation II. Workflow-level interoperation of Grid data resources
     [Figure labels: Grid 1, Grid 2, workflow engine, jobs J1 to J5, file storage systems FS1 and FS2, databases DB1 and DB2]
     Legend: J: job; FS: file storage system, e.g. SRB or SRM; DB: database management system

  4. Why OGSA-DAI?
     • The Open Grid Services Architecture Data Access and Integration (OGSA-DAI) project is concerned with constructing middleware to assist with access to, and integration of, data from separate data sources via the Grid
     • An engineered, extensible framework for data access and integration
     • Exposes heterogeneous data resources to a Grid through web services
     • Interaction with data resources:
       • Queries and updates
       • Data transformation / compression
       • Data delivery
     • Customise for your project using:
       • Additional activities
       • Client Toolkit APIs
       • Data resource handlers
     • A base for higher-level services
       • Federation, mining, visualisation, …
     Source: Neil Chue Hong, GGF 16, Feb 2006

  5. OGSA-DAI integration aspects: Data staging
     • Static: databases can be accessed before and after workflow execution, but they cannot be accessed at runtime
     • Semi-dynamic: data is accessed during workflow execution, but the parameters of the OGSA-DAI request are already specified before execution and cannot be generated at runtime
     • Dynamic: the databases are accessed at runtime, and the parameters of the request are also generated during workflow execution
     Legend (timeline figure): s: request specification; e: request execution; shown separately for data gathering requests and data uploading requests
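
The three staging modes differ only in when a request is specified and when it is executed. The following is a minimal sketch of that classification, not code from the presentation; the names `Phase`, `StagingMode` and `staging_mode` are hypothetical and introduced only for illustration.

```python
from enum import Enum


class Phase(Enum):
    """When an action happens relative to the workflow run (hypothetical names)."""
    OUTSIDE_RUN = "before or after workflow execution"
    DURING_RUN = "during workflow execution"


class StagingMode(Enum):
    STATIC = "static"
    SEMI_DYNAMIC = "semi-dynamic"
    DYNAMIC = "dynamic"


def staging_mode(specified: Phase, executed: Phase) -> StagingMode:
    """Classify a request by when it is specified and when it is executed."""
    if executed is Phase.OUTSIDE_RUN:
        if specified is Phase.DURING_RUN:
            # This combination is not described on the slide.
            raise ValueError("request specified at runtime but executed outside the run")
        # Database access only before/after the workflow runs.
        return StagingMode.STATIC
    if specified is Phase.OUTSIDE_RUN:
        # Executed at runtime, but parameters fixed in advance.
        return StagingMode.SEMI_DYNAMIC
    # Parameters generated and request executed at runtime.
    return StagingMode.DYNAMIC
```

For example, `staging_mode(Phase.OUTSIDE_RUN, Phase.DURING_RUN)` yields the semi-dynamic mode: the request is fixed before the run but executed during it.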

  6. OGSA-DAI integration aspects: Subject of OGSA-DAI integration
     [Figure labels: WF management system, containing an auxiliary tool, a WF editor (workflow composition) and a WF engine (workflow execution)]
     • Auxiliary tool: the workflow management system is extended with an auxiliary tool (typically a portlet)
     • Workflow editor: the workflow editor is made capable of communicating with databases exposed via OGSA-DAI services, allowing data gathering during workflow authoring
     • Workflow engine: the workflow engine is enhanced to be able to execute the OGSA-DAI requests

  7. OGSA-DAI integration aspects: Request representation
     • Port-level representation: the OGSA-DAI request is represented as either an input or an output port of a node
     • Node-level representation: the request is represented as a workflow node that submits it to the OGSA-DAI service and receives the results
     [Figure labels: OGSA-DAI port with data access to an OGSA-DAI service; OGSA-DAI node with data access to an OGSA-DAI service]
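
One way to picture the two representations is as a choice of where the request is attached in the workflow data model. The sketch below is an illustration under assumptions: the classes `OgsaDaiRequest`, `Port` and `Node` and their fields are hypothetical, not structures from P-GRADE or OGSA-DAI.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OgsaDaiRequest:
    """Hypothetical description of a request sent to an OGSA-DAI service."""
    service_url: str
    resource_id: str
    statement: str


@dataclass
class Port:
    name: str
    direction: str  # "in" or "out"
    # Set only when the request is represented at port level.
    request: Optional[OgsaDaiRequest] = None


@dataclass
class Node:
    name: str
    ports: List[Port] = field(default_factory=list)
    # Set only when the request is represented at node level.
    request: Optional[OgsaDaiRequest] = None


def representation(node: Node) -> str:
    """Report how (if at all) a node carries an OGSA-DAI request."""
    if node.request is not None:
        return "node-level"
    if any(port.request is not None for port in node.ports):
        return "port-level"
    return "none"
```

In the node-level case the node itself is the database access step; in the port-level case an ordinary job node obtains or delivers data through a port that carries the request.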

  8. OGSA-DAI integration aspects: Supported OGSA-DAI functionalities
     • Specific support: only a subset of OGSA-DAI functionalities is supported; higher level of usability, but restricted functionality
     • Generic support: full support for every OGSA-DAI functionality; could be more complex to use in specific use cases
     Client integration level
     • Coupled: the OGSA-DAI client becomes part of the workflow system
     • Decoupled: connection is provided via an interface through which the client can be invoked on behalf of the system

  9. The targeted OGSA-DAI integration
     [Figure: tree of OGSA-DAI integration aspects]
     • Data staging: static, semi-dynamic, dynamic
     • Subject of integration: WF editor, auxiliary tool, WF engine
     • Request representation: port level, node level
     • Functionality support: specific, general
     • Client integration level: coupled, decoupled

  10. Implementation environment: P-GRADE Portal
     • Open source, general-purpose, workflow-oriented computational Grid portal
     • Supports the development and execution of workflow-based Grid applications: a tool for Grid orchestration
     • Based on GridSphere-2
     • Easy to expand with new portlets (e.g. application-specific portlets)
     • Easy to tailor to end-user needs
     • Developed by the P-GRADE Portal Alliance (led by SZTAKI)
     • Grid services supported by the portal:

  11. What is a P-GRADE Portal workflow?
     • A directed acyclic graph where:
       • Nodes represent jobs, either sequential or parallel programs
       • Ports represent input/output files the jobs expect/produce
       • Arcs represent file transfers between the jobs
     • Integration at the required integration level: allow the submission of a general/specific OGSA-DAI command-line client application to the Grid as a P-GRADE workflow node
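
The workflow model above can be sketched in a few lines: jobs are nodes, arcs carry files from a producer to a consumer, and any valid execution order is a topological order of the graph. This is an illustrative sketch using Python's standard `graphlib`, not P-GRADE code; the job and file names are made up, and the OGSA-DAI client appears as an ordinary node.

```python
from graphlib import CycleError, TopologicalSorter

# Each arc is (producer job, consumer job, transferred file).
# All job and file names here are hypothetical.
ARCS = [
    ("ogsa_dai_query", "analyse", "results.csv"),
    ("analyse", "report", "stats.txt"),
    ("ogsa_dai_query", "report", "results.csv"),
]


def execution_order(arcs):
    """Return the jobs in an order that respects every file transfer.

    Raises graphlib.CycleError if the arcs do not form an acyclic graph.
    """
    predecessors = {}
    for producer, consumer, _file in arcs:
        predecessors.setdefault(producer, set())
        predecessors.setdefault(consumer, set()).add(producer)
    return list(TopologicalSorter(predecessors).static_order())


def is_acyclic(arcs):
    """Check the DAG requirement of the workflow model."""
    try:
        execution_order(arcs)
    except CycleError:
        return False
    return True
```

With the arcs above, the query node must run first because both other jobs consume its output file, which is the sense in which the database access becomes one more step of the workflow.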

  12. How to submit the OGSA-DAI client to the Grid?
     • Direct submission is not feasible
       • Software dependencies
       • Complexity for the user
     • Requires an application repository integrated with the workflow engine
     • GEMLCA:
       • An application repository extended with a job submitter
       • Open source: a Globus incubator project
       • Deploying a code in the GEMLCA repository simply means creating an XML-based description file (supported even from a portlet interface)
       • Users can select previously deployed applications from the repository and run them with custom parameter values
       • GEMLCA is fully integrated with the P-GRADE workflow engine

  13. OGSA-DAI integration through GEMLCA
     [Figure labels: workflow with an OGSA-DAI node; GEMLCA repository containing the OGSA-DAI client; set custom parameter values; submit OGSA-DAI client; computational resources; OGSA-DAI service; database]
     • The solution is generic, as any workflow engine can be made capable of communicating with the GEMLCA service (a GT4-based Grid service)

  14. OGSA-DAI integration through GEMLCA
     • OGSA-DAI client applications, supporting both the OGSA-DAI 3.0 Axis (WSI) and GT (WSRF) versions, are deployed in the GEMLCA repository
       • Query client: submits query statements to a given database exposed by an OGSA-DAI service
       • Update client: submits update statements to a given database exposed by an OGSA-DAI service
       • Request document client: executes general OGSA-DAI workflows represented as request documents (database query and update execution, data transfer, data transformation)

  15. Using the query client
     [Screenshot callouts]
     • Selecting Grid
     • Setting OGSA-DAI service URL
     • Selecting deployed OGSA-DAI client
     • Setting Database Resource ID
     • Selecting computational site
     • Setting query file
     • Outputs: log file; results in CSV file
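
The query client's visible contract is simple: a query file goes in, and a log file plus a CSV file of results come out. The sketch below imitates only that contract. It is not the actual client: a local SQLite database stands in for a database exposed through an OGSA-DAI service (so there is no service URL or resource ID), and the function name `run_query_client` and its parameters are hypothetical.

```python
import csv
import sqlite3
from pathlib import Path


def run_query_client(db_path: str, query_file: str, csv_out: str, log_file: str) -> int:
    """Run the query stored in query_file and write the results as CSV.

    Stand-in only: SQLite replaces the remote OGSA-DAI data resource.
    Returns the number of result rows written.
    """
    query = Path(query_file).read_text().strip()
    with open(log_file, "w") as log:
        log.write(f"executing: {query}\n")
        conn = sqlite3.connect(db_path)
        try:
            cursor = conn.execute(query)
            header = [column[0] for column in cursor.description]
            rows = cursor.fetchall()
        finally:
            conn.close()
        # newline="" lets the csv module control line endings itself.
        with open(csv_out, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(header)
            writer.writerows(rows)
        log.write(f"wrote {len(rows)} rows to {csv_out}\n")
    return len(rows)
```

Because inputs and outputs are plain files, a step like this fits the P-GRADE model directly: the query file is an input port and the CSV and log files are output ports that later jobs can consume.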

  16. An application example: developing a performance rating framework for UK hospitals (Health Care Modelling and Informatics Research Group, UoW)
     [Workflow figure callouts]
     • Executes the given OGSA-DAI query
     • Generates sampler queries
     • Analysis on the sample data
     • Gathering results

  17. So this is what we have achieved: data transfer level interoperation in P-GRADE
     [Figure labels: Grid infrastructure; portal server; user-level storage; GridFTP servers; SRB servers; OGSA-DAI services; EGEE storage elements; computing resources; local input files; remote input files; local output files; remote output files; data manipulation; input to workflows; output from workflows; control of remote input/output]
     • Workflow-level interoperation of local, GridFTP, SRM and SRB file catalogues and databases exposed by OGSA-DAI

  18. How can the UK e-Science community utilise the solution?
     • Deployed at production level in the NGS P-GRADE portal
       • Portal URL: https://grid2-portal.cpc.wmin.ac.uk:8080
       • Information page: http://ngs-portal.cpc.wmin.ac.uk
     • Please visit our next demonstration session at the NGS booth (Booth 13, Appleton Tower)
       • Wednesday 10-12

  19. Any questions? Thank you for your attention.
     Email: kisst@wmin.ac.uk
     Website: www.cpc.wmin.ac.uk
