
The European DataGrid Project: Technical Status


1. The European DataGrid Project: Technical Status
   www.eu-datagrid.org
   Bob Jones (CERN), Deputy Project Leader

2. DataGrid scientific applications
   Developing grid middleware to enable large-scale usage by scientific applications.
   • Bio-informatics
     • Data mining on genomic databases (exponential growth)
     • Indexing of medical databases (TB/hospital/year)
   • Earth Observation
     • About 100 GB of data per day (ERS 1/2)
     • 500 GB for the ENVISAT mission
   • Particle Physics
     • Simulate and reconstruct complex physics phenomena millions of times
     • LHC experiments will generate 6-8 petabytes per year (a rough conversion to daily rates follows this slide)
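To put these volumes in perspective, here is a back-of-the-envelope conversion of the figures quoted on the slide. The arithmetic is illustrative only; decimal units (1 PB = 10^6 GB) and a 365-day year are assumptions of this sketch, not statements from the project.

```python
# Back-of-the-envelope conversions of the data volumes quoted on the slide.
# Assumptions of this sketch: decimal units (1 PB = 1e6 GB), 365-day year.

GB_PER_PB = 1_000_000
DAYS_PER_YEAR = 365

# Particle physics: 6-8 PB/year from the LHC experiments
for pb_per_year in (6, 8):
    gb_per_day = pb_per_year * GB_PER_PB / DAYS_PER_YEAR
    print(f"LHC at {pb_per_year} PB/year is roughly {gb_per_day:,.0f} GB/day")

# Earth observation: ~100 GB/day for ERS 1/2
print(f"ERS 1/2 at 100 GB/day is roughly {100 * DAYS_PER_YEAR / 1000:.1f} TB/year")
```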

3. The Project
   • 9.8 M Euros EU funding over 3 years
   • 90% for middleware and applications (HEP, Earth Observation and Biomedical)
   • Three-year phased developments & demos (2001-2003)
   • Total of 21 partners
     • Research and academic institutes as well as industrial companies
   • Extensions (time and funds) on the basis of first successful results:
     • DataTAG (2002-2003) www.datatag.org
     • CrossGrid (2002-2004) www.crossgrid.org
     • GridStart (2002-2004) www.gridstart.org

4. Assistant Partners
   • Industrial partners
     • Datamat (Italy)
     • IBM-UK (UK)
     • CS-SI (France)
   • Research and academic institutes
     • CESNET (Czech Republic)
     • Commissariat à l'énergie atomique (CEA) – France
     • Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI)
     • Consiglio Nazionale delle Ricerche (Italy)
     • Helsinki Institute of Physics – Finland
     • Institut de Fisica d'Altes Energies (IFAE) – Spain
     • Istituto Trentino di Cultura (IRST) – Italy
     • Konrad-Zuse-Zentrum für Informationstechnik Berlin – Germany
     • Royal Netherlands Meteorological Institute (KNMI)
     • Ruprecht-Karls-Universität Heidelberg – Germany
     • Stichting Academisch Rekencentrum Amsterdam (SARA) – Netherlands
     • Swedish Research Council – Sweden

5. EDG structure: work packages
   The EDG collaboration is structured in 12 work packages:
   • WP1: Workload Management System
   • WP2: Data Management
   • WP3: Grid Monitoring / Grid Information Systems
   • WP4: Fabric Management
   • WP5: Storage Element
   • WP6: Testbed and Demonstrators
   • WP7: Network Monitoring
   • WP8: High Energy Physics Applications
   • WP9: Earth Observation
   • WP10: Biology
   • WP11: Dissemination
   • WP12: Management
   (WP8-WP10 are the application work packages)

6. Project Schedule
   • Project started on 1/1/2001
   • TestBed 0 (early 2001)
     • International testbed 0 infrastructure deployed
     • Globus 1 only, no EDG middleware
   • TestBed 1 (now)
     • First release of EU DataGrid software to defined users within the project:
       HEP experiments, Earth Observation, Biomedical applications
   • Project successfully reviewed by the EU on March 1st, 2002
   • TestBed 2 (end 2002)
     • Builds on TestBed 1 to extend the facilities of DataGrid
   • TestBed 3 (2nd half of 2003)
   • Project completion expected by end 2003

7. EDG middleware GRID architecture (layer diagram)
   • APPLICATIONS
   • Local computing: Local Application, Local Database
   • Grid Application Layer: Data Management, Metadata Management, Job Management
   • Collective Services: Grid Scheduler, Information & Monitoring, Replica Manager
   • Underlying Grid Services: Computing Element Services, Storage Element Services,
     Replica Catalog, Authorization Authentication and Accounting, Service Index,
     SQL Database Services
   • Grid Fabric services (GLOBUS): Fabric Monitoring and Fault Tolerance,
     Node Installation & Management, Fabric Storage Management, Resource Management,
     Configuration Management
   (The Grid Application Layer, Collective Services and Underlying Grid Services
   form the EDG middleware, layered over Globus and the fabric services.)

8. EDG interfaces (annotated architecture diagram)
   The same layered architecture as the previous slide, annotated with the people and
   site systems the EDG interfaces connect to:
   • People: Scientists (local applications), Application Developers (grid application
     layer), System Managers (fabric services)
   • Security: Certificate Authorities (authorization, authentication and accounting)
   • Data: Object-to-File Mapping in the application layer; Replica Catalog and
     SQL Database Services underneath
   • Site fabric behind the EDG interfaces: Operating Systems, File Systems,
     User Accounts, Batch Systems (PBS, LSF, etc.) behind the Computing Elements,
     Mass Storage Systems (HPSS, Castor) behind the Storage Elements
   (A minimal sketch of this adapter idea follows this slide.)
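The central point of this diagram is that EDG services such as the Computing Element and Storage Element present one grid-facing interface and adapt it to whatever batch or storage system a site actually runs (PBS, LSF, HPSS, Castor). A minimal Python sketch of that adapter idea, assuming invented class and method names rather than the real EDG APIs:

```python
from abc import ABC, abstractmethod

class BatchSystem(ABC):
    """Site-local batch system hidden behind the Computing Element interface."""
    @abstractmethod
    def submit(self, job_script: str) -> str:
        """Submit a job script and return a local job identifier."""

class PBSBatchSystem(BatchSystem):
    def submit(self, job_script: str) -> str:
        # A real Computing Element would call the local qsub command here;
        # this sketch only shows that the grid layer never sees it.
        return f"pbs-{hash(job_script) & 0xffff}"

class LSFBatchSystem(BatchSystem):
    def submit(self, job_script: str) -> str:
        return f"lsf-{hash(job_script) & 0xffff}"

class ComputingElement:
    """Uniform grid-facing interface, regardless of the site's batch system."""
    def __init__(self, backend: BatchSystem):
        self.backend = backend

    def run(self, job_script: str) -> str:
        return self.backend.submit(job_script)

# A broker can treat every site the same way, whatever runs underneath:
ce = ComputingElement(PBSBatchSystem())
print(ce.run("#!/bin/sh\necho hello grid"))
```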

9. EDG overview: current project status
   • EDG currently provides a set of middleware services:
     • Job & data management
     • Grid & network monitoring
     • Security, authentication & authorization tools
     • Fabric management
   • Runs on the Linux Red Hat 6.1 platform
   • Site install & configuration tools and a set of common services are available
   • 5 core EDG 1.2 sites currently belong to the EDG testbed:
     • CERN (CH), RAL (UK), NIKHEF (NL), CNAF (I), CC-Lyon (F)
     • Also deployed on other testbed sites (~15) via CrossGrid, DataTAG and national grid projects
     • Actively used by application groups
   • Intense middleware development is continuously going on, concerning:
     • New features for job partitioning and check-pointing, billing and accounting
     • New tools for data management and information systems
     • Integration of network monitoring information into the brokering policies
       (a toy illustration of this idea follows this slide)
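The last bullet is about feeding network measurements into the Resource Broker's matchmaking. The toy ranking below only illustrates the general idea of trading off free CPUs against a network cost to the data; the field names, weights and site names are invented for this sketch and do not reflect the actual brokering policy:

```python
def rank_computing_elements(candidates, network_cost, alpha=1.0, beta=0.5):
    """Order candidate CEs by free CPUs, penalised by the network cost
    (e.g. estimated transfer time) of reaching the input data.

    candidates   : list of dicts like {"ce": "ce01.cern.ch", "free_cpus": 12}
    network_cost : dict mapping CE name -> cost of reaching the data
    """
    def score(ce):
        return alpha * ce["free_cpus"] - beta * network_cost.get(ce["ce"], 0.0)
    return sorted(candidates, key=score, reverse=True)

candidates = [
    {"ce": "ce01.cern.ch", "free_cpus": 12},
    {"ce": "ce.nikhef.nl", "free_cpus": 30},
]
cost = {"ce01.cern.ch": 2.0, "ce.nikhef.nl": 40.0}

# The CE close to the data wins despite offering fewer free CPUs.
print(rank_computing_elements(candidates, cost)[0]["ce"])
```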

10. Testbed Sites (map)
    Dubna, Moscow, Lund, Estec, KNMI, RAL, Berlin, IPSL, Prague, Paris, Brno, CERN,
    Lyon, Santander, Milano, Grenoble, PD-LNL, Torino, Madrid, Marseille, BO-CNAF,
    Pisa, Lisboa, Barcelona, ESRIN, Roma, Valencia, Catania

11. Tutorials
    The tutorials are aimed at users wishing to "gridify" their applications using EDG
    software and are organized over 2 full consecutive days:
    • December 2 & 3 - Edinburgh
    • December 5 & 6 - Turin
    • December 9 & 10 - NIKHEF
    To date about 120 people have been trained.
    http://hep-proj-grid-tutorials.web.cern.ch/hep-proj-grid-tutorials/
    Day 1
    • Introduction to Grid computing and overview of the DataGrid project
    • Security
    • Testbed overview
    • Job submission
    • (lunch)
    • Hands-on exercises: job submission (see the sketch after this slide)
    Day 2
    • Data management
    • LCFG, fabric management & software distribution & installation
    • Applications and use cases
    • Future directions
    • (lunch)
    • Hands-on exercises: data management
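For readers who have not attended the hands-on sessions, the sketch below shows roughly what submitting a job looks like: a small JDL (Job Description Language) file handed to the workload management command-line tools. The attribute values, file names and the exact command name are illustrative assumptions, not taken from the slide:

```python
# A minimal sketch of "gridifying" a job with the EDG tools: write a JDL
# description, then hand it to the workload management CLI. Attribute values
# are placeholders; the command name and options may differ between releases.
import subprocess
import textwrap

jdl = textwrap.dedent("""\
    Executable    = "analyse.sh";
    Arguments     = "run2002.dat";
    StdOutput     = "analyse.out";
    StdError      = "analyse.err";
    InputSandbox  = {"analyse.sh"};
    OutputSandbox = {"analyse.out", "analyse.err"};
""")

with open("analyse.jdl", "w") as f:
    f.write(jdl)

# Submission command name assumed from the tutorial material, not the slide.
subprocess.run(["edg-job-submit", "analyse.jdl"], check=True)
```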

12. Related Grid projects
    GriPhyN, PPDG, iVDGL
    • Through links with sister projects, there is the potential for a truly global
      scientific applications grid
    • Demonstrated at IST2002 and SC2002 in November

13. Details of release 1.3
    Addresses bugs found by applications in EDG 1.2; being deployed in November.
    • Improved data management tools
      • GDMP v3.2
        • Automatic & explicit triggering of staging to MSS
        • Support for parallel streams (configurable)
      • edg-replica-manager v2.0 (see the usage sketch after this slide)
        • Uses GDMP for MSS staging
        • Shorter aliases for commands (e.g. edg-rm-l for edg-replica-manager-listReplicas)
        • New file management commands: getBestFile, cd, ls, cat, etc.
        • Support for parallel streams (configurable)
    • Better fabric management
      • Bad RPMs no longer block installation
    • Based on Globus 2.0beta, but with binary modifications taken from Globus 2.2:
      • Large file transfers (GridFTP)
      • "Lost" jobs (GASS cache)
      • Unstable information system (MDS 2.2)
      • New Replica Catalog schema
    • More reliable job submission
      • Resource Broker returns errors if overloaded
      • Stability tests successfully passed
      • Minor extensions to JDL
    Available on Linux RH 6. Not backward compatible with EDG 1.2.
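As a hedged illustration of how an application script might drive the data-management commands listed above, here is a minimal Python wrapper. Only the command names come from the slide; the logical file name and the way arguments are passed are assumptions of this sketch:

```python
# Wrapping the EDG 1.3 data-management CLI from a user script.
# Command names follow the slide; the file name and argument style are
# placeholders for this sketch, not the documented interface.
import subprocess

def list_replicas(lfn: str) -> str:
    """Ask the replica manager where copies of a logical file live."""
    # Long command form shown on the slide; the short alias would be edg-rm-l.
    result = subprocess.run(
        ["edg-replica-manager-listReplicas", lfn],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

print(list_replicas("higgs-candidates-2002.root"))
```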

14. Incremental steps for testbed 2
    Prioritized list of improvements to be made to the current release, as established
    with users from September through to the end of 2002. Expect this list to be
    updated regularly.
    • Fix "show-stoppers" for application groups (middleware WPs, continuous)
    • Build EDG 1.2.x with autobuild tools
    • Improved (automatic) release testing
    • Automatic installation & configuration procedure for a pre-defined site
    • Start autobuild server for RH 7.2 and attempt a build of release 1.2.x
    • Updated fabric management tools
    • Introduce prototypes in parallel to existing modules: RLS, R-GMA,
      GLUE-modified info providers/consumers, Storage Element v1.0
    • Introduce Reptor
    • Add network cost function
    • GridFTP server access to CASTOR
    • Introduce VOMS
    • Improved Resource Broker
    • LCFGng for RH 7.2
    • Storage Element v2.0
    • Integrate mapcentre and R-GMA
    • Storage Element v3.0

15. Plans for the future
    • Further developments in 2003
      • Further iterative improvements to middleware, driven by users' needs
      • More extensive testbeds providing more computing resources
      • Prepare EDG software for future migration to the Open Grid Services Architecture (OGSA)
    • Interaction with the LHC Computing Grid (LCG)
      • LCG intends to make use of the DataGrid middleware
      • LCG is contributing to DataGrid:
        • Testbed support and infrastructure
        • Access to more computing resources in HEP computing centres
        • Testing and verification: reinforce the testing group and maintain a certification testbed
        • Fabric management and middleware development
    • New EU project
      • Make plans to preserve the current major asset of the project: probably the
        largest Grid development team in the world
      • EoI for FP6 (www.cern.ch/egee-ei), possible extension of the project, etc.
