
Compute Grids, Data Grids and Service Grids




Presentation Transcript


  1. Compute Grids, Data Grids and Service Grids Dr Neil Geddes, CCLRC Head of e-Science, Director of the UK Grid Operations Centre

  2. Compute Grids, Data Grids and Service Grids - What they are - What they can do - Where they can be found - What the future holds in this arena

  3. Compute Grids, Data Grids and Service Grids What are they?

  4. What is a computational grid? • A pool of computational resources that can be “plugged into” via standard interfaces. • Processors • Data storage devices • Instruments

  5. Compute Grids • Focus on high-throughput computing • Clusters of computers • Some very big • Clusters of clusters • HPC meta-computing • HPC + pre- and post-processing • Grids enable coordination across administrative boundaries • Key components: • Authentication, Authorisation • Resource discovery • Job submission/retrieval • Networking [Image: NASA Information Power Grid]
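The key components above combine into a simple job lifecycle: authenticate, discover a suitable resource, submit, and later retrieve the result. A minimal Python sketch of that flow; GridSession, the site names and the job-description fields are hypothetical illustrations, not a real middleware API:

```python
from dataclasses import dataclass

@dataclass
class JobDescription:
    executable: str    # program to run on the remote cluster
    arguments: list    # command-line arguments
    cpu_count: int     # processors requested
    wall_minutes: int  # maximum run time

class GridSession:
    """One user's authenticated session with a grid gatekeeper (hypothetical)."""

    def __init__(self, user_proxy):
        # Authentication: present a proxy credential derived from the
        # user's certificate (the path below is illustrative).
        self.user_proxy = user_proxy

    def discover_resources(self, min_cpus):
        # Resource discovery: ask a stubbed information service which
        # sites can satisfy the request.
        info_service = {"ral.example": 512, "ipg.example": 1024}
        return [s for s, cpus in info_service.items() if cpus >= min_cpus]

    def submit(self, site, job):
        # Job submission: hand the description to the chosen site and
        # get back an opaque identifier for later status/retrieval calls.
        return f"{site}/jobs/0001"

session = GridSession(user_proxy="/tmp/x509up_u1000")
site = session.discover_resources(min_cpus=64)[0]
job_id = session.submit(site, JobDescription("simulate", ["--steps", "1000"], 64, 120))
print("submitted:", job_id)
```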

  6. Data Grids • Focus on • Large data volumes • Coordinated data access • Heterogeneous and distributed data • Importance of metadata • e.g. • Virtual Observatories • Medical images • Important components • Authentication, Authorisation • Resource discovery • Data transfer • Confidentiality • Networking [Image: the same sky seen in X-ray, optical, infra-red and radio]
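The metadata emphasis above is what separates a data grid from plain file transfer: users select data by what it is, and the grid resolves where replicas of it live. A small sketch of that two-catalogue lookup, with both catalogues reduced to in-memory dictionaries and all names invented:

```python
# Metadata catalogue: logical file name -> descriptive attributes.
metadata = {
    "sky-survey-0042": {"instrument": "x-ray", "region": "M31"},
    "sky-survey-0043": {"instrument": "radio", "region": "M31"},
}

# Replica catalogue: logical file name -> physical copies at grid sites.
replicas = {
    "sky-survey-0042": ["gsiftp://siteA.example/data/0042",
                        "gsiftp://siteB.example/data/0042"],
    "sky-survey-0043": ["gsiftp://siteB.example/data/0043"],
}

def find_files(**attrs):
    """Select logical files whose metadata matches all given attributes."""
    return [name for name, meta in metadata.items()
            if all(meta.get(k) == v for k, v in attrs.items())]

for name in find_files(region="M31"):
    # A real client would now choose the "closest" replica and transfer it.
    print(name, "->", replicas[name][0])
```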

  7. Service Grids • Focus on • Everything else: • What you want to do rather than how it is done • Integrate audio-visual tools • Remote control and tele-presence • Microscopes, Beamlines, test equipment • Integrated with compute and data grids • Integrate with other services • Journal archives, website management • Service-based architectures • Web services • Important components • Authentication, Authorisation • Resource discovery • Data transfer • Confidentiality • Common Interfaces
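"Common interfaces" in this era meant web services: any client that can post a SOAP envelope over HTTP can drive the service. A sketch of such a call; the endpoint URL, namespace and operation name are invented, and the request is only constructed, not sent:

```python
import urllib.request

# A minimal SOAP envelope invoking a (fictitious) instrument service.
envelope = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getBeamlineStatus xmlns="http://grid.example/services"/>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    "http://grid.example/services/beamline",   # hypothetical endpoint
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": "getBeamlineStatus"},
)
# urllib.request.urlopen(request) would return the XML response;
# it is not executed here because the endpoint is fictitious.
```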

  8. Common Grid Features • Authentication • Authorisation • Accounting • Resource discovery • Data transfer • Confidentiality • Security • Automation. Different emphasis for different deployments/problems. Grid computing is about common standards/interfaces to enable inter-enterprise, collaborative computing.

  9. Compute Grids, Data Grids and Service Grids What can they do? Where can they be found?

  10. (some) US Grid Projects: • Information Power Grid (IPG) Production Grid for aerosciences and other NASA missions. • Network for Earthquake Eng. Simulation Grid (NEESgrid) Production Grid for earthquake engineering. • National Virtual Observatory (NVO) Production Grids for data analysis in astronomy. • Particle Physics Data Grid (PPDG) Production Grids for data analysis in high energy and nuclear physics. • Southern California Earthquake Center 2 Full geophysics modeling using Grids and knowledge-based systems. • TeraGrid U.S. science infrastructure linking four major resource sites at 40 Gb/s. • DOE Science Grid (DOESG) Supplies persistent Grid services. • EdGrid Promotes applications of modeling and visualization in science and mathematics education, and remote control of instruments (electron microscope) for K-12. • Biomedical Informatics Research Network (BIRN) An NCRR initiative aimed at creating a testbed to address biomedical researchers' need to access and analyze data at a variety of levels of aggregation located at diverse sites throughout the country.

  11. UK e-Science Projects • CLEF A Co-operative Clinical e-Science Framework • BiosimGRID A GRID Database for biomolecular simulations • e-HTPX An e-Science resource for High Throughput Protein Crystallography • AstroGrid A Virtual Observatory for the UK • BAIR Biological Atlas of Insulin Resistance • climateprediction.net Distributed computing for a global climate (NERC Pilot) • DAME Distributed Aircraft Maintenance Environment

  12. UK e-Science Projects (continued) • e-Protein A distributed pipeline for structure-based proteome annotation using GRID technology • e-Minerals Environment from the molecular level: an e-Science proposal for modelling the atomistic processes involved in environmental issues • Integrative Biology A robust and fault-tolerant Grid infrastructure for biomedical science • GENIE Grid Enabled Integrated Earth system model • GEODISE Grid Enabled Optimisation & Design Search for Engineering • myGrid Directly Supporting the E-Scientist • Comb-e-Chem Structure-Property Mapping: Combinatorial Chemistry & the Grid • NERC DataGrid Data discovery and delivery for the NERC community • GridPP The Grid for UK Particle Physics

  13. EU-funded Grid Projects

  14. e-Science and the UK Grid [Image: LHC experiments: CMS, ATLAS, LHCb]

  15. LHC Computing Grid Project

  16. climateprediction.net • Launch ensemble of coupled simulations of 1950-2000 and compare with observations. • Largest climate model ensemble ever (by a factor of >200) • >45,000 users, >15,000 complete model runs, >1,000,000 model years in ~3 months (equivalent to 1.5 Earth Simulators) • "Screensaver" requires 10 CPU days on a 1.4GHz P4, >128MB memory, 600MB disk space • Global outreach (participants on all 7 continents, inc. Antarctica!) • Generated much interest in schools (coolkidsforacoolclimate.com)
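A quick consistency check of those throughput numbers; the per-run figure below is derived from the slide's totals, not quoted on it:

```python
# Figures from the slide.
complete_runs = 15_000
model_years_total = 1_000_000
months = 3

# Derived: average simulated years per completed run (assumption: the
# quoted totals refer to the same set of runs).
years_per_run = model_years_total / complete_runs
print(f"~{years_per_run:.0f} model years per completed run")

# Sustained rate of simulated time over the ~3-month campaign.
rate = model_years_total / (months * 30 * 24)   # model years per wall-clock hour
print(f"~{rate:.0f} model years simulated per hour across the ensemble")
```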

  17. http://www.nbirn.net

  18. What is BIRN? • Testbed for a biomedical knowledge infrastructure • Creation and support of federated bioscience databases • Data integration • Interoperable analysis tools • Datamining software • Scalable and extensible • Driven by research-needs pull, not technology push

  19. BIRN Today • Established three neuroscience testbeds building on previously funded R01 research projects (Mouse BIRN, Morph BIRN, Functional BIRN), plus a BIRN Coordinating Center. • Integrating the activities of the advanced biomedical imaging and clinical research centers in the US. • Developing hardware and software infrastructure for managing distributed data: creation of data grids. • Exploring data using "intelligent" query engines that can make inferences upon locating "interesting" data. • Building bridges across tools and data formats. • Changing the use pattern for research data from the individual laboratory/project to shared use.
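A toy illustration of the federated-database idea behind these testbeds: one query is fanned out to each site's database and the hits merged, so researchers see a single logical collection. Site names and record schemas are invented:

```python
# Each site holds its own records; nothing is centralised.
site_databases = {
    "siteA": [{"subject": "m-001", "modality": "MRI",  "species": "mouse"}],
    "siteB": [{"subject": "h-104", "modality": "fMRI", "species": "human"}],
    "siteC": [{"subject": "m-217", "modality": "MRI",  "species": "mouse"}],
}

def federated_query(predicate):
    """Run the same predicate against every site and merge the hits."""
    hits = []
    for site, records in site_databases.items():
        hits += [(site, r) for r in records if predicate(r)]
    return hits

# "Find all mouse imaging data, wherever it is stored."
for site, record in federated_query(lambda r: r["species"] == "mouse"):
    print(site, record["subject"], record["modality"])
```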

  20. BIRN: a network IT infrastructure to hasten the derivation of new understanding and treatment of disease through the use of distributed knowledge

  21. About NEESgrid NEESgrid will link earthquake researchers across the U.S. with leading-edge computing resources and research equipment, allowing collaborative teams (including remote participants) to plan, perform, and publish their experiments. • Through the NEESgrid, researchers will: • perform tele-observation and tele-operation of experiments; • publish to and make use of a curated data repository using standardized markup; • access computational resources and open-source analytical tools; • access collaborative tools for experiment planning, execution, analysis, and publication. • The components of the NEESgrid system will be completed by September 2004, when management and operation of the NEES system will be turned over to a consortium of earthquake engineering researchers and practitioners.
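A minimal sketch of "publishing using standardized markup": an experiment record serialised as XML before upload to the curated repository. The element names here are illustrative, not the actual NEESgrid schema:

```python
import xml.etree.ElementTree as ET

# Build a small, structured experiment record (all values invented).
experiment = ET.Element("experiment", id="shake-table-007")
ET.SubElement(experiment, "facility").text = "siteA.example"
ET.SubElement(experiment, "specimen").text = "two-storey steel frame"
ET.SubElement(experiment, "sensorCount").text = "128"

# A real client would upload this document to the repository service;
# here we just print the markup that would be submitted.
print(ET.tostring(experiment, encoding="unicode"))
```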

  22. Generic Experiment in Progress (an instance or “test”)

  23. Compute Grids, Data Grids and Service Grids What does the future hold?

  24. The drive toward standardisation • The Global Grid Forum (GGF) is a community-initiated forum of thousands of individuals from industry and research leading the global standardization effort for grid computing. GGF's primary objectives are to promote and support the development, deployment, and implementation of Grid technologies and applications via the creation and documentation of "best practices" - technical specifications, user experiences, and implementation guidelines. • OASIS is a not-for-profit, global consortium that drives the development, convergence and adoption of e-business standards: • Horizontal and e-business framework • Web Services • Security • Public Sector • Vertical industry applications • WS-RF (from GGF)

  25. Enabling Grids for E-science in Europe (EGEE)

  26. EGEE - Consortia • UK e-Science: PPARC + Core Programme • 10 European Consortia (incl. GEANT/TERENA/DANTE) + US + Russia

  27. Oxford and Leeds (White Rose Grid)

  28. Manchester and CCLRC-RAL

  29. Also includes: • http://www.csar.cfs.ac.uk/ : 256-processor Itanium2 SGI Altix and 512-processor Origin3800 • http://www.hpcx.ac.uk/ : full installation = 1600 IBM p690+ Regatta processors, currently 1236 processors • Data archives and applications: EMBL Nucleotide Sequences, NCBI, BLAST, EMBOSS, FASTA, Gaussian • Thus, the NGS provides access to over 2000 processors, over 36TB of "data-grid" capacity, common scientific applications and extensive data archives. • Other resource providers are anticipated to join in the future …

  30. More than just computation and data resources… • In future will include services to facilitate collaborative (grid) computing: • Authentication (PKI X509) • Job submission/batch service • Authorisation • Certificate management • Virtual Organisation management • Data access/integration services (SRB/OGSA-DAI/DQPS) • Information service • National Registry (of registries) • Data replication • Data caching • Grid monitoring • Accounting
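Among these services, Virtual Organisation management is what turns authentication into authorisation: a certificate's distinguished name is checked against VO membership before any action is allowed. A sketch of that check, with the DNs and VO names invented:

```python
# VO membership: which certificate subjects belong to which VO.
vo_members = {
    "gridpp": {"/C=UK/O=eScience/CN=alice"},
    "biomed": {"/C=UK/O=eScience/CN=bob"},
}

def authorise(user_dn, vo, action):
    """Allow an action only if the authenticated DN belongs to the VO."""
    allowed = user_dn in vo_members.get(vo, set())
    print(f"{action} by {user_dn} in VO '{vo}':",
          "granted" if allowed else "denied")
    return allowed

authorise("/C=UK/O=eScience/CN=alice", "gridpp", "submit job")   # granted
authorise("/C=UK/O=eScience/CN=alice", "biomed", "read data")    # denied
```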

  31. Concluding Remarks • Huge worldwide research activity • Push towards standardisation and intersection with e-Business (web services) • Increasing grid infrastructure deployed. '[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.' - Tony Blair, 2002

  32. The End

  33. Response of Atlantic circulation to freshwater forcing

  34. The Particle Physics Challenge (CMS, ATLAS, LHCb) • Storage: raw recording rate 0.1 - 1 GByte/sec, accumulating at ~10 PetaBytes/year; 10 PetaBytes of disk • Processing: >100,000 of today's fastest PCs
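Those storage figures hang together if one assumes roughly 10^7 seconds of data-taking per year; that live-time is an assumption made here for the arithmetic, not a number stated on the slide:

```python
# Recording rates from the slide, in bytes/second.
rate_low, rate_high = 0.1e9, 1.0e9
live_seconds_per_year = 1e7   # assumed live-time, not from the slide

for rate in (rate_low, rate_high):
    petabytes = rate * live_seconds_per_year / 1e15
    print(f"{rate/1e9:.1f} GB/s for 1e7 s -> ~{petabytes:.0f} PB/year")
# The high end reproduces the quoted ~10 PetaBytes/year.
```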

  35. CERN/LHC Community • Europe: 267 institutes, 4603 users • Elsewhere: 208 institutes, 1632 users
