The EGEE project and relations with China – Bob Jones, EGEE Project Director
LCG: The LHC Computing Grid Project – Ian Bird, Grid Deployment Group Leader
eScience • Science is becoming increasingly digital and must deal with growing amounts of data and computation • Simulations get ever more detailed • Nanotechnology – design of new materials from the molecular scale • Modelling and predicting complex systems (weather forecasting, river floods, earthquakes) • Decoding the human genome • Experimental science uses ever more sophisticated sensors to make precise measurements • Need high statistics • Huge amounts of data • Serves user communities around the world
How e-Infrastructures help e-Science [diagram: network, grid and knowledge infrastructure layers] • e-Infrastructures provide easier access for • Small research groups • Scientists from many different fields • Remote and developing countries • To new technologies • Produce and store massive amounts of data • Transparent access to millions of files across different administrative domains • Low-cost access to resources • Mobilise large amounts of CPU & storage at short notice (PC clusters) • High-end facilities (supercomputers) • And help to find new ways to collaborate • Develop applications using distributed complex workflows • Ease distributed collaborations • Provide new ways of community building • Give easier access to higher education
The LHC Accelerator The accelerator generates 40 million particle collisions (events) every second at the centre of each of the four experiments’ detectors
LHC Data • This flood is reduced by online computers that filter out a few hundred “good” events per second • These are recorded on disk and magnetic tape at 100–1,000 MegaBytes/sec • ~15 PetaBytes per year for all four experiments
Resources for LHC Data Handling (ALICE, ATLAS, CMS, LHCb) • 100,000 of today’s fastest processors • 15 PetaBytes of new data each year – 150 times the total content of the Web each year • 1 Petabyte (1 PB) = 1000 TB = 10 times the text content of the World Wide Web** ** Urs Hölzle, VP Operations at Google
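As a sanity check on these figures, a quick back-of-envelope calculation (a sketch only; the sustained recording rate and annual live time are assumptions, not quoted in the slides):

```python
# Rough check of the data-volume figures above. Assumed values are marked;
# only the 100-1,000 MB/s recording rate and the four experiments come
# from the slides.

RECORD_RATE_MB_S = 300         # assumption: mid-range sustained rate per experiment
LIVE_SECONDS_PER_YEAR = 1e7    # assumption: ~10^7 s of data taking per year
EXPERIMENTS = 4

per_experiment_pb = RECORD_RATE_MB_S * LIVE_SECONDS_PER_YEAR / 1e9  # MB -> PB
total_pb = per_experiment_pb * EXPERIMENTS
print(f"~{per_experiment_pb:.0f} PB per experiment, ~{total_pb:.0f} PB/year in total")
# -> ~3 PB per experiment, ~12 PB/year: the same order as the ~15 PB quoted
```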
The Worldwide LHC Computing Grid • Purpose • Develop, build and maintain a distributed computing environment for the storage and analysis of data from the four LHC experiments • Ensure the computing service • … and common application libraries and tools • Phase I – 2002–2005 – Development & planning • Phase II – 2006–2008 – Deployment & commissioning of the initial services
WLCG Collaboration • The Collaboration • 4 LHC experiments • ~140 computing centres • 12 large centres (Tier-0, Tier-1) • 38 federations of smaller “Tier-2” centres • ~35 countries • Resources • Contributed by the countries participating in the experiments • Commitment made each October for the coming year • 5-year forward look
LCG Service Hierarchy
Tier-0 – the accelerator centre (CERN) • Data acquisition & initial processing • Long-term data curation • Distribution of data to the Tier-1 centres
Tier-1 – “online” to the data acquisition process; high availability • Managed mass storage – grid-enabled data service • Data-heavy analysis • National, regional support
Tier-1 centres: Canada – TRIUMF (Vancouver); France – IN2P3 (Lyon); Germany – Forschungszentrum Karlsruhe; Italy – CNAF (Bologna); Netherlands – NIKHEF/SARA (Amsterdam); Nordic countries – distributed Tier-1; Spain – PIC (Barcelona); Taipei – Academia Sinica; UK – CLRC (Oxford); US – FermiLab (Illinois) and Brookhaven (NY)
Tier-2 – ~130 centres in ~35 countries • End-user (physicist, research group) analysis – where the discoveries are made • Simulation
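To make the fan-out concrete, here is a minimal sketch of the tiered data flow described above; the Tier-2 to Tier-1 pairings are illustrative placeholders, not the production topology:

```python
# Minimal sketch of the Tier-0 -> Tier-1 -> Tier-2 data flow described above.
# The Tier-2 -> Tier-1 pairings below are illustrative, not the real topology.

TIER1_OF_TIER2 = {
    "Tier-2 federation A": "IN2P3 (Lyon)",
    "Tier-2 federation B": "Forschungszentrum Karlsruhe",
}

def distribute_raw(dataset: str) -> list[str]:
    """Tier-0 (CERN) keeps the master copy and ships raw data to each Tier-1."""
    tier1s = sorted(set(TIER1_OF_TIER2.values()))
    return [f"CERN Tier-0 -> {t1}: {dataset}" for t1 in tier1s]

def fetch_for_analysis(tier2: str, dataset: str) -> str:
    """A Tier-2 pulls analysis input from its associated Tier-1."""
    return f"{TIER1_OF_TIER2[tier2]} -> {tier2}: {dataset}"

print(*distribute_raw("RAW/run1234"), sep="\n")
print(fetch_for_analysis("Tier-2 federation A", "AOD/run1234"))
```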
CPU Usage Accounted to LHC Experiments • [chart: accounted vs pledged CPU over the past 12 months] • Accounted/pledged ratio: 48–62% • A ramp-up of roughly 4–6× is needed over the next 8 months
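For scale, the sustained growth this ramp-up implies can be worked out directly, assuming a constant month-on-month growth factor (an illustrative simplification):

```python
# Implied monthly growth for a 4-6x ramp-up over 8 months, assuming a
# constant month-on-month growth factor (an illustrative simplification).

MONTHS = 8
for target in (4, 6):
    monthly_factor = target ** (1 / MONTHS)
    print(f"{target}x in {MONTHS} months -> "
          f"~{(monthly_factor - 1) * 100:.0f}% growth per month")
# -> 4x needs ~19%/month, 6x needs ~25%/month, sustained
```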
CPU Usage Accounted to LHC Experiments – July 2007 • 80 Tier-2s: 45% • 11 Tier-1s: 35% • CERN: 20% • 530M SI2K-days/month (CPU) • 9 PB disk at CERN + Tier-1s
2007 – CERN Tier-1 Data Distribution • [chart: average data rate per day by experiment (MBytes/sec), January–August 2007] • Need a factor of 2–3 more when the accelerator is running (achieved last year for basic file transfers, but this year’s tests are under more realistic experiment conditions)
LCG in China • China is involved in all 4 LHC experiments • 4 grid sites active in LCG/EGEE/EUChinaGrid • IHEP: Institute of High Energy Physics – Beijing • Shandong University • CNIC: Computer Network Information Centre – Beijing • Peking University • Basic infrastructure in place • Chinese Certification Authority, accredited in APGridPMA
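For illustration, checking who issued a grid user certificate (for example one from the APGridPMA-accredited Chinese CA) can be done with standard X.509 tooling; a minimal sketch using the third-party Python cryptography package, with a hypothetical file name:

```python
# Sketch: print the subject and issuer of a grid user certificate, e.g. to
# confirm it was issued by an APGridPMA-accredited CA. The file name is
# hypothetical; requires the third-party 'cryptography' package.

from cryptography import x509

with open("usercert.pem", "rb") as f:       # hypothetical certificate file
    cert = x509.load_pem_x509_certificate(f.read())

print("Subject:", cert.subject.rfc4514_string())
print("Issuer: ", cert.issuer.rfc4514_string())   # the issuing CA's DN
print("Valid:  ", cert.not_valid_before, "to", cert.not_valid_after)
```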
LCG depends on two major science grid infrastructures: EGEE – Enabling Grids for E-Science, and OSG – the US Open Science Grid
Worldwide grid for science • ~250 sites in 50 countries – some very big, some very small • >100 Virtual Organisations with >45 000 CPUs
EGEE Main Objectives • Operate a large-scale, production-quality grid infrastructure for e-Science • Attract new resources and users from industry as well as the sciences • Flagship grid infrastructure project co-funded by the European Commission • Now in its 2nd phase, with 91 partners in 32 countries
Registered Collaborating Projects • 25 projects have registered as of September 2007 • Infrastructures – geographical or thematic coverage • Support Actions – key complementary functions • Applications – improved services for academia, industry and the public
Project Information • EUChinaGRID is a Specific Support Action (SSA) funded under the EU VI Framework Programme, led by INFN (Italy) • Goal: support the interconnection and interoperability of grids between Europe and China • 10 partners (6 from Europe and 4 from China) • Duration of 2 years from January 2006 • www.euchinagrid.eu
Main Achievements • A pilot infrastructure using gLite is already working (11 sites) • A gateway between gLite (EGEE) and GOS (CNGrid) has been implemented • An IPv6 middleware compatibility study was done and the results are available (web site, code checker) • Applications are running: LHC, WISDOM, Late/Early Stage, ARGO Data Mover, Rosetta and others
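As an illustration of what such an IPv6 compatibility study checks for, here is a minimal sketch of address-family-agnostic client code; the endpoint is a placeholder, and this is not code from the gLite or GOS middleware:

```python
# Sketch of the kind of address-family-agnostic code an IPv6 compatibility
# study looks for: using getaddrinfo() instead of hard-coding IPv4, so the
# same client works on v4-only, v6-only and dual-stack hosts.

import socket

def connect_any(host: str, port: int) -> socket.socket:
    """Try every address returned by DNS, IPv6 or IPv4, until one connects."""
    last_err = None
    for family, socktype, proto, _, addr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        try:
            sock = socket.socket(family, socktype, proto)
            sock.connect(addr)
            return sock
        except OSError as err:
            last_err = err
    raise last_err or OSError("no addresses found")

# connect_any("gridservice.example.org", 8443)  # placeholder endpoint
```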
Other Applications
• Beijing Genomics Institute – Cancer Genome Project
• Beijing University of Technology – Aseismic Engineering; Drug Discovery
• Beihang University – Fluent
• Computer Network Information Center, Chinese Academy of Sciences – Physics and Chemistry Properties Resource Computation
• National Astronomical Observatories, Chinese Academy of Sciences – Statistical Analysis of Fe Abundances Gradients in the Galaxy
• Department of Astronomy, Peking University – Cosmological parameter constraints with the combination of different probes
• JUMC, Kraków, Poland – MR result processing
• GRNet/CERTH – COGENT++
• Peking University – Large Eddy Simulation, Multiscale Methods and Lattice Boltzmann Methods for Engineering Applications
WISDOM Malaria Data Challenge • ~13 CPU-years delivered in about 2 months – roughly 3% of the challenge total (100% = 420 CPU-years)
EGEE working with collaborating infrastructure projects
Summary • Grids are all about sharing – they are a means of working with groups around the world • Today we have a window of opportunity to move grids from research prototypes to permanent production systems (as networks did a few years ago) • Interoperability is key to providing the level of support required by our user communities – the EUChinaGrid project is working in this direction • EGEE operates the world’s largest multi-disciplinary grid infrastructure for scientific research, in constant and significant production use • A third phase of EGEE is under preparation • We have a fully operational grid service for LHC, built on the underlying EGEE and OSG infrastructures, which is in the final stages of preparation for the LHC start-up in 2008 www.eu-egee.org cern.ch/lcg CERN, 2 November 2007