GridPP Overview
Tony Doyle
Collaboration Meeting
Outline
• High level LHC, Expt. and Grid Plans..
• 2005 Outturn and the Goldilocks Problem..
• "Some of the challenges for next year"
• gLite is not too late..
• Pulling together on the good ship Grid
• Coming together to enable a discovery
• Plans and Resolutions (from the year of the rooster to the year of the dog..)
A. When Cometh the LHC?
"Main objectives are to terminate installation in February 2007 and enable first collisions in summer 2007"
Lyn Evans
B. When Cometh the Detectors?
e.g. ATLAS: "with good will and great efforts from everybody we can be confident that the Technical Coordination Team will manage to have ATLAS installed by June 2007"
C. When Cometh the Grid?
• Service Challenges – UK deployment plans
• End point April '07
• Context: the virtual LHC Computing Centre
Grid Overview
• Aim: by 2008 (full year's data taking; a back-of-envelope check of these figures follows this list)
  • CPU ~100 MSi2k (100,000 CPUs)
  • Storage ~80 PB
  • Involving >100 institutes worldwide
• Build on complex middleware being developed in advanced Grid technology projects, both in Europe (gLite) and in the USA (VDT)
• Prototype went live in September 2003 in 12 countries
• Extensively tested by the LHC experiments in September 2004
• 197 sites, 13,797 CPUs, 5 PB storage in September 2005
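As a sanity check, here is a minimal back-of-envelope sketch of the 2008 targets quoted above. The headline figures come from the slide; the per-CPU and per-site numbers are derived here purely for illustration.

```python
# Back-of-envelope check of the 2008 capacity targets quoted above.
# Inputs (100 MSi2k, ~100,000 CPUs, ~80 PB, >100 institutes) are from
# the slide; everything derived from them is illustrative only.

total_cpu_msi2k = 100.0    # total compute, millions of SpecInt2000 units
n_cpus = 100_000           # approximate CPU count
storage_pb = 80.0          # total storage, petabytes
n_institutes = 100         # ">100 institutes worldwide"

per_cpu_ksi2k = total_cpu_msi2k * 1000 / n_cpus  # kSi2k per CPU
per_site_pb = storage_pb / n_institutes          # naive even split

print(f"~{per_cpu_ksi2k:.1f} kSi2k per CPU")                  # ~1.0 kSi2k
print(f"~{per_site_pb:.2f} PB per institute (if split evenly)")
```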
2005 Metrics and Quality Assurance: wider concerns
2005 Tier-1 GOC Accounting
2005 Grid and Non-Grid Tier-1/A CPU Use
• Grid fraction ~50%
2005 Grid and Non-Grid Tier-1/A CPU Use
(Chart: per-user CPU occupancy against the 70% "target" occupancy; a sketch of how such accounting fractions are derived follows below.)
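A hypothetical sketch of how the Grid fraction and occupancy figures on these two accounting slides could be computed from per-job records. The record fields and numbers below are assumptions for illustration, not the actual GOC accounting schema.

```python
# Hypothetical derivation of "Grid fraction" and occupancy from per-job
# accounting records. Fields (cpu_hours, submitted_via_grid) and the
# capacity figure are invented, not the real GOC schema.

jobs = [
    {"cpu_hours": 120.0, "submitted_via_grid": True},
    {"cpu_hours": 300.0, "submitted_via_grid": False},
    {"cpu_hours": 180.0, "submitted_via_grid": True},
]

grid_cpu = sum(j["cpu_hours"] for j in jobs if j["submitted_via_grid"])
total_cpu = sum(j["cpu_hours"] for j in jobs)

capacity_hours = 1000.0  # deliverable CPU hours in the period (assumed)
occupancy = total_cpu / capacity_hours

print(f"Grid fraction: {grid_cpu / total_cpu:.0%}")  # cf. ~50% on the slide
print(f"Occupancy:     {occupancy:.0%} (70% 'target')")
```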
2005 Tier-1 Efficiency
overall efficiency = Σ(CPU time) / Σ(wall time)   (illustrated in the sketch below)
• Jan '05: pre-Grid, efficiency high
• Apr '05: remote data access problems
• Dec '05: general improvements
• Mixed view from "VOs"
• Need to test data access; I/O-bound jobs possible
• Good to have this data!
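A minimal illustration of the efficiency definition above, using made-up job records. It also shows how a single I/O-bound job (low CPU/wall ratio, e.g. stalled on remote data access) drags the average down, which is the point of the data-access testing bullet.

```python
# Minimal illustration of the slide's definition:
#   overall efficiency = sum(CPU time) / sum(wall time)
# Job tuples are made-up numbers for illustration only.

jobs = [
    (3500.0, 3600.0),  # (cpu_seconds, wall_seconds): CPU-bound, ~97%
    ( 400.0, 3600.0),  # I/O-bound or stalled on remote data, ~11%
    (3000.0, 3300.0),  # ~91%
]

efficiency = sum(cpu for cpu, _ in jobs) / sum(wall for _, wall in jobs)
print(f"overall efficiency = {efficiency:.0%}")
```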
Some of the challenges for next year (see Jamie's talk)
• Castor 2
  • Good progress, rapid reaction to problems
  • But some way still to go with testing – stressing reliability, performance
  • Can only be done with participation of experiments
  • Distribution to other sites being planned
• Distributed database services
  • Architecture and plan agreed now
  • But still to deploy pilot services – timing is worryingly tight
• End-to-end testing of the DAQ–T0–T1 chain
  • Recording, calibration and alignment, reconstruction, distribution
• Full Tier-1 workload testing
  • Recording, reprocessing, ESD distribution, analysis, Tier-2 support
• Understanding the CERN Analysis Facility
  • Batch analysis
  • Interactive analysis
• Startup scenarios
  • Schedule may be better known after next spring's Chamonix meeting
High Level View
• The view from the recent LHCC review is roughly:
  • Service Challenges – OK (established)
  • Throughput – not OK (not sufficiently tested)
  • Baseline services – OK (defined, not completely established)
  • Practical steps – OK (we need to improve communication)
  • MoUs – OK (we need to sign off; covered last meeting)
• Concerns:
  • Significant delay in gLite, Castor2, distributed data management, database services: all are late
  • Middleware and experiment connections are too weak
  • Analysis models are untested
gLite Stack (www.glite.org)
• 15 Baseline Services for a functional Grid
• We rely upon gLite components
• This middleware builds upon VDT (Globus and Condor) and meets the requirements of all the basic scientific use cases:
  • Green (amber) areas are (almost) agreed as part of the shared generic middleware stack by each of the application areas
  • Red areas are those where generic middleware competes with application-specific software
gLite Pack
Middleware Re-engineering (www.glite.org)
• A series of gLite releases has been produced (1.1, 1.2, 1.3 and 1.4)
  • Driven by application and deployment needs
  • Focus on defect fixing
• gLite deployed on a Pre-Production Service and made available for application use
  • Independent evaluation by NGS
  • gLite components also available via VDT (US)
• gLite components deployed on the infrastructure
• Emphasis is now on release of gLite 1.5
• Will continue… see Steve Fisher's talk
  • EGEE phase 2 starts in April 2006
Some of the challenges for next year
• File transfers (see the throughput sketch after this list)
  • Good initial progress (except dCache->DPM, currently)
  • But some way still to go with testing – stressing reliability, performance
  • Can only be done with participation of experiments
  • Distribution to other sites being planned
• Distributed VO services
  • Plan agreed – T1 will sign off and then VO boxes may be deployed by T2s
  • But still to deploy pilot services – CMS (OK), LHCb (OK), ATLAS, ALICE
• End-to-end testing of the T0–T1–T2 chain
  • MC production, reconstruction, distribution
• Full Tier-1 workload testing
  • Recording, reprocessing, ESD distribution, analysis, Tier-2 support
• Understanding the "Analysis Facility"
  • Batch analysis @ T1 and T2
  • Interactive analysis
• Startup scenarios
  • Schedule is known at high level and defined for Service Challenges – testing time ahead (in many ways)
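A hedged sketch of the kind of throughput and reliability measurement the file-transfer challenge calls for. transfer_file() is a hypothetical stand-in for a real transfer client (e.g. an FTS or GridFTP call); nothing here is the actual Service Challenge tooling.

```python
# Sketch of a bulk-transfer reliability/throughput test.
# transfer_file() is a hypothetical placeholder, not a real grid client.

import random
import time

def transfer_file(size_gb: float) -> bool:
    """Placeholder transfer: sleep briefly and fail ~10% of the time."""
    time.sleep(0.01)
    return random.random() > 0.1

def measure(n_files: int, size_gb: float) -> None:
    """Run n_files transfers, report success rate and effective rate."""
    start, ok = time.time(), 0
    for _ in range(n_files):
        ok += transfer_file(size_gb)
    elapsed = time.time() - start
    print(f"success rate: {ok / n_files:.0%}, "
          f"throughput: {ok * size_gb / elapsed:.1f} GB/s (simulated)")

measure(n_files=50, size_gb=1.0)
```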
Themes for 2006
Think how you can help by either:
1. measuring throughput for experiments;
2. testing gLite, providing feedback;
3. working practically with experiments;
4. running your analysis on the Grid;
5. helping Grid adoption (in other fields);
OR
6. combining elements of 1–5.
Emphasis on end-to-end (vertical) integration.
Pulling together with the experiments?
• Hopefully the effort in pulling the Grid boat out is more equal..
• However, many discoveries made in Grid circles are currently being re-discovered by experiment users
• Succinct user documentation will help
Pulling together with the experiments?
• There are currently not enough users in the Grid School for the Gifted
• Having smart users helps (the current ones are)
• The system may be too complex, requiring too much work by the user?
• Or the (virtual) help desk may not be enough?
• Or the documentation may be misleading?
2005: Functional Tests, 2006: File Transfers
(Chart: total Grid sites and number of sites passing the SFT tests; some log data lost.)
• Successful year of Functional Tests, with the bar raised throughout the year
• "Functional" ≠ "Performant"
• Need to test: network; file transfers; file placement; file collection – working with experiments (see the sketch below)
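An illustrative calculation of the metric the lost SFT chart plotted: the fraction of registered sites passing the Site Functional Tests. The site names and results below are invented for the example.

```python
# Fraction of sites passing the Site Functional Tests (SFT) --
# the quantity tracked on the (lost) chart above. Data is invented.

sft_results = {
    "RAL-LCG2": True,
    "UKI-SCOTGRID-GLA": True,
    "UKI-LT2-IC-HEP": False,
    "UKI-NORTHGRID-MAN": True,
}

passing = sum(sft_results.values())  # True counts as 1
print(f"{passing}/{len(sft_results)} sites passing SFT "
      f"({passing / len(sft_results):.0%})")
```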
Come together..
• Physics discovery requires many elements to work.. The Icemen Cometh
Timeline:
• 2004–05: "Functional Testing"
• 2006–07: "Performance Testing"
• 2008–09: "Physics discovery"
A (Light) Summary
• Priors indicate a Higgs mass 114.4 < mH < 219 GeV
• The Grid Service will be launched in April 2007
• The Detectors will be complete in June 2007
• The LHC will provide first collisions in summer 2007
• These will enable data analyses such that the Higgs will be discovered on May 29th 2009… probably (large corrections)
• If the Higgs particle is discovered, the Grid will be one of three major components
2005: The Year of the Tier-1
• "World's biggest grid seeks secrets of the universe"
• "It's in Didcot and it's running on open source"
• silicon.com, published Thursday 24 November 2005
• http://www.silicon.com/publicsector/0,3800010403,39154492,00.htm
Summary
• This meeting focuses on the challenges of a New Year
• The Old Year was the year of the (Tier-1) rooster
  • Tier-1/A utilisation was too low (at the start) and too high (in the end), but just right overall?
  • Tier-2 utilisation was too low (throughout)
• The New Year is the year of the (Tier-2) dog?
  • The seamless vision of a T0–T1–T2 structure hidden behind a transparent Grid requires (relatively rapid) testing
• The New Year will be better, if we resolve to make the (many) T1↔T2 and Grid↔Expt. performance tests we have (but they must all ultimately be successful)