Information Technology Revolution! Enabling Grids for E-Science in Europe
Fabrizio Gagliardi, EGEE Project Director
Ireland, April 2004
EGEE is a project funded by the European Union under contract IST-2003-508833
Background
• Technology evolution has made science increasingly digital and dominated by data, hence the term “data-intensive” science
• By the end of the 1990s, networking, commodity computing and distributed software tools had matured enough for Grid technology to start becoming available
• Grid computing is a key activity of the EU programmes
• Many publicly funded projects (in the US and in the EU) have been launched since
• We are ready for an IT revolution!
Data intensive sciences
• Physics/Astronomy (data from different kinds of research instruments)
• Medical/Healthcare (imaging, diagnosis and treatment)
• Bioinformatics (study of the human genome and proteome to understand genetic diseases)
• Nanotechnology (design of new materials from the molecular scale)
• Engineering (design optimization, simulation, failure analysis and remote instrument access and control)
• Natural Resources and the Environment (weather forecasting, earth observation, modeling and prediction of complex systems: river floods and earthquake simulation)
What is the Grid? (I)
• The World Wide Web provides seamless access to information that is stored in many millions of different geographical locations
• In contrast, the Grid is a new computing infrastructure which provides seamless access to computing power and data distributed over the globe
• The name Grid was chosen by analogy with the electric power grid: plug in to computing power without worrying where it comes from, just as you would plug in a toaster
What is the Grid? (II)
• The Grid relies on advanced software, called middleware, which ensures seamless communication between different computers and different parts of the world
• The Grid search engine will not only find the data the scientist needs, but also the data processing techniques and the computing power to carry them out
• It will distribute the computing task to wherever in the world there is available capacity, and send the result back to the scientist
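The last bullet above, sending a task to wherever capacity is available, can be illustrated with a toy sketch. This is not the EGEE or DataGrid middleware: the `Site` record, the `choose_site` function and the site and dataset names are all hypothetical, and real resource brokering weighs many more factors (policies, queues, network cost).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical description of one computing centre."""
    name: str
    free_cpus: int
    datasets: frozenset


def choose_site(sites, cpus_needed, dataset):
    """Pick the site that holds `dataset` and has the most spare CPUs.

    Returns None when no site can take the job.
    """
    candidates = [
        s for s in sites
        if dataset in s.datasets and s.free_cpus >= cpus_needed
    ]
    return max(candidates, key=lambda s: s.free_cpus, default=None)


# Illustrative data only: names and numbers are made up.
sites = [
    Site("site-a", 40, frozenset({"run-1"})),
    Site("site-b", 120, frozenset({"run-1", "run-2"})),
    Site("site-c", 300, frozenset({"run-2"})),
]

chosen = choose_site(sites, cpus_needed=50, dataset="run-1")
print(chosen.name if chosen else "no capacity available")
```

In this sketch the scientist only names the data and the CPUs required; which centre actually runs the job is decided by the broker, which is the point of the power-grid analogy.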
Grid Challenges
• Share data between thousands of scientists with multiple interests
• Link major computer centres, not just PCs
• Ensure all data is accessible anywhere, anytime
• Grow rapidly, yet remain reliable for more than a decade
• Cope with different management policies of different centres
• Ensure data security: more is at stake than just money!
The Vision
• An international network of scientists will be able to model a new flood of the Danube in real time, using meteorological and geological data from several centers across Europe
• A team of engineering students will be able to run the latest 3D rendering programs from their laptops using the Grid
• A geneticist at a conference, inspired by a talk she hears, will be able to launch a complex bio-molecular simulation from her mobile phone
Access to a production quality Grid will change the way science and much else is done
Who will use the Grid?
• Computational scientists & engineers: large scale modeling of complex structures
• Experimental scientists: storing and analyzing large data sets
• Collaborations: large scale multi-institutional projects
• Corporations: global enterprises and industrial partnership
• Environmentalists: climate monitoring and modeling
• Training & education: virtual learning rooms and laboratories
Prototypes: DataGrid (I)
• 9.8 M Euros EU funding over 3 years (twice as much from partners)
• 90% for middleware and applications
• 3 major applications: High Energy Physics, Earth Observation, Biomedical
• Total of 21 partners, over 150 scientists, engineers and programmers from research and academic institutes as well as industrial companies
• Three-year phased developments & demos (2001-2003)
• Several improved versions of middleware software (final release end 2003)
• Software used by partner projects: DataTAG, CROSSGRID, GRACE
Successful Final Review in February 2004
Prototypes: DataGrid (II)
• DataGrid testbed: more than 1000 CPUs at more than 15 sites (up to 40)
• Connections made possible by the EU-funded GEANT project
    • connecting more than 30 countries across Europe
    • speeds of up to 10 Gbit/s
    • high data throughput
    • quality of service
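To give a feel for the 10 Gbit/s figure above, the sketch below estimates ideal transfer times. It is a back-of-the-envelope calculation only: it assumes the full link rate is available to a single transfer and ignores protocol overhead, and the function name `transfer_seconds` is made up for this illustration.

```python
def transfer_seconds(num_bytes, link_bits_per_second):
    """Ideal time to move `num_bytes` over a link, ignoring all overhead."""
    return num_bytes * 8 / link_bits_per_second


TEN_GBIT_PER_S = 10 * 10**9   # peak rate quoted on the slide
TERABYTE = 10**12
PETABYTE = 10**15

print(transfer_seconds(TERABYTE, TEN_GBIT_PER_S))          # seconds for 1 TB
print(transfer_seconds(PETABYTE, TEN_GBIT_PER_S) / 86400)  # days for 1 PB
```

Under these idealized assumptions a terabyte moves in 800 seconds, while a petabyte still takes a little over nine days.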
International Grid Projects
[Figure: international Grid projects, including GriPhyN, PPDG and iVDGL]
EGEE: Why?
• Current Grid R&D projects will run to completion within the next few months or the next year
• The EGEE partners have already made major progress in aligning national and regional Grid R&D efforts, in preparation for EGEE
• EGEE will preserve the current strong momentum of the European Grid community and the enthusiasm of the hundreds of young European researchers already involved in EU Grid projects (>150 in EDG alone)
EGEE manifesto: Enabling Grids for E-science in Europe
• Goal
    • Create a Europe-wide, production quality Grid infrastructure on top of present and future EU Research Network (RN) infrastructure
• Build on:
    • Major investments in Grid technology by the EU and EU member states
    • International connections (US and AP)
    • Several pioneering prototype results
    • Large Grid development teams in the EU, which require a major EU funding effort
• Approach
    • Leverage current and planned national and regional Grid programmes
    • Work closely with relevant industrial Grid developers, NRENs and US-AP projects
[Figure labels: Applications; Grid infrastructure; GEANT network]
EGEE: Partners
• Leverage national resources in a more effective way for broader European benefit
• 70 leading institutions in 27 countries, federated in regional Grids
EGEE Applications
• EGEE scope: all-inclusive for academic applications (open to the industrial and socio-economic world as well)
• The major success criterion of EGEE: how many satisfied users from how many different domains?
• 5000 users (3000 after year 2) from at least 5 disciplines
• Two pilot applications selected to guide the implementation and certify the performance and functionality of the evolving infrastructure: Physics & Bioinformatics
[Figure note: application domains and timelines are for illustration only]
The pilot applications
• High Energy Physics with the LHC Computing Grid (www.cern.ch/lcg) relies on a Grid infrastructure to store and analyse petabytes (10^15 bytes) of real and simulated data. LCG is a major source of resources and requirements, with hard deadlines and no conventional solution available
• In biomedicine, several communities are facing equally daunting challenges in coping with the flood of bioinformatics and healthcare data. They need access to large, distributed, non-homogeneous data and have significant on-demand computing requirements
Why High Energy Physics?
• CERN is building the Large Hadron Collider (LHC), the most powerful instrument ever built to investigate elementary particles
• LHC will collide beams of protons at very high energy (14 TeV!)
• Using the latest superconducting technologies, it will operate at about -271 °C (1.9 K), just above absolute zero
• With its 27 km circumference, the accelerator will be the largest superconducting installation in the world
LHC Challenges
• The computational requirements of the experiments that will use the LHC are enormous: 12-14 petabytes (1 PB = 10^15 bytes) of data will be generated each year, the equivalent of more than 20 million CDs
• Where will the experiments store all of these data?
• LHC data analysis requires computing power equivalent to ~100,000 of today's fastest PC processors!
• Where will the experiments find such computing power?
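The CD comparison above can be checked with a few lines of arithmetic. The slide does not say which CD capacity it assumes; the 700 MB default below is an assumption made for this sketch, and the helper name `cds_needed` is hypothetical.

```python
PETABYTE = 10**15            # bytes, as defined on the slide
CD_BYTES = 700 * 10**6       # assumption: 700 MB per CD (not stated in the talk)


def cds_needed(petabytes, cd_bytes=CD_BYTES):
    """Number of whole CDs needed to hold `petabytes` PB (rounded up)."""
    total_bytes = petabytes * PETABYTE
    return (total_bytes + cd_bytes - 1) // cd_bytes


for yearly_volume in (12, 14):
    print(yearly_volume, "PB per year ->", cds_needed(yearly_volume), "CDs")
```

With 700 MB discs the upper end of the range, 14 PB, comes to 20 million CDs; with smaller 650 MB discs the count rises further, which is consistent with the slide's "more than 20 million".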
LCG
• LCG: a collaboration of
    • The LHC experiments
    • The Regional Computing Centres
    • Physics institutes
• Mission:
    • Prepare and deploy the computing environment that will be used by the experiments to analyse the LHC data
• Strategy:
    • Integrate thousands of computers at dozens of participating institutes worldwide into a global computing resource
    • Rely on software being developed in advanced grid technology projects, both in Europe and in the USA
LCG Testbed
• PIC-Barcelona
• IFIC Valencia
• Ciemat Madrid
• UAM Madrid
• USC Santiago de Compostela
• UB Barcelona
• IFCA Santander
• BNL
• Budapest
• CERN
• CNAF
• Torino
• Milano
• FNAL
• FZK
• Krakow
• Moscow
• Prague
• RAL
• Imperial C.
• Cavendish
• Taipei
• Tokyo
Sites to enter soon: CSCS Switzerland, Lyon, NIKHEF; more tier2 centres in Italy, UK
Sites preparing to join: Pakistan, Sofia
Computer Centers
[Figure labels: CPU servers; Disk servers; 2.5 MW power; Tape silos and servers]
EGEE Network
• The EGEE infrastructure will be built on the EU Research Network GEANT
• The infrastructure will provide interoperability with other Grids around the globe, including the US and Asia, contributing to efforts to establish a worldwide Grid infrastructure
• The core infrastructure of the EGEE grid will be operated as a single service, and will grow out of the LCG service
EGEE Activities
• 24% Joint Research
    • JRA1: Middleware Engineering and Integration
    • JRA2: Quality Assurance
    • JRA3: Security
    • JRA4: Network Services Development
• 28% Networking
    • NA1: Management
    • NA2: Dissemination and Outreach
    • NA3: User Training and Education
    • NA4: Application Identification and Support
    • NA5: Policy and International Cooperation
• 48% Services
    • SA1: Grid Operations, Support and Management
    • SA2: Network Resource Provision
32 M Euros EU funding (2004-5), O(100 M) total budget. Emphasis in EGEE is on operating a production grid and supporting the end-users.
EGEE “Virtuous Cycle”
• A new scientific community makes first contacts to EGEE through outreach events organized by Networking Activities
• Follow-up meetings by applications specialists may lead to definition of new requirements for the infrastructure
• If approved, the requirements are implemented by the Middleware Activities
• After integration and testing, the new middleware is deployed by the Service Activities
• The Networking Activities then provide appropriate training to the community in question, so that it becomes an established user
• Peer communication and dissemination events featuring established users then attract new communities
EGEE Operations Structure
[Figure labels: Operations Center; Infrastructure; Regional Support Center (support for applications, local resources); Resource Center (processors, disks); Grid server nodes]
EGEE Service Activity (I)
• Create, operate, support and manage a production quality infrastructure
• Offered services:
    • Middleware deployment and installation
    • Software and documentation repository
    • Grid monitoring and problem tracking
    • Bug reporting and knowledge database
    • VO services
    • Grid management services
EGEE Service Activity (II)
• Resource Centers
    • Month 1: 10
    • Month 15: 20
EGEE Middleware Activity
• Hardening and re-engineering of existing middleware functionality, leveraging the experience of partners
• Activity concentrated in a few major centers
• Key services:
    • Resource Access
    • Data Management (CERN)
    • Information Collection and Accounting (UK)
    • Resource Brokering (Italy)
    • Quality Assurance (France)
    • Grid Security (Northern Europe)
    • Middleware Integration (CERN)
    • Middleware Testing (CERN)
EGEE Networking Activity
• Dissemination and outreach
    • Led by TERENA
• User training and induction
    • Led by the University of Edinburgh (NeSC)
• Application identification and support
    • Two pilot application centers (for high energy physics and biomedical grids)
    • One more generic component dealing with longer term recruitment and support of other communities
• Policy and international cooperation
    • Establish Grid policy forum
    • Coordinate relations with other projects (EU and beyond)
[Figure note: map points indicate federations and are not geographically precise]
EGEE and Industry
• Industry will benefit from EGEE in several ways:
• As partner
    • through collaboration with individual EGEE partners, participate in specific activities where relevant skills and manpower are available
    • increase know-how on Grid technologies
• As user
    • specific industrial sectors will be targeted as potential users of the Grid infrastructure for R&D applications
    • particularly attractive to high-tech SMEs (major computing resources within grasp)
• As provider
    • long-term maintenance of established Grid services (call centres, support centres and computing resource provider centres)
Conclusions
• EGEE is expected to deliver a production Grid infrastructure for scientific applications in Europe
• This will allow both more cost-effective solutions to currently understood problems and the tackling of problems which were so far considered too difficult to approach
• This will be a tremendous opportunity for European science first and eventually also for commerce and industry
• With Ireland holding the European presidency, it was only natural to inaugurate EGEE in this country, with the first EGEE conference starting on April 18th in Cork
Further Information
To know more: Come to Cork!
• EU EGEE: www.eu-egee.org
• EU DataGrid: www.eu-edg.org
• Other Grid projects: www.gridstart.org