Grid Challenges
It’s the vision, stupid… but it NEEDS TO be followed by operational standards based on real applications…
The Global Grid Forum, 25 June 2003
Gordon Bell, Microsoft Corporation
A quick look at some past visions and a challenge
• NREN >> Internet
• WWW
• Challenge: Will match any Grid-enabled application that wins a Gordon Bell Prize for parallelism
FCCSET NREN Plan 11/1987
[Figure: network bandwidth on a log scale (10K to 10G) versus year (1988 to 2000). Plotted steps: 56K; Phase 1, 1.5 M; Phase 2, 45 M; 3 G optical. Annotation: a factor of 1000 makes a difference.]
ARPAnet Goals c1972 = Grid Goals
[Figure: Originating bandwidth (Gb/s), U.S. interstate comm. traffic (L. Roberts, ’92), on a log scale (1 to 10,000) versus year (1990 to 2020). Curves labeled: Video Conf., Voice, Video on Demand, Email, NSF bb, FAX, Broadcast TV.]
Growth in hype vs reality
[Figure, c 1995. Curves labeled: WWW books, newspapers; Infoway regulation; Infoway speculation, “how great it’ll be” (politicians, telecoms & futurists); Infoway addiction; conferences; lawsuits. Data from Gordon’s WAG.]
Articles per newspaper versus orders per second sent via Internet
[Figure, c 1995. Curves labeled: orders per second; articles per newspaper. Data from Gordon’s WAG.]
Articles about security, privacy, & fraud versus commerce ($M)
[Figure, c 1995. Curves labeled: actual commerce; articles about risk and NOT doing commerce; organized crime on Internet. Data from Gordon’s WAG.]
The virtuous cycle of bandwidth supply and demand
[Diagram: a cycle with the stages Increased demand, Increase capacity (circuits & bw), Create new service, and Lower response time, plus Standards. Service and standards labels: IP, Telnet & FTP, EMAIL, FTP, WWW, Audio, Video, Voice!, Video Conf., Web Svcs, Grids.]
For More Information
• Grid Book, c1998 from 1996 (www.mkp.com/grids): 651 pp., 22 chapters, 41 authors
• Grid Computing, 2003: 1080 pages, 43 chapters, O(100) authors
• The Globus Project™: www.globus.org
• OGSA: www.globus.org/ogsa
• Global Grid Forum: www.gridforum.org
Progress... a review
• Grid started out with great promise… c1998
  • Interesting use at NASA for coupled programs
• NMI (National Middleware Infrastructure)… State_Tools.gov, funded by NSF.gov: clearly open, clearly not “free”, not IETF model
• Tools vs. standards & evolving working code
• Some examples:
  • c1980: Seti@home, folding@home, >> Napster p2p
  • 2001: 15 TB Terraserver > Terraservice w/Web Services
  • 2003: Alex Szalay & Jim Gray: Skyserver/skyservice
  • Cornell Theory Center Web Services based apps
  • NEES: good poster child. An XML task
  • GrADS and Teragrid… dream or research or just $$s?
To the rescue! TerraServer Experience c2001
• Successful Web Site
  • 50,000 daily users satisfied with “human-accessible” data
  • 59 GB imagery transmitted daily
• New Feature Requests
  • Programmable access to meta-data
  • User selectable image sizes, i.e. “a map server”
  • Permission to use TerraServer data within server applications
TerraService Architecture
[Diagram labels. Clients: Smart Clients (WindowsForms, .NET Framework, ADO.NET); Standard Browsers (HTML Map UI, Web Forms). Servers: .NET TerraServer Web Service (XML); Existing Map Server (Http Handler, image/jpeg). Storage: DB Server reached over OLEDB, three boxes labeled SQL 2000, 1.0 TB Db; 668 M rows.]
Data Intensive Science: the Next Frontier
The W.M. Keck Fellows in Advanced Scientific Data Analysis
Alex Szalay
The Johns Hopkins University
Department of Physics and Astronomy
National Virtual Observatory
• NSF ITR project, “Building the Framework for the National Virtual Observatory”, is a collaboration of 17 funded and 3 unfunded organizations
  • Astronomy data centers
  • National observatories
  • Supercomputer centers
  • University departments
  • Computer science/information technology specialists
• PI and project director: Alex Szalay (JHU)
• CoPI: Roy Williams (Caltech/CACR)
Scientific Data Exploration
• Thousand years ago: science was empirical
  • describing natural phenomena
• Last few hundred years: theoretical branch
  • using models, generalizations
• Last few decades: a computational branch
  • simulating complex phenomena
• Today: data exploration is emerging
  • synthesizing theory, experiment and computation with advanced data management and statistics
Living in an Exponential World
• Astronomers have a few hundred TB now
  • 1 pixel (byte) / sq arc second ~ 4TB
  • Multi-spectral, temporal, … → 1PB
• They mine it looking for new (kinds of) objects, more of interesting ones (quasars), density variations in 400-D space, correlations in 400-D space
• Data doubles every year
• Data is public after 1 year
• So, 50% of the data is public
• Same trend appears in all sciences
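The “50% is public” bullet can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the slide means that the cumulative archive doubles each year and that data is released exactly one year after it is taken; the function name `public_fraction` and its parameters are illustrative, not from the talk.

```python
def public_fraction(years: int, embargo_years: int = 1) -> float:
    """Fraction of the archive that is already public.

    Assumption: the cumulative archive doubles every year, so it holds
    2**t units after t years, and data becomes public once it is
    embargo_years old. Everything that existed embargo_years ago is public.
    """
    total = 2.0 ** years
    public = 2.0 ** (years - embargo_years)
    return public / total


# The fraction does not depend on how long the archive has been growing.
for t in (1, 5, 10):
    print(t, public_fraction(t))
```

Under these assumptions the fraction is 0.5 for any age of the archive; a two-year embargo would leave only a quarter of the data public.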
Why Is Astronomy Special?
[Figure: image panels labeled ROSAT ~keV, DSS Optical, 2MASS 2μ, IRAS 25μ, IRAS 100μ, GB 6cm, NVSS 20cm, WENSS 92cm]
• It has no commercial value
  • No privacy concerns, freely share results with others
  • Great for experimenting with algorithms
• It is real and well documented
  • High-dimensional (with confidence intervals)
  • Spatial, temporal
• Diverse and distributed
  • Many different instruments from many different places and many different times
• The questions are interesting
• There is a lot of it (soon petabytes)
• GB: It is not over-funded aka it’s poor
Making Discoveries
• When and where are discoveries made?
  • Always at the edges and boundaries
  • Going deeper, collecting more data, using more colors…
• Metcalfe’s law
  • Utility of computer networks grows as the number of possible connections: O(N²)
• VO: Federation of N archives
  • Possibilities for new discoveries grow as O(N²)
• Current sky surveys have proven this
  • Very early discoveries from SDSS, 2MASS, DPOSS
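The O(N²) claim is just a count of pairs: N archives can be cross-matched in N(N-1)/2 distinct ways. A small sketch, using the three surveys named on the slide as sample data; the helper name `pairwise_links` is an assumption for illustration.

```python
from itertools import combinations


def pairwise_links(n: int) -> int:
    """Number of distinct pairs among n federated archives: n(n-1)/2."""
    return n * (n - 1) // 2


# Enumerate the pairs explicitly for the surveys named on the slide.
archives = ["SDSS", "2MASS", "DPOSS"]
pairs = list(combinations(archives, 2))
print(pairs)

# The pair count grows quadratically as archives are added.
for n in (3, 10, 20):
    print(n, pairwise_links(n))
```

Going from 10 to 20 archives roughly quadruples the number of possible cross-matches (45 to 190), which is the sense in which federation multiplies the opportunities for discovery.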
What can be learned from Sky Server?
• It’s about data, not about harvesting flops
• 1-2 hr. query programs versus 1 wk programs based on grep
• 10 minute runs versus 3 day compute & searches
• Database viewpoint. 100x speed-ups
  • Avoid costly re-computation and searches
  • Use indices and PARALLEL I/O. Read / Write >>1.
  • Parallelism is automatic, transparent, and just depends on the number of computers/disks.
• Limited experience and talent to use dbases.
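The “use indices” point can be illustrated on a toy table. This is only a sketch: it uses Python’s built-in `sqlite3` as a stand-in for the SQL Server database behind SkyServer, and the table `objects`, the column `mag`, and the index `idx_mag` are hypothetical names. It shows the query planner switching from a full scan (the grep-like approach) to an index search, while the answer stays the same.

```python
import random
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, mag REAL)")
rng = random.Random(0)  # fixed seed so the toy data is reproducible
conn.executemany(
    "INSERT INTO objects (mag) VALUES (?)",
    ((rng.uniform(10.0, 25.0),) for _ in range(10000)),
)

sql = "SELECT id, mag FROM objects WHERE mag BETWEEN ? AND ?"
params = (14.0, 14.1)

# Without an index, every row is examined.
plan_before = query_plan(conn, sql, params)
rows_before = sorted(conn.execute(sql, params).fetchall())

# With an index on the searched column, only the matching range is read.
conn.execute("CREATE INDEX idx_mag ON objects (mag)")
plan_after = query_plan(conn, sql, params)
rows_after = sorted(conn.execute(sql, params).fetchall())

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, so the sketch prints it rather than assuming a particular string.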
Soon: The Virtual Observatory
• Many new surveys are coming
  • SDSS is a dry run for the next ones
  • LSST will be 5TB/night
• All the data will be on the Internet
  • ftp, web services…
• Data and applications will be associated with the instruments
  • Distributed world wide, cross-indexed
  • Federation is a must
• Will be the best telescope in the world
  • World Wide Telescope
  • Finds the “needle in the haystack”
• Successful demonstrations in Jan ’03
Emerging Concepts
• Standardizing distributed data access
  • Web Services, supported on all platforms
  • XML: Extensible Markup Language
  • SOAP: Simple Object Access Protocol
  • WSDL: Web Services Description Language
• Standardizing distributed computing
  • Grid Services
  • Custom configure remote computing dynamically
  • Build your own remote computer, and discard
  • Virtual Data: new data sets on demand
• Both needed for Data Exploration
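To make the XML/SOAP bullets concrete, here is a minimal sketch of what a SOAP 1.1 request looks like on the wire, built and parsed with Python’s standard library only. The service namespace `http://example.org/sky`, the operation `ConeSearch`, and its parameters are hypothetical and do not describe any real service; in practice a WSDL file would publish the operations and message shapes, and a toolkit would generate this plumbing.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.org/sky"  # hypothetical service namespace


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]  # first child of Body
    operation = call.tag.split("}", 1)[1]  # strip the namespace part
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


xml_text = build_request("ConeSearch", {"ra": 185.0, "dec": 15.5, "radius": 0.1})
print(xml_text)
print(parse_request(xml_text))
```

Because the message is plain XML, any platform with an XML parser can produce or consume it, which is the platform-neutrality argument the slide makes.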
Computational Science Simulations based on Web Services
Gerd Heber
Cornell Theory Center
heber@tc.cornell.edu
Three Flavors of Adaptivity
• Application-level
  • Mathematical model
  • High/low confidence
• Algorithm-level
  • Discretization method
  • Solution technique
• System-level
  • Resource availability
  • Fault tolerance
The Problem
• Do distributed, coupled and adaptive multi-physics simulations of
  • Mechanics of chemically-reacting flows
  • (Damage) Thermo-Mechanics of solids
• Components provided as Web Services
Geography
• Cornell University
  • Theory Center
  • Department of Computer Science
  • Department of Civil Engineering
• University of Alabama
• Mississippi State University
• College of William and Mary
Components
• MiniCAD
• Meshers
  • Surface (Delaunay, quality guarantees)
  • Volume (Dmesh, Jmesh, Gmesh)
• Fluid/Thermal simulation (Loci, CHEM)
• Thermo-mechanical component (CPTC)
• Fracture mechanics
• Visualization (OpenDX + SQL Server)
Web Services
• “Web Services are self-contained, modular applications that can be described, published, located, and invoked over a network, …” (IBM)
• Service oriented architecture: Publish, find, bind
• XML, SOAP, UDDI, WSDL
Features and Requirements
• Distributed expertise
• No porting
• Network accessibility (“firewall compliant”)
• Platform and language neutrality
• Security
• Industry standards
• Metadata
• State
• Students shouldn’t waste too much time with coding!
GrADS Vision
• Build a National Problem-Solving System on the Grid
  • Transparent to the user, who sees a problem-solving system
• Software Support for Application Development on Grids
  • Goal: Design and build programming systems for the Grid that broaden the community of users who can develop and run applications in this complex environment
• Challenges:
  • Presenting a high-level application development interface*
    • If programming is hard, the Grid will not reach its potential
  • Designing and constructing applications for adaptability
    • Late mapping of applications to Grid resources
  • Monitoring and control of performance
    • When should the application be interrupted and remapped?
*GB note: This is a superset of the previously unsolved clusters programming problem!
GrADSoft Architecture
[Diagram labels: Source Application; Whole-Program Compiler; Software Components; Libraries; Configurable Object Program; Resource Negotiator; Negotiation; Scheduler; Binder; Grid Runtime System; Real-time Performance Monitor; Performance Problem; Performance Feedback]
Network for Earthquake Eng. Simulation
• NEESgrid: US national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
• On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
www.neesgrid.org
A Universal Architecture for Web Services… Microsoft Vision
• “Scales Away” spans organizations & geographies
• “Scales Out” by adding machines
• “Scales Up” on large systems
• “Scales In” on a machine
• “Scales Down” to devices
[Diagram labels. Messaging Infrastructure: Security, Reliable Messaging, Transactions, Routing, … Targets: Distributed applications, Vertical processes, Embedded systems, Network equipment, …]
Web Services: Level I, Foundation to Build Upon
• Basic profile
  • Defined by WS-I
  • XML, SOAP, WSDL, UDDI
• Broad vendor support
• WS-I assures widespread compatibility
Level II: Secure, Reliable, Transacted
[Diagram labels, layered stack: Connected Applications; Business Process Management; …; Secure, Reliable, Transacted; Metadata; Messaging; XML; Transports]
Level III: From Infrastructure to Solutions
• Application schemas
• Domain specific profiles
• Vertical industry services
Vision: Community/Data-Centric Computing Versus Machine-Centered Centers
• Goal: Enable technical communities to create and take responsibility for their own computing environments of personal, data, and program collaboration and distribution
• Design based on technology and cost, e.g. networking, apps programs maintenance, databases, and providing 24x7 web and other services
• Many alternative styles and locations are possible
  • Service from existing centers, including many state centers
  • Software vendors could be encouraged to supply apps web services
  • NCAR style center based on shared data and apps
  • Instrument- and model-based databases. Both central & distributed when multiple viewpoints create the whole.
  • Wholly distributed services supplied by many individual groups
Community/Data Centric: “web service”
• Community is responsible
  • Planned & budgeted as resources
  • Responsible for its infrastructure
  • Apps are from community
  • Computing is integral to work
• In sync with technologies
  • 1-3 Tflops/$M; 1-3 PBytes/$M to buy smallish Tflops & PBytes.
• New scalables are “centers”
  • Community can afford and evolve
  • Dedicated to a community
  • Program, data & database centric
  • May be aligned with instruments or other community activities
• Output = web service; Can communities form that can supply services?
Commitment to standards
• A general architecture comes largely from understanding the problems
• Understanding the problems comes from actually solving such problems
• This is bottom-up, based on experience
• Microsoft is committed to developing community-wide web services standards…
• Is the Grid Forum equally committed?
The End
How can GRIDs become a real, useful, computer structure?
• Get a life.
• Use the standards and tools.
• Adopt an application and/or community… now!