NPACI All Hands Meeting 2003

NPACI All Hands Meeting 2003. Francine Berman Director, NPACI and SDSC HPC Chair, Department of Computer Science and Engineering berman@sdsc.edu. Welcome to the NPACI AHM!. Thrust Partners. Resource Partners. International Partners. Sponsoring Partners. Associate Partners.


Presentation Transcript


  1. NPACI All Hands Meeting 2003. Francine Berman, Director, NPACI and SDSC; HPC Chair, Department of Computer Science and Engineering. berman@sdsc.edu

  2. Welcome to the NPACI AHM! Thrust Partners, Resource Partners, International Partners, Sponsoring Partners, Associate Partners

  3. FY’03 provides a time both to consolidate the contributions of NPACI and to look ahead to the future. Two goals: • Deliver on the promise of the original NPACI goals • Enable NPACI to provide a solid foundation for NSF’s evolving cyberinfrastructure. Focus on the Future. NSF Cyberinfrastructure. NPACI AHM 2002.

  4. We are delivering on our promises: Goals of the 1996 NPACI Proposal • Create a distributed, national metacomputing infrastructure to benefit the national community in science and engineering • Infrastructure to integrate data and computation • Deploy Teraflops-scale resources to solve problems at the forefront of numerically intensive computational science and engineering • Extend the metacomputing environment to enable data-intensive computing • Integrate computational science and engineering education activities into the infrastructure • Pursue collaborative projects to advance computing technology

  5. Building the future: PACI activities are forming the basis for cyberinfrastructure. National Science Foundation’s Cyberinfrastructure

  6. Building a Foundation for Cyberinfrastructure. Focus for 2003: An integrated, systems-oriented approach to building out the hardware, software and applications for the science and engineering community

  7. Focus of NPACI Efforts in 2003 • BIG USERS: Large-scale resources (computation, data) must accommodate large-scale usage (big runs, big collections) • BIG RESULTS: High-end resources should enable new science • BIG INFRASTRUCTURE: NPACI infrastructure should be both useful and usable by a substantive portion of the scientific community • NEW COMMUNITIES: NPACI activities should make it easier for new users to access high-end facilities and infrastructure

  8. Work in Progress – “Big Users” • Goal: Ensure that NPACI resources are being used maximally: big/long runs on big machines, big collections managed on large-scale data resources, wide spectrum of users (compute, data, grid) • How do we measure this? • Big users should rank highly in resource (compute, data, grid) utilization • Big grid users should use NPACI Grid for runs which demonstrate “value added” • Infrastructure-enabled applications should be demonstrable to the community at a visible venue such as SC.

  9. Big Users on NPACI Resources FY’02 • Big Data Users: NPACI worked with the Library of Congress to preserve and manage 8 TB (~7.5M digital objects) of “American Memory” and other historically important national collections. • Big Compute Users: Researchers at the AMNH, SDSC and NCSA simulated the birthplace of stars; rendering on Blue Horizon took 72 hours on 1152 processors at 1.7 trillion calculations/sec. The resulting data were used in the movie “The Search for Life: Are We Alone?” and in a paper given at SC2002. • Big Grid Users: Mark Ellisman, UCSD. The Global Telescience application coupled a remote electron microscope in Japan with distributed data and compute resources.
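To give a feel for the scale of the Blue Horizon rendering run on slide 9, the quoted figures can be multiplied out. This is only a rough sketch: it assumes the 1.7 trillion calculations/sec figure is an aggregate rate sustained for the full 72 hours, which the slide does not state. The function names are illustrative.

```python
# Figures quoted on slide 9.
HOURS = 72
PROCESSORS = 1152
RATE_PER_SEC = 1.7e12  # ASSUMPTION: aggregate rate, sustained for the whole run


def cpu_hours(hours, processors):
    """Processor-hours consumed when every processor is busy for the whole run."""
    return hours * processors


def total_calculations(hours, rate_per_sec):
    """Total calculations performed at a constant aggregate rate."""
    return hours * 3600 * rate_per_sec


if __name__ == "__main__":
    print("CPU-hours:", cpu_hours(HOURS, PROCESSORS))
    print(f"Total calculations: {total_calculations(HOURS, RATE_PER_SEC):.3e}")
```

Under these assumptions the run corresponds to roughly 83,000 processor-hours, which is a useful point of comparison for the allocation sizes mentioned elsewhere in the talk.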

  10. Big Users So Far in FY’03 • Big Data Users: Scientists at JHU, Caltech and other institutions confirmed the discovery of a new brown dwarf. Search time on 5M files went from months to minutes using NVO DB technology. • Big Compute Users: Michael Norman, UCSD, will use 300,000 CPU hours on Blue Horizon (and more at other sites) to investigate the formation of galaxies and other cosmological phenomena. • Big Grid Users: The Encyclopedia of Life is running at SDSC, Wisconsin and at international partners. The Bioinformatics Institute in Singapore recently annotated the Fugu (Puffer Fish) genome. 99 out of 800 genomes have been completed so far. To process all 800 would take a single processor working non-stop about 500 years.
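The Encyclopedia of Life numbers on slide 10 suggest why a grid is needed. The sketch below works through the arithmetic under two assumptions that are not in the slides: every genome costs the same amount of processing, and the work parallelizes perfectly. The one-year deadline is a hypothetical value chosen for illustration.

```python
import math

# Figures quoted on slide 10.
TOTAL_GENOMES = 800
GENOMES_DONE = 99
SINGLE_CPU_YEARS = 500  # single-processor years for all 800 genomes


def years_per_genome(total_years, total_genomes):
    """Average single-processor years per genome (ASSUMES uniform cost)."""
    return total_years / total_genomes


def remaining_cpu_years(total_years, total_genomes, done):
    """Single-processor years still needed for the unfinished genomes."""
    return (total_genomes - done) * years_per_genome(total_years, total_genomes)


def processors_needed(cpu_years, deadline_years):
    """Processors needed to finish by a deadline (ASSUMES perfect scaling)."""
    return math.ceil(cpu_years / deadline_years)


if __name__ == "__main__":
    left = remaining_cpu_years(SINGLE_CPU_YEARS, TOTAL_GENOMES, GENOMES_DONE)
    print("Remaining processor-years:", left)
    # Hypothetical deadline of one year, not a figure from the talk.
    print("Processors for a 1-year deadline:", processors_needed(left, 1.0))
```

Real genomes differ widely in size, so the per-genome average is only an order-of-magnitude guide.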

  11. Work in Progress – “Big Results” • Goal: Ensure that NPACI resources are being used to enable results that are quantifiably important to a science or engineering community • How do we measure this? • NPACI-enabled applications should demonstrate something worthy of publication in a respected venue for the relevant discipline. • Infrastructure-enabled applications should have big impact: community codes should be used by more than their developers

  12. Big Results for FY’02 • Big Data Results: A. Townsend Peterson and David Stockwell, U Kansas and UCSD. Large-scale integration of data sets provides a projection of how climate change may rearrange Mexican ecosystems. Nature, April 2002 • Big Compute Results: Michael Norman, UCSD. Simulation of the first star. Science, November 2001 • Big Grid Results: Thomas Bartol and Terrence Sejnowski, Salk, Joel Stiles, PSC, et al. New neuroscience results on the largest MCell models to date, run on distributed clusters and Blue Horizon. Proceedings of 2001 Society of Neuroscience

  13. Big Results So Far in FY’03 • Big Compute Results: Kim Baldridge at SDSC and Floyd Romesberg at TSRI investigated how antibody proteins fit together. Results are a fundamental step toward understanding how the immune system works. PNAS, January 2003 • Big Data Results: SDSC with AAAS and more than 100 other partners helped create the National Science Digital Library (NSDL). NSDL is a “library of libraries” which will serve students and teachers at all levels of education. NSDL currently contains pointers to 1.2 TB of information and will grow arbitrarily large. Launched December 2002 • Big Grid Results: Rich Wolski and Wahid Chrabakh at UCSB have been developing a grid-enabled Satisfiability solver to tackle SAT2002 challenges. Such problems arise in circuit design. Code currently running at UCSD, UCSB, UIUC, UTK and on Blue Horizon. 7 previously unsolved industrial problems solved.

  14. Making All Objectives Easier • Announcing “DATAStar” • Large memory, very fast interconnect, oriented for innovative data intensive, grid and compute-intensive applications • Target ~7 Teraflops, IBM Power4 architecture • To be installed 3Q03 • Will support big users, big results, big infrastructure and new communities • Will be part of NPACI Grid and the operational TeraGrid • Working with users and application developers to develop configuration • New data-oriented application models: very large data, heavy I/O, Databases • New grid-oriented application models: compute at A, store data at B, distributed supercomputing, on-demand • National workshop to be held this Spring • Get input from data and other users on how best to configure, allocate, administer machine

  15. Work in Progress – “New Communities” • Goal: Expand and diversify the community of users for NPACI resources. Increase the understanding of NPACI technology. • How do we measure this? • Continuing commitment and support for EOT activities • Increased and indigenous integration of education, outreach and training activities into NPACI applications and infrastructure activities • Increased focus on portals for access and information • Strong focus on building communities of Data-, Grid- and Compute-intensive users

  16. NPACI New Communities in FY’02 • Girl Scouts of America: 10,000 technology badges • Educators, Students and General Public: Simulation of the Orion Nebula for “Passport to the Universe,” and a simulation of the “Hayden” Nebula for “The Search for Life: Are We Alone?” at the American Museum of Natural History’s Hayden Planetarium. Since their opening, approximately 1.5 million people have seen the two shows. • Computational biologists: PDB portal, 5 million hits/month; 60 million hits/year • Under-represented groups: Tapia Celebration of Diversity, four hundred attendees; Grace Hopper Celebration of Women 2002, hundreds; AWIS (San Diego Chapter of the Association for Women in Science) website hosted by NPACI receives nearly 8000 hits/month • Teachers: TeacherTECH, attendees from Southern California; SDSU EdCenter faculty fellows working to incorporate computational science into undergraduate classes • Ecologists: Seamounts Online, a database about species found in seamount habitats (seamounts.sdsc.edu), average of 15,000 files downloaded per month

  17. Building New Communities in FY’03 • New Scientists and Users • New Collaborators and Partners • SCEC

  18. Work in Progress – “Big Infrastructure” • Goal: Demonstrate that NPACI infrastructure is useful and usable by a broad community of scientists, engineers and other users. • How do we measure this? • NPACI software must culminate in infrastructure that is robust and usable enough to be deployed, supported and used at all NPACI resource sites • NPACI infrastructure activities should culminate in inclusion of the infrastructure in NMI or the “NPACKage” • NPACI infrastructure should have a user community beyond the developers of the infrastructure and/or application collaborators: the more users, the better.

  19. NPACI Community infrastructure in FY’02 • APST: parameter sweep middleware for the Grid (more than a dozen apps) • NWS: monitoring and predicting resource usage and availability (> 70 sites) • Globus: de facto Grid services standard • DataCutter: support for subsetting and processing of large-scale data in distributed environments (> 100 sites) • SRB: uniform access to heterogeneous data resources (> 130 sites) • GridPort: technology collection designed to aid in the development of science and education portals on computational grids • NPACI ROCKS: software for easy deployment, management, upgrade and scaling of Linux clusters (> 200 sites) • Etc.
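For readers unfamiliar with the term, a parameter sweep runs the same task independently over every combination of input values, which is what makes it a natural fit for grid middleware. The sketch below illustrates only the general idea in plain Python; it is not APST's interface, and `parameter_grid`, `run_sweep` and `toy_model` are hypothetical names introduced here.

```python
from itertools import product


def parameter_grid(params):
    """Expand {name: [values]} into a list of dicts, one per combination."""
    names = list(params)
    return [dict(zip(names, values))
            for values in product(*(params[name] for name in names))]


def run_sweep(task, params):
    """Run task once per parameter combination and pair each point with its result.

    Each call is independent of the others, so a grid scheduler could farm the
    points out to different machines; here they simply run one after another.
    """
    return [(point, task(**point)) for point in parameter_grid(params)]


def toy_model(rate, steps):
    """Hypothetical stand-in for a real simulation code."""
    return (1 + rate) ** steps


if __name__ == "__main__":
    for point, result in run_sweep(toy_model, {"rate": [0.01, 0.05], "steps": [10, 20]}):
        print(point, result)
```

Because no point depends on another, the only coordination needed is distributing inputs and collecting outputs.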

  20. Infrastructure Development in FY’03: NPACKage • NPACKage: interoperable collection of NPACI SW targeted for national-scale distribution • April ’03 release: NMI-R2 [including Globus and NWS] + DataCutter, SRB, GridPort, Ganglia, GridSolve, LAPACK For Clusters (LFC), APST • Alpha version available to “friendly users” in the next couple of weeks • NPACKage being integrated into Alphas and other NPACI projects • Technology integration • All-to-all interoperability • Packaging and deployment • Maintenance • User support • Documentation • Consulting

  21. A Model for National Scale SW Deployment • NPACKage provides a model for development and deployment of national scale infrastructure • Community-based integration. Complements and stages SW for NMI • Blazes an evolutionary path from prototype to usable, production-ready SW • User services, documentation, SW repository and distribution, deployment support • All software must interoperate • Result is production-ready cyber-SW, deployed nationwide

  22. NPACI Resources form an “NPACI Grid” • Layers: NPACI Applications; NPACI Grid Middleware; NPACKage; Common infrastructure (NMI, etc.); NPACI resource sites • Apps should promote “big users”, “big results” and “new communities” • Integration: Apps should integrate and exercise NPACI SW as appropriate • NPACKage “big infrastructure” is being robustified, documented, tested and made available at all NPACI resource sites • Interoperability: NMI will be deployed with the latest release of NPACKage at each NPACI resource site

  23. NPACI Grid is Cyberinfrastructure 101 • NPACI Grid providing critical experience with cyberinfrastructure • Integration and coordination of all activities is key • Resources, NPACKage, Applications, Common Infra

  24. This is a Working Meeting • The NPACI AHM is a great place to • Build cyberinfrastructure through integration and coordination • Learn new things • Make great contacts • Start new collaborations • Give the NPACI EC feedback • Have fun! Tour SDSC, visit the beach, check out the gorgeous “Aesthetics of Science” show at the UCSD Faculty Club … • Do you have Big News about your “Big Results,” “Big Usage,” “Big Infrastructure,” and any other “Big” and “New” activities? • Please let us know -- tellus@npaci.edu. Read about it in Online & EnVision!
