
Presentation Transcript


  1. NERSC Status and Plans for the NERSC User Group Meeting, February 22, 2001. BILL KRAMER, DEPUTY DIVISION DIRECTOR; DEPARTMENT HEAD, HIGH PERFORMANCE COMPUTING DEPARTMENT. kramer@nersc.gov, 510-486-7577

  2. Agenda • Update on NERSC activities • IBM SP Phase 2 status and plans • NERSC-4 plans • NERSC-2 decommissioning

  3. ACTIVITIES AND ACCOMPLISHMENTS

  4. NERSC Facility Mission: To provide reliable, high-quality, state-of-the-art computing resources and client support in a timely manner, independent of client location, while wisely advancing the state of computational and computer science.

  5. 2001 GOALS • PROVIDE RELIABLE AND TIMELY SERVICE • Systems: Gross Availability, Scheduled Availability, MTBF/MTBI, MTTR (defined in the sketch below) • Services: Responsiveness, Timeliness, Accuracy, Proactivity • DEVELOP INNOVATIVE APPROACHES TO HELP THE CLIENT COMMUNITY EFFECTIVELY USE NERSC SYSTEMS • DEVELOP AND IMPLEMENT WAYS TO TRANSFER RESEARCH PRODUCTS AND KNOWLEDGE INTO PRODUCTION SYSTEMS AT NERSC AND ELSEWHERE • NEVER BE A BOTTLENECK TO MOVING NEW TECHNOLOGY INTO SERVICE • ENSURE ALL NEW TECHNOLOGY AND CHANGES IMPROVE (OR AT LEAST DO NOT DIMINISH) SERVICE TO OUR CLIENTS
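A minimal sketch of how these reliability metrics are commonly defined (standard formulations; NERSC's exact definitions may differ slightly):

    \text{Gross availability} = \frac{\text{uptime}}{\text{wall-clock time}}, \qquad
    \text{Scheduled availability} = \frac{\text{uptime}}{\text{wall-clock time} - \text{scheduled outages}},

    \text{MTBF/MTBI} = \frac{\text{uptime}}{\text{number of failures or interrupts}}, \qquad
    \text{MTTR} = \frac{\text{total repair time}}{\text{number of repairs}}.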

  6. GOALS (CON'T) • NERSC AND LBNL WILL BE A LEADER IN LARGE SCALE SYSTEMS MANAGEMENT & SERVICES • EXPORT KNOWLEDGE, EXPERIENCE, AND TECHNOLOGY DEVELOPED AT NERSC, PARTICULARLY TO AND WITHIN NERSC CLIENT SITES • NERSC WILL BE ABLE TO THRIVE AND IMPROVE IN AN ENVIRONMENT WHERE CHANGE IS THE NORM • IMPROVE THE EFFECTIVENESS OF NERSC STAFF BY IMPROVING INFRASTRUCTURE, CARING FOR STAFF, AND ENCOURAGING PROFESSIONALISM AND PROFESSIONAL IMPROVEMENT (Slide graphic, goal themes: excellent staff, new technology, timely information, innovative assistance, reliable service, technology transfer, success for clients and facility, consistent service & system architecture, mission, large-scale leader, staff effectiveness, change, research flow, wise integration)

  7. Major AccomplishmentsSince last meeting (June 2000) • IBM SP placed into full service April 4, 2000 – more later • Augmented the allocations by 1 M Hours in FY 2000 • Contributed to 11M PE hours in FY 2000 – more than doubling the FY 2000 allocation • SP is fully utilized • Moved entire facility to Oakland – more later • Completed the second PAC allocation process with lessons learned from the first year

  8. Activities and Accomplishments • Improved Mass Storage System • Upgraded HPSS • New versions of HSI • Implementing Gigabit Ethernet • Two STK robots added • Replaced 3490 with 9840 tape drives • Higher density and higher speed tape drives • Formed Network and Security Group • Succeeded in external reviews • Policy Board • SCAC

  9. Activities and Accomplishments • Implemented new accounting system – NIM • Old system was: • Difficult to maintain • Difficult to integrate with new systems • Limited by 32 bits (see the note below) • Not Y2K compliant • New system is: • Web focused • Built on available database software • Works for any type of system • Thrived in a state of increased security • Open model • Audits, tests
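To see why 32 bits is limiting, assume (purely for illustration) that the old system stored cumulative usage as CPU-seconds in a 32-bit counter:

    2^{32}\ \text{s} \approx 4.3 \times 10^{9}\ \text{s} \approx 136\ \text{CPU-years},

which a machine with a few thousand processors accrues in weeks (roughly 2{,}144 \times 86{,}400 \approx 1.9 \times 10^{8} CPU-seconds per day, overflowing the counter in about 23 days).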

  10. 2000 Activities and Accomplishments • NERSC firmly established as a leader in system evaluation • Effective System Performance (ESP) recognized as a major step in system evaluation and is influencing a number of sites and vendors • Sustained System Performance measures (sketched below) • Initiated a formal benchmarking effort, the NERSC Application Performance Simulation Suite (NAPs), which may become the next widely recognized parallel evaluation suite
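As a rough illustration of the Sustained System Performance idea (a generic formulation, not necessarily NERSC's exact definition), a per-processor application rate is averaged over a benchmark suite and scaled by the processor count:

    \mathrm{SSP} = N_{\mathrm{proc}} \cdot \Big( \prod_{i=1}^{k} p_i \Big)^{1/k},

where p_i is the measured per-processor performance (e.g. in Gflop/s) of application benchmark i and N_{\mathrm{proc}} is the number of compute processors.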

  11. Activities and Accomplishments • Formed the NERSC Cluster Team to investigate the impact of commodity SMP clusters on high-performance parallel computing and to ensure the most effective use of division resources for cluster computing • Coordinates all NERSC Division cluster computing activities (research, development, advanced prototypes, pre-production, production, and user support) • Initiated a formal procurement for a mid-range cluster • In consultation with DOE, decided not to award it as part of NERSC program activities

  12. NERSC Division Organization (Rev: 02/01/01)
      • Division Director: HORST SIMON; Deputy Director: WILLIAM KRAMER
      • Division Administrator & Financial Manager: WILLIAM FORTNEY
      • Chief Technologist: DAVID BAILEY
      • Departments: DISTRIBUTED SYSTEMS (WILLIAM JOHNSTON, Department Head; DEB AGARWAL, Deputy); HIGH PERFORMANCE COMPUTING (WILLIAM KRAMER, Department Head); HIGH PERFORMANCE COMPUTING RESEARCH (ROBERT LUCAS, Department Head)
      • Groups: APPLIED NUMERICAL ALGORITHMS (PHIL COLELLA); IMAGING & COLLABORATIVE COMPUTING (BAHRAM PARVIN); COLLABORATORIES (DEB AGARWAL); ADVANCED SYSTEMS (TAMMY WELCOME); HENP COMPUTING (DAVID QUARRIE); CENTER FOR BIOINFORMATICS & COMPUTATIONAL GENOMICS (MANFRED ZORN); DATA INTENSIVE DIST. COMPUTING (BRIAN TIERNEY (CERN); WILLIAM JOHNSTON, acting); COMPUTATIONAL SYSTEMS (JIM CRAW); MASS STORAGE (NANCY MEYER); SCIENTIFIC COMPUTING (ESMOND NG); DISTRIBUTED SECURITY RESEARCH (MARY THOMPSON); COMPUTER OPERATIONS & NETWORKING SUPPORT (WILLIAM HARRIS); USER SERVICES (FRANCESCA VERDIER); CENTER FOR COMPUTATIONAL SCIENCE & ENGR. (JOHN BELL); SCIENTIFIC DATA MANAGEMENT (ARIE SHOSHANI); SCIENTIFIC DATA MGMT RESEARCH (ARIE SHOSHANI); NETWORKING (WILLIAM JOHNSTON, acting); FUTURE INFRASTRUCTURE NETWORKING & SECURITY (HOWARD WALTER); FUTURE TECHNOLOGIES (ROBERT LUCAS, acting); VISUALIZATION (ROBERT LUCAS, acting)

  13. HIGH PERFORMANCE COMPUTING DEPARTMENT – WILLIAM KRAMER, Department Head
      • ADVANCED SYSTEMS (TAMMY WELCOME): Greg Butler, Thomas Davis, Adrian Wong
      • COMPUTATIONAL SYSTEMS (JAMES CRAW): Terrence Brewer (C), Scott Burrow (I), Tina Butler, Shane Canon, Nicholas Cardo, Stephan Chan, William Contento (C), Bryan Hardy (C), Stephen Luzmoor (C), Ron Mertes (I), Kenneth Okikawa, David Paul, Robert Thurman (C), Cary Whitney
      • COMPUTER OPERATIONS & NETWORKING SUPPORT (WILLIAM HARRIS): Clayton Bagwell Jr., Elizabeth Bautista, Richard Beard, Del Black, Aaron Garrett, Mark Heer, Russell Huie, Ian Kaufman, Yulok Lam, Steven Lowe, Anita Newkirk, Robert Neylan, Alex Ubungen
      • FUTURE INFRASTRUCTURE NETWORKING & SECURITY (HOWARD WALTER): Eli Dart, Brent Draney, Stephen Lau
      • HENP COMPUTING (DAVID QUARRIE*, CRAIG TULL, Deputy): Paolo Calafiura, Christopher Day, Igor Gaponenko, Charles Leggett (P), Massimo Marino, Akbar Mokhtarani, Simon Patton
      • MASS STORAGE (NANCY MEYER): Harvard Holmes, Wayne Hurlbert, Nancy Johnston, Rick Un (V)
      • USER SERVICES (FRANCESCA VERDIER): Mikhail Avrekh, Harsh Anand, Majdi Baddourah, Jonathan Carter, Tom DeBoni, Jed Donnelley, Therese Enright, Richard Gerber, Frank Hale, John McCarthy, R.K. Owen, Iwona Sakrejda, David Skinner, Michael Stewart (C), David Turner, Karen Zukor
      Legend: (C) Cray; (FB) Faculty UC Berkeley; (FD) Faculty UC Davis; (G) Graduate Student Research Assistant; (I) IBM; (M) Mathematical Sciences Research Institute; (MS) Masters Student; (P) Postdoctoral Researcher; (SA) Student Assistant; (V) Visitor; * On leave to CERN. Rev: 02/01/01

  14. HIGH PERFORMANCE COMPUTING RESEARCH DEPARTMENT – ROBERT LUCAS, Department Head
      • APPLIED NUMERICAL ALGORITHMS (PHILLIP COLELLA): Susan Graham (FB), Anton Kast, Peter McCorquodale (P), Brian Van Straalen, Daniel Graves, Daniel Martin (P), Greg Miller (FD)
      • IMAGING & COLLABORATIVE COMPUTING (BAHRAM PARVIN): Hui H Chan (MS), Gerald Fontenay, Sonia Sachs, Qing Yang, Ge Cong (V), Masoud Nikravesh (V), John Taylor
      • SCIENTIFIC COMPUTING (ESMOND NG): Julian Borrill, Xiaofeng He (V), Jodi Lamoureux (P), Lin-Wang Wang, Andrew Canning, Yun He, Sherry Li, Michael Wehner (V), Chris Ding, Parry Husbands (P), Osni Marques, Chao Yang, Tony Drummond, Niels Jensen (FD), Peter Nugent, Woo-Sun Yang (P), Ricardo da Silva (V), Plamen Koev (G), David Raczkowski (P)
      • CENTER FOR BIOINFORMATICS & COMPUTATIONAL GENOMICS (MANFRED ZORN): Donn Davy, Inna Dubchak*, Sylvia Spengler
      • SCIENTIFIC DATA MANAGEMENT (ARIE SHOSHANI): Carl Anderson, Andreas Mueller, Ekow Etoo, M. Shinkarsky (SA), Mary Anderson, Vijaya Natarajan, Elaheh Pourabbas (V), Alexander Sim, Junmin Gu, Frank Olken, Arie Segev (FB), John Wu, Jinbaek Kim (G)
      • CENTER FOR COMPUTATIONAL SCIENCE & ENGINEERING (JOHN BELL): Ann Almgren, William Crutchfield, Michael Lijewski, Charles Rendleman, Vincent Beckner, Marcus Day
      • FUTURE TECHNOLOGIES (ROBERT LUCAS, acting): David Culler (FB), Paul Hargrove, Eric Roman, Michael Welcome, James Demmel (FB), Leonid Oliker, Erich Stromeier, Katherine Yelick (FB)
      • VISUALIZATION (ROBERT LUCAS, acting): Edward Bethel, James Hoffman (M), Terry Ligocki, Soon Tee Teoh (G), James Chen (G), David Hoffman (M), John Shalf, Gunther Weber (G), Bernd Hamann (FD), Oliver Kreylos (G)
      Legend: (FB) Faculty UC Berkeley; (FD) Faculty UC Davis; (G) Graduate Student Research Assistant; (M) Mathematical Sciences Research Institute; (MS) Masters Student; (P) Postdoctoral Researcher; (S) SGI; (SA) Student Assistant; (V) Visitor; * Life Sciences Div., on assignment to NSF. Rev: 02/01/01

  15. FY00 MPP Users/Usage by Discipline

  16. FY00 PVP Users/Usage by Discipline

  17. NERSC FY00 MPP Usage by Site

  18. NERSC FY00 PVP Usage by Site

  19. FY00 MPP Users/Usage by Institution Type

  20. FY00 PVP Users/Usage by Institution Type

  21. NERSC System Architecture (diagram). Systems shown: IBM SP NERSC-3 (604 processors / 304 GB memory); IBM SP NERSC-3 Phase 2a (2,532 processors / 1,824 GB memory); CRI T3E 900 (644/256); CRI SV1; PDSF research cluster; Millennium LBNL cluster; HPSS (IBM and STK robots); DPSS; SGI; remote visualization server; symbolic manipulation server; vis lab; Max Strat storage. Interconnects: HiPPI, FDDI/Ethernet (10/100/Gigabit); external connectivity via ESnet.

  22. Current Systems

  23. Major Systems
      MPP
      • IBM SP – Phase 2a
        • 158 16-way SMP nodes
        • 2144 parallel application CPUs / 12 GB per node
        • 20 TB shared GPFS
        • 11,712 GB swap space – local to nodes
        • ~8.6 TB of temporary scratch space
        • 7.7 TB of permanent home space; 4–20 GB home quotas
        • ~240 Mbps aggregate I/O measured from user nodes (6 HiPPI, 2 GE, 1 ATM)
      • T3E-900 LC with 696 PEs – UNICOS/mk
        • 644 application PEs / 256 MB per PE
        • 383 GB of swap space; 582 GB checkpoint file system
        • 1.5 TB /usr/tmp temporary scratch space; 1 TB permanent home space
        • 7–25 GB home quotas, DMF managed
        • ~35 MBps aggregate I/O measured from user nodes (2 HiPPI, 2 FDDI)
        • 1.0 TB local /usr/tmp
      Serial
      • PVP – Three J90 SV-1 systems running UNICOS
        • 64 CPUs total / 8 GB of memory per system (24 GB total)
        • 1.0 TB local /usr/tmp
      • PDSF – Linux cluster
        • 281 IA-32 CPUs
        • 3 Linux and 3 Solaris file servers
        • DPSS integration
        • 7.5 TB aggregate disk space
        • 4 striped Fast Ethernet connections to HPSS
      • LBNL – mid-range cluster
        • 160 IA-32 CPUs
        • Linux with enhancements
        • 1 TB aggregate disk space
        • Myrinet 2000 interconnect
        • Gigabit Ethernet connections to HPSS
      Storage
      • HPSS
        • 8 STK tape libraries
        • 3490 tape drives
        • 7.4 TB of cache disk
        • 20 HiPPI interconnects, 12 FDDI connections, 2 GE connections
        • Total capacity ~960 TB; ~160 TB in use
      • HPSS – Probe

  24. T3E Utilization: 95% gross utilization, with roughly 4.4% improvement per month. (Chart annotations: allocation starvation, full scheduling functionality, checkpointing, start of capability jobs, systems merged.)

  25. SP Utilization • In the 80–85% range, which is above original expectations for the first year • More variation than the T3E

  26. T3E Job Size More than 70% of the jobs are “large”

  27. SP Job Size: Full-size jobs account for more than 10% of usage; ~60% of the jobs are > ¼ of the maximum size

  28. Storage: HPSS

  29. NERSC Network Architecture

  30. CONTINUE NETWORK IMPROVEMENTS

  31. LBNL Oakland Scientific Facility

  32. Oakland Facility • 20,000 sf computer room; 7,000 sf office space • 16,000 sf computer space built out • NERSC occupying 12,000 sf • Ten-year lease with 3 five-year options • $10.5M computer room construction costs • Option for an additional 20,000+ sf computer room

  33. LBNL Oakland Scientific Facility – move accomplished between Oct 26 and Nov 4
      System          Scheduled         Actual
      SP              10/27 – 9 am      no outage
      T3E             11/3 – 10 am      11/3 – 3 am
      SV1s            11/3 – 10 am      11/2 – 3 pm
      HPSS            11/3 – 10 am      10/31 – 9:30 am
      PDSF            11/6 – 10 am      11/2 – 11 am
      Other systems   11/3 – 10 am      11/1 – 8 am

  34. Computer Room Layout • Up to 20,000 sf of computer space • Direct ESnet node at OC-12

  35. 2000 Activities and Accomplishments • PDSF Upgrade in conjunction with building move

  36. 2000 Activities and Accomplishments • netCDF parallel support developed by NERSC staff for the Cray T3E. • A similar effort is being planned to port netCDF to the IBM SP platform. • Communication for Clusters: M-VIA and MVICH • M-VIA and MVICH are VIA-based software for low-latency, high-bandwidth, inter-process communication. • M-VIA is a modular implementation of the VIA standard for Linux. • MVICH is an MPICH-based implementation of MPI for VIA.
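Since MVICH is an MPICH-based MPI implementation, it runs ordinary MPI C programs; the ping-pong sketch below is the kind of latency test used to exercise such interconnect software (illustrative only, not taken from the M-VIA/MVICH distributions; the mpicc/mpirun commands follow the usual MPICH convention and are assumptions, not NERSC-specific instructions).

    /* Minimal MPI ping-pong between ranks 0 and 1; times one round trip. */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        char buf[64];
        double t0 = 0.0;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size < 2) {
            if (rank == 0)
                fprintf(stderr, "run with at least 2 MPI ranks\n");
            MPI_Finalize();
            return 1;
        }

        if (rank == 0) {
            strcpy(buf, "ping");
            t0 = MPI_Wtime();
            MPI_Send(buf, 64, MPI_CHAR, 1, 0, MPI_COMM_WORLD);            /* send to rank 1 */
            MPI_Recv(buf, 64, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);   /* wait for reply */
            printf("round trip %g s, reply \"%s\"\n", MPI_Wtime() - t0, buf);
        } else if (rank == 1) {
            MPI_Recv(buf, 64, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            strcpy(buf, "pong");
            MPI_Send(buf, 64, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

Typical build and run (assumed command names): mpicc pingpong.c -o pingpong, then mpirun -np 2 ./pingpong.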

  37. FY 2000 User Survey Results • Areas of most importance to users: • available hardware (cycles) • overall running of the center • network access to NERSC • allocations process • Highest satisfaction (score > 6.4): • Problem reporting/consulting services (timely response, quality, followup) • Training • Uptime (SP and T3E) • FORTRAN (T3E and PVP) • Lowest satisfaction (score < 4.5): • PVP batch wait time • T3E batch wait time • Largest increases in satisfaction from FY 1999: • PVP cluster (we introduced interactive SV1 services) • HPSS performance • Hardware management and configuration (we monitor and improve this continuously) • HPCF website (all areas are continuously improved, with a special focus on topics highlighted as needing improvement in the surveys) • T3E Fortran compilers

  38. Client Comments from Survey "Very responsive consulting staff that makes the user feel that his problem, and its solution, is important to NERSC" "Provide excellent computing resources with high reliability and ease of use." "The announcement managing and web-support is very professional." "Manages large simulations and data. The oodles of scratch space on mcurie and gseaborg help me process large amounts of data in one go." "NERSC has been the most stable supercomputer center in the country particularly with the migration from the T3E to the IBM SP". "Makes supercomputing easy."

  39. NERSC 3 Phase 2a/b

  40. Result: NERSC-3 Phase 2a • System built and configured • Started factory tests 12/13 • Expect delivery 1/5 • Undergoing acceptance testing • General production April 2001 • What is different and needs testing: • New processors • New nodes, new memory system • New switch fabric • New operating system • New parallel file system software

  41. IBM Configuration
                                            Phase 1          Phase 2a/b
      Compute nodes                         256              134*
      Processors                            256 x 2 = 512    134 x 16 = 2144*
      Networking nodes                      8                2
      Interactive nodes                     8                2
      GPFS nodes                            16               16
      Service nodes                         16               4
      Total nodes (CPUs)                    304 (604)        158 (2528)
      Total memory (compute nodes)          256 GB           1.6 TB
      Total global disk (user accessible)   10 TB            20 TB
      Peak (compute nodes)                  409.6 GF         3.2 TF*
      Peak (all nodes)                      486.4 GF         3.8 TF*
      Sustained System Performance          33 GF            235+ GF / 280+ GF
      Production dates                      April 1999       April 2001 / Oct 2001
      * Minimum – may increase due to the sustained system performance measure. (The implied per-processor peaks are worked out below.)
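The peak figures in the table follow directly from the processor counts; dividing them out gives the implied per-processor peaks (the 1.5 Gflop/s value is consistent with a 375 MHz POWER3-II issuing 4 floating-point operations per clock, an inference rather than a number stated on the slide):

    \text{Phase 1: } 512 \times 0.8\ \mathrm{Gflop/s} = 409.6\ \mathrm{Gflop/s}, \qquad
    \text{Phase 2a: } 2144 \times 1.5\ \mathrm{Gflop/s} \approx 3.2\ \mathrm{Tflop/s}.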

  42. What has been completed • 6 nodes added to the configuration • Memory per node increased to 12 GB for 140 compute nodes • “Loan” of full memory for Phase 2 • System installed and braced • Switch adapters and memory added to system • System configuration • Security Audit • System testing for many functions • Benchmarks being run and problems being diagnosed

  43. Current Issues • Failures of two benchmarks need to be resolved • Best case: they indicate broken hardware, most likely the switch adapters • Worst case: they indicate fundamental design and load issues • Variation • Loading and switch contention • Remaining tests • Throughput, ESP • Full system tests • I/O • Functionality

  44. General Schedule • Complete testing – TBD based on problem correction • Production configuration set up • 3rd-party s/w, local tools, queues, etc. • Availability test • Add early users ~10 days after successful testing is complete • Gradually add other users – complete ~40 days after successful testing • Shut down Phase 1 ~10 days after the system is open to all users • Move 10 TB of disk space – configuration will require Phase 2 downtime • Upgrade to Phase 2b in late summer / early fall

  45. NERSC-3 Sustained System Performance Projections • Estimates the amount of scientific computation that can really be delivered • Depends on delivery of Phase 2b functionality • The higher the last number, the better, since the system remains at NERSC for 4 more years (Chart annotations: test/config, acceptance, etc.; software lags hardware)

  46. NERSC Computational Power vs. Moore’s Law

  47. NERSC 4

  48. NERSC-4 • NERSC 4 IS ALREADY ON OUR MINDS • PLAN IS FOR FY 2003 INSTALLATION • PROCUREMENT PLANS BEING FORMULATED • EXPERIMENTATION AND EVALUATION OF VENDORS IS STARTING • ESP, ARCHITECTURES, BRIEFINGS • CLUSTER EVALUATION EFFORTS • USER REQUIREMENTS DOCUMENT (GREENBOOK) IMPORTANT

  49. How Big Can NERSC-4 be • Assume a delivery in FY 2003 • Assume no other space is used in Oakland until NERSC-4 • Assume cost is not an issue (at least for now) • Assume technology still progresses • ASCI will have a 30 Tflop/s system running for over 2 years

  50. How close is 100 Tflop/s? • Available gross space in Oakland is 3,000 sf without major changes • Assume it is 70% usable; the rest goes to air handlers, columns, etc. • That gives 3,000 sf of space for racks • IBM system used for estimates; other vendors are similar • Each processor is 1.5 GHz, yielding 6 Gflop/s • An SMP node is made up of 32 processors • 2 nodes in a frame • 64 processors in a frame = 384 Gflop/s per frame • Frames are 32–36" wide and 48" deep, with a service clearance of 3 feet in front and back (which can overlap) • 3 ft by 7 ft is 21 sf per frame (worked out below)
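Working through the slide's own per-frame numbers (a back-of-the-envelope sketch, not a vendor-quoted configuration):

    \text{Peak per frame} = 64 \times 6\ \mathrm{Gflop/s} = 384\ \mathrm{Gflop/s}, \qquad
    \text{footprint} \approx 3\ \mathrm{ft} \times 7\ \mathrm{ft} = 21\ \mathrm{sf},

    \text{In} \sim 3{,}000\ \mathrm{sf}: \left\lfloor 3000 / 21 \right\rfloor = 142\ \text{frames} \;\Rightarrow\; 142 \times 384\ \mathrm{Gflop/s} \approx 55\ \mathrm{Tflop/s},

    \text{For } 100\ \mathrm{Tflop/s}: \left\lceil 100{,}000 / 384 \right\rceil = 261\ \text{frames} \;\Rightarrow\; 261 \times 21\ \mathrm{sf} \approx 5{,}500\ \mathrm{sf}.

So under these assumptions the floor space cited for racks supports a peak of roughly 55 Tflop/s, while reaching 100 Tflop/s would take on the order of 5,500 sf of rack space.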
