
Benchmark software for HPC systems

Benchmark software for HPC systems. Kei Hiraki, The University of Tokyo. My Position: Working in “Computer Architecture”. For me, Benchmark means SPEC CPU, but SPEC CPUint scores of most supercomputers are low.


Presentation Transcript


  1. Benchmark software for HPC systems Kei Hiraki The University of Tokyo

  2. My Position • Working in “Computer Architecture” • For me, Benchmark means SPEC CPU

  3. My Position • Working in “Computer Architecture” • For me, Benchmark means SPEC CPU • But SPEC CPUint scores of most supercomputers are low, except for Intel x86 • HPC is a different world for benchmarking

  4. Diversity of Supercomputers • 1 MFLOPS • 1964, CDC 6600 • ILP (pipeline parallelism), out-of-order execution, scoreboard • 1 GFLOPS • 1984, Cray XMP/4 • Vector architecture, SMP • 1 TFLOPS • 1997, ASCI-Red • Cluster computer of many MPUs • 1 PFLOPS • 2008, IBM Roadrunner (Cell-based) • GPGPU • On-chip multi-CPU, huge parallelism

  5. Development of Supercomputers (1964-2018) [Figure: timeline chart of supercomputer systems from the CDC 6600 (1964) through today's systems to Post-K (2020). Legend: Vector, SIMD, GPU, Cluster, Distributed Memory, Shared Memory; systems are marked as "Fastest system at one time" or "Research/Special system".]

  6. History of Fastest supercomputers (1)
     Name, year to start, Linpack performance (peak performance)
     • UNIVAC LARC 1960 (0.16 Mflops)
     • IBM Stretch 1961 (0.3 Mflops)
     • CDC-6600 1964 0.5 Mflops (3 Mflops) *N=100 Linpack
     • CDC-7600 1969 3.3 Mflops (10 Mflops) *N=100 Linpack
     • TI ASC 1972 ~30 Mflops (64 Mflops)
     • ILLIAC IV 1975 ~40 Mflops (150 Mflops)
     • Cray-1 1976 110 Mflops (160 Mflops) *N=1000 Linpack
     • Cray-XMP4 1982 714 Mflops (800 Mflops)
     • SX-2 1985 885 Mflops (1.3 Gflops)
     • Cray-2 1985 1.4 Gflops (1.9 Gflops)
     • CM-2 1987 2.4 Gflops (5 Gflops)
     • (ETA-10 1988 496 Mflops (9.1 Gflops single / 4.6 Gflops double, 8 processors); parallel operation did not work. *N=1000 Linpack, 1 processor, 7 ns)
     • Every fastest supercomputer has its own interesting drama.
     • Behind the fastest supercomputers, there are numerous supercomputers that failed to become the world's fastest.

  7. History of Fastest supercomputers (2)
     Name, year to start, Linpack performance (peak performance)
     • SX-3/44R 1990 23.2 Gflops (25.6 Gflops)
     • CM-5 1993 60 Gflops (131 Gflops)
     • Fujitsu NWT 1993 124 Gflops (236 Gflops)
     • Intel Paragon XP 1994 143 Gflops (184 Gflops)
     • Fujitsu NWT 1994 170 Gflops (236 Gflops)
     • Hitachi SR-2201 1996 220 Gflops (307 Gflops)
     • Hitachi CP-PACS 1996 368 Gflops (614 Gflops)
     • Intel ASCI RED 1997 1.1 Tflops (1.5 Tflops)
     • IBM ASCI White 2000 4.9 Tflops (12.4 Tflops)
     • NEC ES 2002 35 Tflops (40.1 Tflops)
     • IBM BlueGene/L 2004 71 Tflops (92 Tflops)
     • IBM Roadrunner 2008 1.0 Pflops (1.4 Pflops)
     • Cray XT-5 2009 1.8 Pflops (2.3 Pflops)
     • Tianhe-1A 2010 2.5 Pflops (4.7 Pflops)
     • K-computer 2011 10.5 Pflops (11 Pflops)
     • BlueGene/Q 2012 16 Pflops (20 Pflops)
     • Cray XK7 2012 17.6 Pflops (27 Pflops)
     • Tianhe-2 2013 33.9 Pflops (55 Pflops)
     • Sunway 2016 93.9 Pflops (125 Pflops)
     • IBM AC922 + NVIDIA V100 2018 122.3 Pflops (188 Pflops)

  8. Various Benchmarks • SPEC CPUint • We cannot submit papers without SPEC CPUint • Even Dhrystone is useful • Linpack • HPC Linpack is not a bad benchmark • Good for time-line comparison • HPCC • Too many result figures • Redundant • DGEMM, FFT, STREAM are useful • HPCG • Today’s topic

  9. HPC Benchmarks • How can I compare apples and oranges? • Distributed memory, shared memory, vectors, GPGPUs, SIMD

  10. Simplest history of supercomputers • 1 MFLOPS • 1964, CDC 6600 • ILP (pipeline parallelism), out-of-order execution, scoreboard • 1 GFLOPS • 1984, Cray XMP/4 • Vector architecture, SMP • 1 TFLOPS • 1997, ASCI-Red • Cluster computer of many MPUs • 1 PFLOPS • 2008, IBM Roadrunner (Cell-based) • GPGPU • On-chip multi-CPU, huge parallelism • 1 EFLOPS • 2022? • Special-purpose accelerator? 3D semiconductor? • 1 ZFLOPS • 2038?? • Billion cores? More specialized accelerator? • Intervals between milestones: 20 years, 13 years, 11 years, 14 years, 16 years?
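
The intervals on slide 10 can be read as growth rates: each milestone is a 1000x step, so an interval of N years implies an annual growth factor of 1000^(1/N). The following is a minimal sketch of that arithmetic, not part of the original talk. The function names are illustrative, and the 2022 and 2038 entries are the slide's own guesses, not measured dates.

```python
import math

# Milestone years as listed on slide 10.
# The last two years are the slide's projections ("2022?", "2038??").
milestones = [
    ("1 MFLOPS", 1964),
    ("1 GFLOPS", 1984),
    ("1 TFLOPS", 1997),
    ("1 PFLOPS", 2008),
    ("1 EFLOPS", 2022),
    ("1 ZFLOPS", 2038),
]


def annual_growth(years, factor=1000.0):
    """Annual growth factor implied by a `factor`-fold gain over `years` years."""
    return factor ** (1.0 / years)


def doubling_time(years, factor=1000.0):
    """Years needed to double performance at the implied constant growth rate."""
    return years * math.log(2.0) / math.log(factor)


# Walk over consecutive milestone pairs and report the implied rates.
for (prev_name, prev_year), (name, year) in zip(milestones, milestones[1:]):
    span = year - prev_year
    print(
        f"{prev_name} -> {name}: {span} years, "
        f"x{annual_growth(span):.2f} per year, "
        f"doubling every {doubling_time(span):.2f} years"
    )
```

Under this reading, the 11-year step from 1 TFLOPS to 1 PFLOPS is the fastest of the listed intervals, and the projected 16-year step to 1 ZFLOPS is slower than every interval after the first.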

  11. Today’s Topics • What are the best benchmarks for • Exaflops development • Zettaflops development • Comparison to quantum computers

  12. Why is XXX/Rpeak important? • The ratio improves when a CPU has fewer FPUs • FPU area is no longer a major factor in CPU area
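
As an illustration of the XXX/Rpeak ratio, the sketch below takes Linpack as the benchmark "XXX" and computes the ratio for four systems, using the Linpack and peak figures listed on slide 7. Choosing Linpack here is an assumption made for illustration; the slide leaves the benchmark open, and the helper name `efficiency` is hypothetical.

```python
# (Linpack performance, peak performance) in Pflops, as listed on slide 7.
systems = {
    "K-computer (2011)": (10.5, 11.0),
    "Tianhe-2 (2013)": (33.9, 55.0),
    "Sunway (2016)": (93.9, 125.0),
    "IBM AC922 + NVIDIA V100 (2018)": (122.3, 188.0),
}


def efficiency(achieved, peak):
    """Fraction of peak performance that the benchmark actually reached."""
    if peak <= 0:
        raise ValueError("peak performance must be positive")
    return achieved / peak


for name, (linpack, rpeak) in systems.items():
    print(f"{name}: Linpack/Rpeak = {efficiency(linpack, rpeak):.1%}")
```

Because Rpeak is the denominator, a design with fewer FPUs lowers Rpeak and can raise the ratio without delivering more performance, which is the caveat the slide raises.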

  13. Return to simplicity • Weighted means of • DGEMM • STREAM • FFT • Selection of weights is the problem • OR

  14. Return to simplicity • Weighted means of • Linpack • HPCG • FFT
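
The "weighted means" proposed on slides 13 and 14 could be computed as follows. This is a minimal sketch under stated assumptions: the slides do not say whether an arithmetic or geometric mean is intended, so both are shown, and the scores and weights below are hypothetical values (each score taken as a ratio to some reference system), not measurements.

```python
import math


def _check(scores, weights):
    """Validate that scores and weights cover the same benchmarks."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must name the same benchmarks")
    if any(w < 0 for w in weights.values()) or sum(weights.values()) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")


def weighted_arithmetic_mean(scores, weights):
    _check(scores, weights)
    total = sum(weights.values())
    return sum(weights[k] * scores[k] for k in scores) / total


def weighted_geometric_mean(scores, weights):
    _check(scores, weights)
    if any(s <= 0 for s in scores.values()):
        raise ValueError("geometric mean needs strictly positive scores")
    total = sum(weights.values())
    return math.exp(sum(weights[k] * math.log(scores[k]) for k in scores) / total)


# Hypothetical normalized scores (ratio to a reference system).
scores = {"DGEMM": 2.0, "STREAM": 0.5, "FFT": 1.0}

# Two hypothetical weightings: equal, and compute-heavy.
equal = {"DGEMM": 1.0, "STREAM": 1.0, "FFT": 1.0}
compute_heavy = {"DGEMM": 0.6, "STREAM": 0.2, "FFT": 0.2}

for label, w in (("equal", equal), ("compute-heavy", compute_heavy)):
    print(
        f"{label}: arithmetic = {weighted_arithmetic_mean(scores, w):.3f}, "
        f"geometric = {weighted_geometric_mean(scores, w):.3f}"
    )
```

The same scores produce different single numbers under the two weightings, which illustrates why slide 13 calls the selection of weights the problem. For the Linpack, HPCG, FFT variant on slide 14, only the dictionary keys change.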

  15. Purpose of Benchmark software • Characterization of the system • Balance of system components • Proof of improvements • Evidence for purchase decisions • Performance / Cost • Performance / Power • Time-line comparison • Single number vs. multiple numbers
