Presentation Transcript


  1. The past, present, and future of Green Computing. Kirk W. Cameron, SCAPE Laboratory, Virginia Tech

  2. Enough About Me
  • Associate Professor, Virginia Tech
  • Co-founder, Green500
  • Co-founder, MiserWare
  • Founding member, SPECpower
  • Consultant for EPA Energy Star for Servers
  • IEEE Computer "Green IT" columnist
  • Over $4M in federally funded "Green" research
  • SystemG supercomputer

  3. What is SCAPE?
  • Scalable Performance Laboratory, founded 2001 by Cameron
  • Vision
    • Improve the efficiency of high-end systems
  • Approach
    • Exploit/create technologies for high-end systems
    • Conduct quality research to solve important problems
    • When appropriate, commercialize technologies
    • Educate and train the next generation of HPC CS researchers

  4. The Big Picture (Today)
  • Past: challenges
    • Need to measure and correlate power data
    • Save energy while maintaining performance
  • Present
    • Software/hardware infrastructure for power measurement
    • Intelligent power management (CPU MISER, Memory MISER)
    • Integration with other toolkits (PAPI, Prophesy)
  • Future: research + commercialization
    • Management Infrastructure for Energy Reduction
    • MiserWare, Inc.
    • Holistic power management

  5. 1882 - 2001

  6. Prehistory 1882 - 2001
  • Embedded systems
  • General-purpose microarchitecture
  • Circa 1999, power becomes a disruptive technology
  • Moore's Law + clock frequency arms race
  • Simulators emerge (e.g., Princeton's Wattch)
  • Related work continues today (CMPs, SMT, etc.)

  7. 2002

  8. Server Power 2002
  • IBM Austin
    • Energy-aware commercial servers [Keller et al.]
  • LANL
    • Green Destiny [Feng et al.]
  • Observations
    • IBM targets commercial apps
    • Feng et al. achieve power savings in exchange for performance loss

  9. HPC Power 2002
  At roughly $800,000 per megawatt per year:
  • TM CM-5: 0.005 megawatts (~$4,000/yr)
  • Residential A/C: 0.015 megawatts (~$12,000/yr)
  • Intel ASCI Red: 0.850 megawatts (~$680,000/yr)
  • High-speed train: 10 megawatts (~$8 million/yr)
  • Earth Simulator: 12 megawatts (~$9.6 million/yr)
  • (For scale, a conventional power plant generates about 300 megawatts.)
  • My observations
    • Power will become disruptive to HPC
    • Laptops outselling PCs
    • Commercial power-aware techniques not appropriate for HPC
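A quick check of the cost figures above, assuming only the slide's rate of $800,000 per megawatt per year; a minimal sketch, with the system names and wattages taken from the slide:

    /* Sketch: annual electricity cost at the slide's assumed rate of
     * $800,000 per megawatt-year. Systems and wattages are from the
     * slide above; the pairing of costs follows from the rate. */
    #include <stdio.h>

    int main(void) {
        const double cost_per_mw_year = 800000.0;   /* USD per MW per year */
        const char  *systems[] = { "TM CM-5", "Residential A/C",
                                   "Intel ASCI Red", "High-speed train",
                                   "Earth Simulator" };
        const double megawatts[] = { 0.005, 0.015, 0.850, 10.0, 12.0 };

        for (int i = 0; i < 5; i++)
            printf("%-18s %6.3f MW  -> $%.0f/yr\n",
                   systems[i], megawatts[i],
                   megawatts[i] * cost_per_mw_year);
        return 0;
    }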

  10. HPPAC Emerges 2002
  • SCAPE project: high-performance, power-aware computing
  • Two initial goals
    • Measurement tools
    • Power/energy savings
  • Big goals… no funding (risked all startup funds)

  11. 2003 - 2004

  12. Cluster Power 2003 - 2004
  • IBM Austin
    • On evaluating request-distribution schemes for saving energy in server clusters, ISPASS '03 [Lefurgy et al.]
    • Improving server performance on transaction processing workloads by enhanced data placement, SBAC-PAD '04 [Rubio et al.]
  • Rutgers
    • Energy conservation techniques for disk array-based servers, ICS '04 [Bianchini et al.]
  • SCAPE
    • High-performance, power-aware computing, SC04
    • Power measurement + power/energy savings

  13. PowerPack Measurement 2003 - 2004
  Scalable, synchronized, and accurate.
  [Diagram: power/energy profiling of a high-performance, power-aware cluster. AC power from the outlet is measured through a Baytech management unit and Baytech power strip; DC power from a single node's power-supply lines is measured by multi-meters. In the PowerPack libraries (profile/control), multi-meter (MM) threads under a multi-meter control thread log readings to a data repository for analysis, while DVS threads under a DVS control thread adjust processor speed as applications and microbenchmarks run.]
  • Hardware power/energy profiling
  • Data collection
  • Software power/energy control
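Each multi-meter ("MM") thread in the diagram polls one meter and appends timestamped samples to the data log. A minimal sketch of that polling loop follows; read_meter_watts() is a hypothetical placeholder for the real meter interface, not part of PowerPack:

    /* Sketch of one multi-meter ("MM") polling thread from the PowerPack
     * diagram: sample a meter, timestamp the reading, append it to the log. */
    #include <pthread.h>
    #include <stdio.h>
    #include <sys/time.h>
    #include <unistd.h>

    static volatile int sampling = 1;   /* cleared by the control thread */

    static double read_meter_watts(int meter_id) {
        (void)meter_id;
        return 100.0;                   /* placeholder for a real driver call */
    }

    static void *mm_thread(void *arg) {
        int meter_id = *(int *)arg;
        FILE *log = fopen("power.log", "a");
        if (!log) return NULL;
        while (sampling) {
            struct timeval tv;
            gettimeofday(&tv, NULL);
            fprintf(log, "%ld.%06ld meter=%d watts=%.2f\n",
                    (long)tv.tv_sec, (long)tv.tv_usec,
                    meter_id, read_meter_watts(meter_id));
            usleep(1000);               /* ~1 kHz sampling */
        }
        fclose(log);
        return NULL;
    }

    int main(void) {
        pthread_t t;
        int meter_id = 0;
        pthread_create(&t, NULL, mm_thread, &meter_id);
        sleep(1);                       /* profiled region of interest */
        sampling = 0;                   /* control thread stops the sampler */
        pthread_join(t, NULL);
        return 0;
    }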

  14. After frying multiple components…

  15. PowerPack Framework (DC Power Profiling)
  Multi-meters + 32-node Beowulf cluster

      if (node .eq. root) then
          call pmeter_init(xmhost, xmport)
          call pmeter_log(pmlog, NEW_LOG)
      endif
      <CODE SEGMENT>
      if (node .eq. root) then
          call pmeter_start_session(pm_label)
      endif
      <CODE SEGMENT>
      if (node .eq. root) then
          call pmeter_pause()
          call pmeter_log(pmlog, CLOSE_LOG)
          call pmeter_finalize()
      endif

  16. Power Profiles – Single Node
  • The CPU is typically the largest consumer of power under load

  17. Power Profiles – Single Node
  [Figure: power consumption for various workloads: CPU-bound, memory-bound, network-bound, and disk-bound.]

  18. NAS PB FT – Performance Profiling
  [Figure: iterations alternate compute, reduce (comm), compute, and all-to-all (comm) phases.]
  About 50% of the time is spent in communication.

  19. Power profiles reflect performance profiles.

  20. One FFT Iteration

  21. 2005 - present

  22. Intuition confirmed 2005 - Present

  23. HPPAC Tool Progress 2005 - Present
  • PowerPack
    • Modularized PowerPack and SysteMISER
    • Extended analytics for applicability
    • Extended to support thermals
  • SysteMISER
    • Improved analytics to weigh tradeoffs at runtime
    • Automated cluster-wide DVS scheduling
    • Support for automated power-aware memory

  24. Predicting CPU Power 2005 - Present

  25. Predicting Memory Power 2005 - Present

  26. Correlating Thermals BT 2005 - Present

  27. Correlating Thermals MG 2005 - Present

  28. Tempest Results FT 2005 - Present

  29. SysteMISER 2005 - Present
  • Our software approach to reducing energy
  • Management Infrastructure for Energy Reduction
  • Power/performance measurement, prediction, and control
  [Image: The Heat Miser.]

  30. Power-aware DVS Scheduling Strategies 2005 - Present
  NEMO & PowerPack framework for saving energy
  • CPUSPEED daemon
      [example]$ start_cpuspeed
      [example]$ mpirun -np 16 ft.B.16
  • Internal scheduling
      MPI_Init();
      <CODE SEGMENT>
      setspeed(600);
      <CODE SEGMENT>
      setspeed(1400);
      <CODE SEGMENT>
      MPI_Finalize();
  • External scheduling
      [example]$ psetcpuspeed 600
      [example]$ mpirun -np 16 ft.B.16
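For illustration, one way a setspeed()/psetcpuspeed-style call can be realized on Linux is by writing a kHz value to the cpufreq "userspace" governor's sysfs file. This is a minimal sketch under that assumption, not the actual NEMO/SysteMISER implementation:

    /* Sketch: set a core's clock frequency via the Linux cpufreq
     * "userspace" governor. Assumes that governor is already selected
     * and the process has write permission; setspeed() here is an
     * illustrative stand-in, not SysteMISER's real interface. */
    #include <stdio.h>

    static int setspeed(int cpu, int mhz) {
        char path[128];
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_setspeed", cpu);
        FILE *f = fopen(path, "w");
        if (!f) return -1;                 /* needs root / userspace governor */
        fprintf(f, "%d\n", mhz * 1000);    /* sysfs expects kHz */
        fclose(f);
        return 0;
    }

    int main(void) {
        /* e.g., drop core 0 to 600 MHz around a communication phase */
        if (setspeed(0, 600) != 0)
            perror("setspeed");
        return 0;
    }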

  31. CPU MISER Scheduling (FT) 2005 - Present
  [Figure: normalized energy and delay with CPU MISER for FT.C.8, compared against fixed frequencies from 600 to 1400 MHz and the "auto" setting.]
  36% energy savings, less than 1% performance loss.
  See SC2004, SC2005 publications.

  32. Where else can we save energy? 2005 - Present
  • Processor (DVS): where everyone starts
  • NIC: a very small portion of system power
  • Disk: a good choice (our future work)
  • Power supply: a very good choice (for an EE or ME)
  • Memory: only 20-30% of system power, but…

  33. The Power of Memory 2005 - Present

  34. Memory Management Policies 2005 - Present
  [Figure: memory behavior under Default, Static, and Dynamic management policies.]
  Memory MISER = page allocation shaping + allocation prediction + dynamic control

  35. Memory MISER Evaluation of Prediction and Control 2005 - Present
  Prediction/control looks good, but are we guaranteeing performance?

  36. Memory MISER Evaluation of Prediction and Control 2005 - Present
  Stable, accurate prediction using a PID controller. But what about big (capacity) spikes?
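For reference, a minimal sketch of the PID update the slide alludes to, applied to tracking memory demand; the gains and the example demand trace are illustrative assumptions, not Memory MISER's actual parameters:

    /* Sketch of a PID update for tracking memory demand, in the spirit of
     * the controller referenced on the slide. Gains and the demand trace
     * are illustrative assumptions only. */
    #include <stdio.h>

    typedef struct {
        double kp, ki, kd;        /* controller gains */
        double integral, prev_e;  /* controller state */
    } pid_state;

    /* demand/allocated in pages; returns the next allocation target */
    static double pid_next_target(pid_state *c, double demand, double allocated) {
        double e = demand - allocated;     /* error: unmet (or excess) demand */
        c->integral += e;
        double d = e - c->prev_e;
        c->prev_e = e;
        return allocated + c->kp * e + c->ki * c->integral + c->kd * d;
    }

    int main(void) {
        pid_state c = { 0.6, 0.1, 0.05, 0.0, 0.0 };   /* illustrative gains */
        double allocated = 1 << 20;                   /* 1M pages online */
        double demand[] = { 900000, 950000, 1400000, 1500000, 800000 };
        for (int i = 0; i < 5; i++) {
            allocated = pid_next_target(&c, demand[i], allocated);
            printf("step %d: demand=%.0f target=%.0f pages\n",
                   i, demand[i], allocated);
        }
        return 0;
    }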

  37. Memory MISER Evaluation of Prediction and Control 2005 - Present
  Memory MISER guarantees performance in "worst" conditions.

  38. Memory MISER Evaluation – Energy Reduction 2005 - Present
  30% total system energy savings, less than 1% performance loss

  39. Present - 2012

  40. SystemG Supercomputer @ VT

  41. SystemG Stats
  • 325 Mac Pro compute nodes, each with two 4-core 2.8 GHz Intel Xeon processors
  • Each node has 8 GB of RAM; each core has 6 MB of cache
  • Mellanox 40 Gb/s end-to-end InfiniBand adapters and switches
  • LINPACK result: 22.8 TFLOPS (trillion floating-point operations per second)
  • Over 10,000 power and thermal sensors
  • Variable power modes: DVFS control (2.4 and 2.8 GHz), fan-speed control, concurrency throttling, etc. (see /sys/devices/system/cpu/cpuX/cpufreq/scaling_available_frequencies)
  • Intelligent power distribution unit: Dominion PX (remotely controls the servers and network devices; also monitors current, voltage, power, and temperature through Raritan's KVM switches and secure console servers)
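A small sketch of querying the DVFS settings listed above by reading the cpufreq sysfs file from the slide (cpu0 chosen arbitrarily; assumes a kernel with cpufreq enabled):

    /* Sketch: list the DVFS frequencies available on one core by reading
     * the cpufreq sysfs file mentioned on the slide. */
    #include <stdio.h>

    int main(void) {
        const char *path =
            "/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies";
        FILE *f = fopen(path, "r");
        if (!f) { perror(path); return 1; }

        unsigned long khz;
        while (fscanf(f, "%lu", &khz) == 1)      /* values are in kHz */
            printf("%.1f GHz\n", khz / 1e6);
        fclose(f);
        return 0;
    }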

  42. Deployment Details
  • 13 racks total, 24 nodes per rack, 8 nodes per layer
  • 5 PDUs per rack (Raritan PDU model DPCS12-20). Each PDU in SystemG has a unique IP address, and users can use IPMI to access and retrieve information from the PDUs and also control them, e.g., remotely shutting down and restarting machines, recording system AC power, etc.
  • Two types of switch:
    1) Ethernet switch: 1 Gb/s; 36 nodes share one Ethernet switch.
    2) InfiniBand switch: 40 Gb/s; 24 nodes (one rack) share one IB switch.

  43. Data Collection System and LabVIEW
  [Figure: sample diagram and corresponding front panel from LabVIEW.]

  44. A Power Profile for the HPCC Benchmark Suite

  45. Published Papers and Useful Links
  Papers:
  1. Rong Ge, Xizhou Feng, Shuaiwen Song, Hung-Ching Chang, Dong Li, Kirk W. Cameron, "PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications," IEEE Transactions on Parallel and Distributed Systems, Apr. 2009.
  2. Shuaiwen Song, Rong Ge, Xizhou Feng, Kirk W. Cameron, "Energy Profiling and Analysis of the HPC Challenge Benchmarks," The International Journal of High Performance Computing Applications, Vol. 23, No. 3, pp. 265-276, 2009.
  NI data-acquisition system details:
  http://sine.ni.com/nips/cds/view/p/lang/en/nid/202545
  http://sine.ni.com/nips/cds/view/p/lang/en/nid/202571

  46. The future… Present - 2012
  • PowerPack
    • Streaming sensor data from any source
    • PAPI integration
    • Correlated to various systems and applications
    • Prophesy integration
    • Analytics to provide a unified interface
  • SysteMISER
    • Study effects of power-aware disks and NICs
    • Study effects of emergent architectures (CMT, SMT, etc.)
    • Coschedule power modes for energy savings

  47. Outreach
  • See http://green500.org
  • See http://thegreengrid.org
  • See http://www.spec.org/specpower/
  • See http://hppac.cs.vt.edu

  48. Acknowledgements
  • My SCAPE team
    • Dr. Xizhou Feng (PhD 2006)
    • Dr. Rong Ge (PhD 2008)
    • Dr. Matt Tolentino (PhD 2009)
    • Mr. Dong Li (PhD student, exp. 2010)
    • Mr. Song Shuaiwen (PhD student, exp. 2010)
    • Mr. Chun-Yi Su, Mr. Hung-Ching Chang
  • Funding sources
    • National Science Foundation (CISE: CCF, CNS)
    • Department of Energy (SC)
    • Intel

  49. Thank you very much.
  http://scape.cs.vt.edu
  cameron@cs.vt.edu
  Thanks to our sponsors: NSF (CAREER, CCF, CNS), DOE (SC), Intel
