Thermal-Scheduling For Ultra Low Power Mobile Microprocessor
George Cai (1), Chee How Lim (1), W. Robert Daasch (2)
(1) Intel Corporation
(2) Integrated Circuit Design and Test Laboratory, PSU
Presentation Outline
• Mobile CPU Power Efficiency With Demanded Performance
• Thermal Scheduling For Mobile Microprocessor
• Power Constrained Performance
• Observations/Conclusions
Ultra Low Power Mobile Microprocessor
[Figure: pipeline block diagram. Labels: FE, DE, RF, EX; Primary (OOP); Secondary (DE, IOP)]
• Primary pipeline: maximal performance, complex pipeline structure
• Secondary pipeline: minimum power and energy consumption; a very simple in-order structure that targets mobile anywhere-anytime applications
• Transparent to the OS and applications
• Makes maximal use of on-die clock/power gating for energy saving
Figure annotations: the majority of mobile apps with performance requirements; text email, caller ID, reminders, and other non-high-performance apps requested anywhere-anytime
Low Energy Consumption While Providing Suitable Performance Is Key For “Anywhere And Anytime”
• Must be compatible with the existing OS and platform
• Must have active leakage power control
• Must meet real-time telecom application requirements
Example applications (from the figure): stock/urgent messages; all urgent messages and important news; news headlines; email titles; calendar reminders; pages/voice message alerts; interactive command and reply; stock updates
Runtime Thermal Scheduling Capability
[Figure: pipeline block diagram. Labels: FE, DE, RF, EX; Primary (OOP); Secondary (DE, IOP)]
• When the thermal threshold is exceeded, the pipeline clusters service instructions in an alternating manner: the “hot” pipeline is cooled by clock/power gating while the “cold” pipeline sustains processor operation
• By flexibly selecting the threshold point, the energy-delay product, performance, and reliability of the processor can be enhanced
Thermal Effects: Leakage Trend
• Active leakage power reduction will play a significant role in total power reduction
• Thermal control is important for low energy consumption in mobile CPUs
Derived from F. Pollack’s Micro-32 Keynote Presentation, 1999
Example of Scheduling Algorithm
[Figure: state diagram of S1 to S4; transition conditions compare T1 and T2 with TL and TH; labels TS1 and TS2]
S1: Normal Operation (Primary Pipeline)
S2: Stall Fetch & Clear Pipeline
S3: Alternate Operation (Secondary Pipeline)
S4: Disable Clock or Scale F-V
[Figure: temperature (C) versus time (s), marking Ta, TL, TH, and Tmax and the intervals theat, tcool, and tcycle]
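The four states above can be sketched as a small state machine. This is only an illustration: the exact transition conditions are not recoverable from the slide, so the rules below are assumptions. In particular, treating T1 as the primary-pipeline temperature and T2 as the secondary-pipeline temperature, and the function name `next_state`, are hypothetical.

```python
from enum import Enum


class State(Enum):
    S1_NORMAL = 1      # normal operation (primary pipeline)
    S2_STALL = 2       # stall fetch and clear pipeline
    S3_ALTERNATE = 3   # alternate operation (secondary pipeline)
    S4_THROTTLE = 4    # disable clock or scale F-V


def next_state(state, t1, t2, tl, th):
    """Return the next state given temperatures t1, t2 and thresholds tl < th.

    Assumed rules (not taken from the slide):
    - S1 leaves for S2 when the primary pipeline reaches TH.
    - S2 always hands over to S3 once the pipeline is cleared.
    - S3 returns to S1 when the primary has cooled to TL, and falls
      back to S4 when the secondary pipeline itself reaches TH.
    - S4 resumes S1 or S3 once the corresponding pipeline is at or below TL.
    """
    if state is State.S1_NORMAL:
        return State.S2_STALL if t1 >= th else State.S1_NORMAL
    if state is State.S2_STALL:
        return State.S3_ALTERNATE
    if state is State.S3_ALTERNATE:
        if t2 >= th:
            return State.S4_THROTTLE
        if t1 <= tl:
            return State.S1_NORMAL
        return State.S3_ALTERNATE
    # State.S4_THROTTLE
    if t1 <= tl:
        return State.S1_NORMAL
    if t2 <= tl:
        return State.S3_ALTERNATE
    return State.S4_THROTTLE
```

With TL = 55 and TH = 70, for example, `next_state(State.S1_NORMAL, 71, 40, 55, 70)` moves to the stall state, and the following call moves on to the secondary pipeline.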
Enhance Effectiveness Of Other Power Control Techniques
• Dynamic Frequency Scaling
• Dynamic Clock Disabling/Throttling
Thermal Effects on Power • Divide total power into two components: dynamic and leakage power
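The two-component split can be written out as below. The dynamic term uses the standard CMOS switching-power expression. The symbols (activity factor α, switched capacitance C, supply voltage V_dd, clock frequency f, leakage current I_leak) are notation introduced here rather than taken from the slides, and the leakage term is left as a generic temperature-dependent current, since the slides give no specific model.

```latex
\begin{align}
P_{\text{total}}(T) &= P_{\text{dyn}} + P_{\text{leak}}(T) \\
P_{\text{dyn}} &= \alpha \, C \, V_{dd}^{2} \, f \\
P_{\text{leak}}(T) &= V_{dd} \, I_{\text{leak}}(T)
\end{align}
```

Because leakage current rises with temperature, only the second component responds directly to thermal control.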
Thermal Effects on Energy • Using the power-per-frequency (W/MHz) metric as a proxy for energy
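One way to see why W/MHz works as an energy proxy: 1 W/MHz equals 1 microjoule per clock cycle, so multiplying by a cycle count gives energy. A minimal sketch, with hypothetical function names and made-up example numbers:

```python
def watts_per_mhz(power_w, freq_mhz):
    """Power-per-frequency metric; numerically equal to microjoules per cycle."""
    return power_w / freq_mhz


def energy_joules(power_w, freq_mhz, cycles):
    """Energy for a fixed number of cycles, derived from the W/MHz metric."""
    joules_per_cycle = watts_per_mhz(power_w, freq_mhz) * 1e-6
    return joules_per_cycle * cycles


# Made-up example: 2 W at 500 MHz for 500 million cycles (one second of work).
example_energy = energy_joules(power_w=2.0, freq_mhz=500.0, cycles=500e6)
```

For a fixed amount of work in cycles, a lower W/MHz value therefore means lower energy, regardless of how long the run takes.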
Architecture-Level Power-Performance Tradeoff
[Figure annotations: ~15%, ~30%]
• For wide-superscalar processors, the performance impact of pipeline scaling is smaller than that of global clock throttling or frequency scaling
Comparative Outcomes: Energy Metric
• Simulation Conditions (500 million instructions; TL = 55C)
• Stop Clock Control: Toggle between Fmax and 0 MHz
• Voltage/Freq Scaling: Toggle between Fmax and 0.9/0.8/0.6 Fmax
• Thermal Scheduling: Toggle between Primary and 2nd Pipelines
• Conservative: TH = 70C
• Aggressive: TH = 60C
[Figures: “Energy Consumption of Conservative Control” and “Energy Consumption of Aggressive Control”. Each plots energy (J) for the benchmarks PERL, GCC, LI, and M88KSIM under three thermal control techniques: clock gating, F-V scaling, and thermal scheduling]
Comparative Outcomes: Energy-CPU Time Metric
• Metric: Total Energy x CPU Time
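The energy times CPU time product is straightforward to compute and to rank by. The sketch below uses placeholder technique names and invented numbers purely to show the calculation; none of the values come from the slides.

```python
def energy_cpu_time(energy_j, cpu_time_s):
    """Total energy (J) multiplied by CPU time (s); lower is better."""
    return energy_j * cpu_time_s


# Placeholder (energy in J, CPU time in s) pairs, invented for illustration.
runs = {
    "technique A": (10.0, 2.0),   # product 20.0
    "technique B": (8.0, 3.0),    # product 24.0
    "technique C": (12.0, 1.5),   # product 18.0
}

# Rank techniques from best (smallest product) to worst.
ranked = sorted(runs, key=lambda name: energy_cpu_time(*runs[name]))
```

The product penalizes a technique that saves energy only by running much longer: technique B has the lowest energy here but ranks last.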
Pros and Cons
• Advantages
  • Limits the power/energy upper bound and prevents thermal runaway
  • Pipeline tuned for either performance or ultra low power
  • Compatible with existing OS and applications
  • Performance penalty for engaging/disengaging control is small (architecture event)
  • Supports low-power anywhere-anytime mobile computing
    • Non-timing-critical tasks
    • Real-time applications that require more predictable output
• Concerns
  • di/dt during pipeline switch
  • Real-Register File may require extra dedicated ports
  • Bypass bus may have additional loading