
Reduced Energy Decoding of MPEG Streams

Reduced Energy Decoding of MPEG Streams. Malena Mesarina, HP Labs/UCLA CS Dept; Yoshio Turner, HP Labs. System Environment: Portable client – limited battery life. Multimedia server – ample compute/storage. Application – stored media streaming with MPEG decoding performed by the client.


Presentation Transcript


  1. Reduced Energy Decoding of MPEG Streams • Malena Mesarina, HP Labs/UCLA CS Dept • Yoshio Turner, HP Labs • Multimedia Computing and Networking, Jan 2002

  2. System Environment • Portable client – limited battery life • Multimedia server – ample compute/storage • Application – stored media streaming with MPEG decoding performed by the client

  3. Problem • Tradeoff – client energy consumption increases with media stream quality • User should be able to choose the operating point to balance quality and battery life • Goal: improve the energy/QoS tradeoff by reducing the energy consumption required for each level of media quality

  4. Approach • Idea: exploit the ample resources of servers to improve client battery life • Client supports a discrete set of voltages and clock frequencies • Lower voltage ⇒ lower speed, lower energy consumption • Dynamic Voltage Scaling – DVS • Server pre-processes (offline) stored media • Computes frame decoding order • Assigns voltage/frequency per frame • Transmits schedule to client for DVS execution

  5. Contributions • New DVS scheduling algorithm • Minimizes CPU energy consumption • Satisfies timing constraints • Satisfies buffering constraints • Quantification of the energy-QoS tradeoff • Evaluation of the impact of DVS and client design parameters (processor speed, buffering) on the energy-QoS tradeoff

  6. Decoding Hardware Organization • [Figure: input FIFO, decoder, reference buffers (past: I0, future: P1) with frame B3, video display buffer, audio display buffer] • Decoding order: I0 P1 B2 B3 P2 ...

  7. Naïve Scheduling is Bad • [Figure: voltage plots for the audio and video tasks] • Naive scheduling = EDF task order + greedy voltage assignment
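A minimal Python sketch of the naive scheme named on this slide may help show why it wastes energy. It rests on several assumptions that are not in the slides: "greedy" is read as picking, for each task in EDF order, the lowest voltage that still meets that task's own deadline; energy per task is modeled as cycles × V²; time is in integer units; and the function name `naive_schedule` and all task data are hypothetical.

```python
import math

def naive_schedule(tasks, settings):
    """tasks: list of (name, cycles, deadline).
    settings: list of (frequency, voltage) pairs; frequency in cycles per time unit.
    Returns (energy, plan), or None if some task cannot meet its deadline."""
    lowest_first = sorted(settings, key=lambda s: s[1])  # try lowest voltage first
    t, energy, plan = 0, 0.0, []
    for name, cycles, deadline in sorted(tasks, key=lambda x: x[2]):  # EDF order
        for freq, volt in lowest_first:
            finish = t + math.ceil(cycles / freq)
            if finish <= deadline:
                break  # greedy: first (lowest-voltage) setting that fits
        else:
            return None  # no setting meets this task's deadline
        t = finish
        energy += cycles * volt ** 2  # assumed energy model: cycles * V^2
        plan.append((name, volt, finish))
    return energy, plan

# Hypothetical two-task example: a short video task followed by a longer audio task.
result = naive_schedule([("v0", 2, 4), ("a0", 4, 5)], [(2, 2.0), (1, 1.0)])
```

In this made-up example the greedy choice runs the short task `v0` slowly and then has to run the longer task `a0` at the high voltage, for a total of 18 energy units. Running `v0` fast and `a0` slow also meets both deadlines and costs 8 + 4 = 12 units under the same model.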

  8. DVS Scheduling Algorithm • Goal: minimize energy consumption • For a uniprocessor client, find voltage-frequency settings per frame and the interleaved order of decoding frames • Subject to the following constraints • Frames within a stream are in a fixed decoding order • Frame decode interdependence (I-, P-, B-frames) • Display rates for video (33 fps) and audio (44 kHz) • Audio/Video synchronization: 80 ms • Limited client display buffer capacity

  9. DVS Scheduling Algorithm (continued) • Approach: dynamic programming • Find the energy optimal subschedule that completes the first i video and j audio tasks by time t, over search space (i,j,t). Report the best results over all possible t for the full media. • Search space is reduced by exploiting our knowledge of the constraints
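The dynamic program over (i, j, t) can be sketched in Python as follows. This is an illustrative simplification, not the authors' implementation. It assumes integer time units, an energy model of cycles × V², and per-task release times and deadlines supplied as inputs (standing in for the buffering, display-rate, and synchronization constraints). The function name `dvs_schedule` and the sample data are hypothetical.

```python
import math

def dvs_schedule(video, audio, settings):
    """video, audio: lists of (cycles, release, deadline) in fixed decode order.
    settings: list of (frequency, voltage); frequency in cycles per time unit.
    Returns the least total energy over all feasible schedules, or None."""
    n, m = len(video), len(audio)
    # best[(i, j)][t] = least energy of a subschedule that completes the
    # first i video and j audio tasks at exactly time t
    best = {(0, 0): {0: 0.0}}
    for i in range(n + 1):
        for j in range(m + 1):
            for t, e in list(best.get((i, j), {}).items()):
                # Extend with the next video task or the next audio task.
                for nxt, tasks, k in (((i + 1, j), video, i), ((i, j + 1), audio, j)):
                    if k >= len(tasks):
                        continue
                    cycles, release, deadline = tasks[k]
                    for freq, volt in settings:
                        finish = max(t, release) + math.ceil(cycles / freq)
                        if finish > deadline:
                            continue  # dead-end: deadline missed
                        cost = e + cycles * volt ** 2  # assumed energy model
                        slot = best.setdefault(nxt, {})
                        if cost < slot.get(finish, math.inf):
                            slot[finish] = cost
    final = best.get((n, m))
    return min(final.values()) if final else None  # best over all completion times t

settings = [(2, 2.0), (1, 1.0)]  # hypothetical (frequency, voltage) pairs
print(dvs_schedule([(4, 0, 4)], [(2, 0, 5)], settings))
```

States (i, j) that no feasible extension reaches never appear in `best`, which is a crude form of the dead-end pruning the slides describe. The time-window bounds that make the search tractable are not modeled here.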

  10. Main Challenges • Frame decoding inter-dependencies: B-frames depend on future P-frames • Decoding order not equal to display order • Construct a mapping function from frame decode number to frame display number in order to compute correct deadlines • Limited buffer capacity • Algorithm must have overflow avoidance mechanism • Multiple voltage levels and possible frame decoding orders • Intractable search space, pruning necessary

  11. Fixed Display Buffer Capacity • Overflow prevention: translate buffer constraints to timing constraints • Assign minimum decoding start times to tasks • Suppose the display buffer is full (contains previously decoded frames) • The earliest time to enqueue (min start time) for B5 is when head frame I0 leaves the buffer to be displayed • The head frame I0 is identified using the frame display order and buffer capacity
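One way to picture the translation from buffer constraints to timing constraints is the Python sketch below. It assumes a simplified model that the slides do not spell out: frames leave the display buffer at a fixed period in display order, and a frame with display index d can be enqueued only once frame d − capacity has left. The function name `min_start_times` and its parameters are hypothetical.

```python
def min_start_times(display_index, capacity, period, first_display=0.0):
    """display_index[k]: display position of the k-th frame in decode order.
    capacity: number of display buffer slots.
    period: time between consecutive displayed frames (1 / frame rate).
    Returns the earliest enqueue time for each frame, in decode order."""
    starts = []
    for d in display_index:
        head = d - capacity  # frame whose departure frees a slot for frame d
        if head < 0:
            starts.append(0.0)  # buffer cannot be full yet
        else:
            starts.append(first_display + head * period)
    return starts

# Hypothetical decode-order frames with display positions 0, 3, 1, 2, 6, 4, 5
# and a two-slot display buffer.
starts = min_start_times([0, 3, 1, 2, 6, 4, 5], capacity=2, period=1.0)
```

Each minimum start time can then be treated as a release time by a scheduler, so the scheduler never needs to track buffer occupancy directly.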

  12. Key to Tractable Execution • Limit the number of combinations of (i, j, t) • Limit the range of subschedule completion times t (time windows) • Limit the combinations of (i,j) by detecting “dead-end” subschedules ⇒ small number of (i,j) pairs, each with small time window

  13. Limiting Completion Times: Time Window • A window represents possible completion times of i video and j audio tasks. • Lower Bound, Tmin(i,j): earliest time when the last task in both streams can complete • Upper Bound, Tmax(i,j): latest time when the last task in both streams can complete • Tmax – Tmin ~ (1/frame_rate) * buffersize

  14. Time Window Example • [Figure: time axis showing the window from tmin[i,j] to tmax[i,j] for the i-th video frame and the window from tmin[i+1,j] to tmax[i+1,j] for the (i + 1)-th video frame]

  15. Only some (i,j) subschedules lead to complete schedule • [Figure: video frame and audio frame in display, with numbered video and audio frames] • N = #frames, B = buffer size, Ts = 1/frame rate • Scheduling (i,j) = (10,14) is possible, but scheduling (i,j) = (10,15) is not possible because the audio buffer overflows • (i,j) is limited by the buffer size • Algorithm complexity: O(N * B) × O(B * Ts) = O(N * B² * Ts)

  16. Performance Evaluation: Energy vs QoS Exploration • Variability in frame execution times • Potential for energy reduction? • Energy savings vs picture quality • For what range of quality is DVS helpful? • How much improvement is in that range? • Impact of client design parameters on energy vs QoS • How does processor speed change the tradeoff? • Will extra buffering ease schedulability? Reduce energy?

  17. Frame Execution Times

  18. Energy-QoS tradeoff: Fast Processor + Fixed Buffer Size • Pentium 3 (1.9V@500MHz, 1.4V@316MHz) • Display buffers: 2 for video, 2 for audio • Scale factor = frame pixels/max frame pixels • [Figure: plots against scale factor with curves labeled dvs, hi volt, and lo volt; annotated values 47% and 19%] • Energy improvement over range of high resolution scale factors, 0.6 to 1

  19. Energy-QoS tradeoff: Slow Processor + Variable Buffer Size • Pentium 2 (1.7V@300MHz, 1.4V@225MHz) • Variable buffering: (video,audio) = (1,1), (3,3), (6,3) • Increasing buffering does not improve energy significantly • Extra buffers enable decoding of higher QoS video

  20. Summary and Conclusions • Offline algorithm finds a low energy schedule that respects: • Timing constraints (display rate, synchronization) • Limited memory at client • DVS significantly reduces energy consumption • Increasing buffer size • No impact on energy but • Enables higher video quality

  21. Future Work • Online scheduling • Offline schedule represents lower bound on energy • Exploration of other tradeoff media parameters (frame rate, display brightness) • Implementation with progressive coding schemes (JPEG2000)

  22. Experimental Setup • Fixed voltage/frequency processors: P3 and P2 • Computed time/energy per frame at fixed voltage • Extrapolated time/energy per frame at other operational core voltages • Assumptions: • Frequency is inversely proportional to gate delay • Cycles/frame remains constant for different frequencies • Power dissipation is constant for a given voltage setting

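  23. Extrapolation Example • Given: $V_{hi}$, $F_{hi}$, […], $T_{hi}$, $P_{hi}$ • $F_{lo} = F_{hi} \cdot \tau_{hi}/\tau_{lo} = F_{hi} \cdot \dfrac{V_{hi}/(V_{hi}-V_t)^2}{V_{lo}/(V_{lo}-V_t)^2}$ (1) • $T_{lo} = \text{cycles}/F_{lo} = F_{hi} \cdot T_{hi}/F_{lo}$ (2) • $P_{lo} = P_{hi} \cdot (F_{lo} \cdot V_{lo}^2)/(F_{hi} \cdot V_{hi}^2)$ • $E_{lo} = P_{lo} \cdot T_{lo}$ (3)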
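The extrapolation steps (1) to (3) can be written as a short Python sketch. It assumes the gate delay is proportional to V/(V − Vt)², with Vt a threshold voltage, which is how equation (1) is read here. The function name `extrapolate` and every numeric value (threshold voltage, frame time, power) are hypothetical placeholders, not measurements from the talk.

```python
def extrapolate(v_hi, f_hi, t_hi, p_hi, v_lo, v_t):
    """Extrapolate frequency, frame time, power, and energy from a measured
    high-voltage setting (v_hi, f_hi, t_hi, p_hi) to a lower voltage v_lo."""
    def delay(v):
        # Assumed gate delay model, up to a constant factor: V / (V - Vt)^2
        return v / (v - v_t) ** 2

    f_lo = f_hi * delay(v_hi) / delay(v_lo)              # (1) frequency ~ 1 / gate delay
    t_lo = f_hi * t_hi / f_lo                            # (2) cycles per frame unchanged
    p_lo = p_hi * (f_lo * v_lo ** 2) / (f_hi * v_hi ** 2)  # power ~ F * V^2
    e_lo = p_lo * t_lo                                   # (3) energy per frame
    return f_lo, t_lo, p_lo, e_lo

# Hypothetical inputs: 10 ms frame time and 20 W at 1.9 V / 500 MHz, Vt = 0.3 V.
f_lo, t_lo, p_lo, e_lo = extrapolate(1.9, 500e6, 0.010, 20.0, v_lo=1.4, v_t=0.3)
```

Under these assumptions the frequency terms cancel in the energy, so the per-frame energy scales as (Vlo/Vhi)² regardless of the delay model. The delay model only determines how much longer the frame takes to decode.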

  24. Related Work • Problem we address • Real-time scheduling of non-preemptable tasks with precedence constraints • Other real-time schedulers treat different cases • [1] Liu and Layland, “Scheduling algorithms for multiprogramming in a hard-real-time environment” • [2] Yao et al. “A scheduling model for reduced CPU energy” • No precedence constraints and preemptable tasks • [3] Hong et al. “Power optimization of variable voltage core-based systems” • Heuristics for non-preemptable tasks but no precedence constraints

  25. Frame Interdependence • Map frame number i in decoding order to frame number d(i) in display order
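A possible construction of the mapping d(i) is sketched below in Python. It assumes the usual MPEG reordering rule: a B-frame is displayed as soon as it is decoded, while an I- or P-frame is held back and displayed when the next reference frame is decoded (or at the end of the stream). The function name `display_order` and the sample frame sequence are hypothetical.

```python
def display_order(decode_types):
    """decode_types: frame types ('I', 'P', 'B') listed in decoding order.
    Returns d, where d[i] is the display position of the i-th decoded frame."""
    d = [None] * len(decode_types)
    next_slot = 0        # next free position in display order
    pending_ref = None   # reference frame decoded but not yet displayed
    for i, frame_type in enumerate(decode_types):
        if frame_type == "B":
            d[i] = next_slot  # B-frames are displayed immediately
            next_slot += 1
        else:
            # A new reference frame releases the previously held one.
            if pending_ref is not None:
                d[pending_ref] = next_slot
                next_slot += 1
            pending_ref = i
    if pending_ref is not None:
        d[pending_ref] = next_slot  # flush the last reference frame
    return d

# Decoding order I P B B P B B corresponds to display order I B B P B B P.
mapping = display_order(list("IPBBPBB"))
```

With d(i) in hand, the display deadline of the i-th decoded frame follows from its display position and the frame period.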
