Virtual Private Caches

Virtual Private Caches. ISCA '07. Kyle J. Nesbit, James Laudon, James E. Smith. Presenter: Yan Li. CMP-based system: a chip-level multiprocessor implements multiple processor cores on a single chip, with multithreading support. Example: Intel Core 2 Duo E6750. CMP-based system (2): resource sharing.

Presentation Transcript


  1. Virtual Private Caches ISCA’07 Kyle J. Nesbit, James Laudon, James E. Smith Presenter: Yan Li

  2. CMP-based System • Chip-level multiprocessor (CMP) • Multiple processor cores are implemented on a single chip • Multithreading support • Example: Intel Core 2 Duo E6750

  3. CMP-based System (2) • Resource sharing • Cache capacity/bandwidth, main memory, … • Pros: Higher resource utilization • Cons: Inter-thread interference • Unpredictable performance / no QoS! • Many applications running on CMP-based systems require Quality of Service

  4. Quality of Service • QoS is required by many applications: • Soft real-time applications • Video games • Fine-grained parallel applications • Scheduling & synchronization • Server consolidation • Hosting services • QoS objective in a CMP-based system • Provide an upper bound on thread execution time regardless of other threads' activity

  5. Outline • Introduction • QoS Framework • Virtual Private Cache - VPC Arbiter • Virtual Private Cache - Capacity Manager • Performance Evaluation • Conclusions

  6. Overview of VPM • Virtual Private Machine: a set of allocated hardware resources • Processors, bandwidth, memory space… • Each thread is allocated a share of the hardware resources based on policies • Policies are set by applications & system software • Hardware mechanisms enforce the allocations

  7. System hardware VPM

  8. Objectives of VPM • Performance isolation • Thread performance is at least as good as on a real private machine with the same resources • Dynamic distribution of excess resources • Unallocated resources • Allocated but unused resources

  9. Virtual Private Cache • Microarchitecture-level mechanism • Main components • VPC Arbiter: tag & data array bandwidth sharing • VPC Capacity Manager: cache capacity sharing • Advantages • Performance isolation • Improved utilization

  10. Outline • Introduction • QoS Framework • Virtual Private Cache - VPC Arbiter • Virtual Private Cache - Capacity Manager • Performance Evaluation • Conclusions

  11. VPC Arbiter - Implementation (1) • Each data and tag array has an arbiter • Each arbiter has • A FIFO buffer for each thread • One clock register R.clk: determines arrival time • R.Li and R.Si for each thread i: virtual service time and virtual start time

  12. VPC Arbiter - Implementation (2) • R.Li: virtual service time of a request from thread i • Determined by L, the latency of the shared cache, and thread i’s fraction of the resource • R.Si: virtual start time of the next request of thread i • The time at which the resource becomes available for the next request of thread i

  13. Fair Queuing Scheduling • Request arrival • Arbiter calculation of the virtual finish time Fi • Arbiter selection: select the request with the earliest Fi
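
The three steps on this slide (arrival, virtual finish time calculation, selection) can be illustrated with a small sketch. The slide's equations did not survive in this transcript, so the formulas below are the textbook fair-queuing ones and are an assumption, not necessarily the exact form used in the paper: the virtual start time is taken as max(arrival time, R.Si), the virtual service time R.Li as L divided by the thread's share, and the virtual finish time Fi as their sum. The names `FairQueuingArbiter`, `share`, and `next_start` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ThreadState:
    share: float              # hypothetical name: thread's fraction of the resource
    next_start: float = 0.0   # plays the role of R.Si
    queue: deque = field(default_factory=deque)  # per-thread FIFO of finish times


class FairQueuingArbiter:
    """Sketch of a fair-queuing arbiter under the assumptions stated above."""

    def __init__(self, latency, shares):
        self.latency = latency  # L: latency of the shared resource
        self.threads = [ThreadState(share=s) for s in shares]

    def service_time(self, i):
        # Assumed form of R.Li: a smaller share means a longer virtual service time.
        return self.latency / self.threads[i].share

    def arrive(self, i, now):
        """Request arrival: compute and record the virtual finish time Fi."""
        t = self.threads[i]
        start = max(now, t.next_start)        # idle threads restart at 'now'
        finish = start + self.service_time(i)
        t.next_start = finish                 # next request of thread i starts here
        t.queue.append(finish)
        return finish

    def select(self):
        """Arbiter selection: pick the head request with the earliest Fi."""
        heads = [(t.queue[0], i) for i, t in enumerate(self.threads) if t.queue]
        if not heads:
            return None
        finish, i = min(heads)
        self.threads[i].queue.popleft()
        return i, finish
```

With L = 3 and shares of 0.75 and 0.25, two back-to-back requests from the first thread are both selected before a single request from the second thread, which matches the 3:1 bandwidth split.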

  14. Arbiter Fairness Policy • Excess bandwidth is distributed to the threads that have received the least excess bandwidth in the past

  15. Outline • Introduction • QoS Framework • Virtual Private Cache - VPC Arbiter • Virtual Private Cache - Capacity Manager • Performance Evaluation • Conclusions

  16. Implementation • Set-associative replacement policy • Each thread receives • The same number of sets as the shared cache • At least its allocated share of the ways in the shared cache • Replacement policy • First choice: the LRU line owned by a thread i that owns more than its allocated number of ways • Otherwise: the LRU line owned by the thread that is requesting the replacement
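
The two-step replacement policy on this slide can be sketched as a victim-selection function for one cache set. This is a minimal illustration, not the paper's hardware design: the function name `choose_victim`, the `(owner, tag)` line representation, and the per-thread `quota` of ways are all hypothetical.

```python
def choose_victim(set_lines, requester, quota):
    """Return the index of the line to evict from one cache set, or None.

    set_lines: list of (owner, tag) pairs ordered from LRU to MRU.
    requester: id of the thread whose miss triggers the replacement.
    quota:     dict mapping thread id to its allocated number of ways
               (hypothetical representation of the per-thread allocation).
    """
    # Count how many ways of this set each thread currently occupies.
    owned = {}
    for owner, _tag in set_lines:
        owned[owner] = owned.get(owner, 0) + 1

    # Step 1: the least recently used line whose owner exceeds its quota.
    for idx, (owner, _tag) in enumerate(set_lines):
        if owned[owner] > quota.get(owner, 0):
            return idx

    # Step 2: otherwise, the requester's own least recently used line.
    for idx, (owner, _tag) in enumerate(set_lines):
        if owner == requester:
            return idx

    return None  # requester owns nothing and nobody is over quota


# Thread 0 holds three ways of a 4-way set but is allocated two,
# so its LRU line (index 1) is evicted when thread 1 misses.
lines = [(1, "a"), (0, "b"), (0, "c"), (0, "d")]
victim = choose_victim(lines, requester=1, quota={0: 2, 1: 2})
```

Scanning from the LRU end means that a thread which stays within its allocation never loses a line to another thread, which is the isolation property the capacity manager is meant to provide.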

  17. Outline • Introduction • QoS Framework • Virtual Private Cache - VPC Arbiter • Virtual Private Cache - Capacity Manager • Performance Evaluation • Conclusions

  18. Experiment Setup • Two microbenchmarks to stress the performance isolation feature • Loads: load operations with continuous read hits • Stores: store operations with continuous write hits • SPEC CPU2000 benchmark suite • QoS performance metrics • IPC • Data array utilization

  19. Other Arbiters • Read over Write • Prioritize reads over writes • Read over Write, First Come First Served • Prioritize reads over writes • Then prioritize the oldest requests • Round Robin • Interleave requests from the threads uniformly and consistently
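
For comparison with the fair-queuing arbiter, the baseline policies listed on this slide can be sketched in a few lines. The request layout `(arrival, thread, is_read)` and both function names are assumptions made for illustration only.

```python
def select_row_fcfs(pending):
    """Read over Write, First Come First Served.

    pending: list of (arrival, thread, is_read) tuples (hypothetical layout).
    Reads win over writes; ties within a class go to the oldest request.
    """
    if not pending:
        return None
    return min(pending, key=lambda r: (not r[2], r[0]))


def select_round_robin(queues, last):
    """Round Robin: the next non-empty thread queue after the last one served.

    queues: one list of pending requests per thread.
    last:   index of the thread served most recently.
    """
    n = len(queues)
    for step in range(1, n + 1):
        i = (last + step) % n
        if queues[i]:
            return i
    return None
```

Neither policy looks at a per-thread allocation, so a thread that issues many requests can crowd out the others, which is the interference the VPC arbiter is designed to bound.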

  20. Microbenchmark

  21. SPEC

  22. Conclusions • VPC: a hardware mechanism within the VPM QoS framework • VPC arbiter & capacity manager • VPC can achieve global QoS objectives • Issues: • Local QoS objectives assume performance monotonicity

  23. Thank You! & Questions?