CS 7960-4 Lecture 24

Exceeding the Dataflow Limit via Value Prediction. M.H. Lipasti, J.P. Shen. Proceedings of MICRO-29, December 1996.

Presentation Transcript


  1. CS 7960-4 Lecture 24: Exceeding the Dataflow Limit via Value Prediction. M.H. Lipasti, J.P. Shen. Proceedings of MICRO-29, December 1996

  2. Dependences

  3. Value Locality [Figure labels: Avg. accuracy 49%; Avg. accuracy 61%]
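A minimal sketch of how value locality could be measured on a trace, assuming the usual definition: the fraction of dynamic instructions whose result matches one of the last `depth` distinct values produced by the same static instruction. The trace format (a list of `(pc, value)` pairs) and the function name are hypothetical, and the sketch does not attempt to reproduce the percentages on the slide.

```python
from collections import defaultdict


def value_locality(trace, depth=1):
    """Return the fraction of results already seen among the last `depth`
    distinct values produced by the same static instruction (same pc)."""
    history = defaultdict(list)  # pc -> recent distinct values, newest last
    hits = 0
    total = 0
    for pc, value in trace:
        seen = history[pc]
        total += 1
        if value in seen:
            hits += 1
            seen.remove(value)  # re-appended below as the most recent value
        seen.append(value)
        if len(seen) > depth:
            seen.pop(0)  # drop the oldest value
    return hits / total if total else 0.0


# Hypothetical trace: (pc, result) pairs for two static instructions.
trace = [(0x10, 5), (0x10, 5), (0x10, 7), (0x10, 5), (0x14, 1), (0x14, 1)]
shallow = value_locality(trace, depth=1)
deeper = value_locality(trace, depth=2)
```

A deeper history can only raise the measured locality, since every value that matches at depth 1 also matches at depth 2.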

  4. Value Predictor

  5. CT Design

  6. VP Microarchitecture
     • Value prediction happens at dispatch
     • Results are immediately bypassed to dependents, but predicted instructions also go through the pipeline
     • Dependents remain in the issue queue until verification
     • Predicted and verified results have to be broadcast to the issue queue
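The predict-at-dispatch, verify-after-execution flow can be sketched as a toy last-value predictor gated by saturating confidence counters. This is an illustration under assumptions, not the design from the slides: the table size, the 2-bit counter, the threshold, the indexing function, and all names are hypothetical.

```python
class LastValuePredictor:
    """Toy last-value predictor with a confidence (classification) table."""

    def __init__(self, entries=1024, threshold=2, max_count=3):
        self.entries = entries
        self.threshold = threshold      # minimum counter value to predict
        self.max_count = max_count      # saturating counter ceiling
        self.values = [0] * entries     # last result seen per entry
        self.counters = [0] * entries   # confidence per entry

    def _index(self, pc):
        # Assumed indexing: drop the low 2 bits, then wrap to the table size.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        """At dispatch: return a predicted value, or None if not confident."""
        i = self._index(pc)
        if self.counters[i] >= self.threshold:
            return self.values[i]
        return None

    def verify(self, pc, actual):
        """After execution: train the tables and report whether the stored
        value matched the actual result."""
        i = self._index(pc)
        correct = self.values[i] == actual
        if correct:
            self.counters[i] = min(self.counters[i] + 1, self.max_count)
        else:
            self.counters[i] = max(self.counters[i] - 1, 0)
            self.values[i] = actual
        return correct


def run_trace(predictor, pc, results):
    """Predict, then verify, for each dynamic instance of one instruction."""
    predictions = []
    for actual in results:
        predictions.append(predictor.predict(pc))
        predictor.verify(pc, actual)
    return predictions


predictions = run_trace(LastValuePredictor(), 0x400, [42, 42, 42, 42, 7])
```

In this sketch the predictor stays silent until the counter reaches the threshold, so the first few instances are not predicted; the final instance is predicted as 42 and would be caught as a misprediction at verification, at which point dependents that consumed the value would have to reissue.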

  7. Verifier
     • Similarities with pre-execution: a speculative thread and a verifier thread
     • Dependent instructions can produce results instantaneously, but the verifier executes in sequence
     • Verification takes a cycle, which can slow both the verification thread and the squashing process
     • Verification increases contention for resources and issue queue occupancy

  8. Dependent Instructions [Figure: completion and verification times for a chain of dependent instructions in three cases: no prediction, correct prediction, incorrect prediction. Timing labels as they appear in the figure: Completed: t+1, Verified: t+9; Completed: t+8; Completed: t+9; Completed: t+9; Completed: t+1, Verified: t+10; Completed: t+4, Verified: t+10; Completed: t+1, V-completed: t+8, Verified: t+9; Completed: t+1, Verified: t+10; Completed: t+4, Verified: t+10; V-completed: t+13]

  9. Configurations

  10. Results

  11. Infinite Processor Model • Limitations: branch prediction, fetch, store bandwidth, verifier thread

  12. Efficient Use of Transistors

  13. Future Work
      • Better predictions, hit rates, strides
      • Value prediction for critical instructions/high confidence predictions
      • Speculation along multiple paths in the value space
      • Value prediction for stores
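As an illustration of the "strides" item, here is a minimal sketch of a stride predictor, which predicts the last value plus the most recent difference instead of repeating the last value. The unbounded dictionary table and all names are assumptions made for brevity; a hardware table would be finite and would normally add confidence filtering.

```python
class StridePredictor:
    """Toy stride predictor: next value = last value + last observed stride."""

    def __init__(self):
        self.table = {}  # pc -> (last_value, stride)

    def predict(self, pc):
        entry = self.table.get(pc)
        if entry is None:
            return None  # no history for this instruction yet
        last, stride = entry
        return last + stride

    def update(self, pc, actual):
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = (actual, 0)  # first sighting: assume stride 0
        else:
            last, _ = entry
            self.table[pc] = (actual, actual - last)


# Hypothetical instruction producing addresses 100, 104, 108, 112.
sp = StridePredictor()
stride_predictions = []
for actual in [100, 104, 108, 112]:
    stride_predictions.append(sp.predict(0x20))
    sp.update(0x20, actual)
```

With a stride of 0 the scheme degenerates to last-value prediction, so it covers constant values as well as regularly incrementing ones such as loop counters and array addresses.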

  14. Power Implications
      • Increased activity → increased power consumption
      • Higher performance → potentially lower energy (reduced clock distribution energy)
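The trade-off between the two bullets can be made explicit with a small worked relation. The symbols are assumptions introduced here for illustration: $\bar{P}$ is average power, $T$ is execution time, $a \ge 1$ is the factor by which value prediction raises average power, and $s \ge 1$ is the speedup it delivers.

```latex
E = \bar{P}\,T,
\qquad
E_{\mathrm{vp}} = (a\,\bar{P})\,\frac{T}{s} = \frac{a}{s}\,E,
\qquad
E_{\mathrm{vp}} < E \iff s > a .
```

Under these assumptions, energy falls only when the speedup outweighs the added power. Components that draw power every cycle regardless of activity, such as clock distribution, see their energy scale directly with $T/s$.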

  15. Next Class’ Paper
      • “Energy Efficient Co-Adaptive Instruction Fetch and Issue”, A. Buyuktosunoglu, T. Karkhanis, D. H. Albonesi, P. Bose, Proceedings of ISCA-30, June 2003
