Lecture 11: Pipeline Hazards II

zena
Presentation Transcript


  1. Lecture 11: Pipeline Hazards II
     Last Time:
     • Data Hazards (intro)
     Today:
     • Reducing impact of data hazards
     • Control Hazards
     • How to speed up pipelines

  2. Data Hazards With Bypassing
     Instruction sequence:
       ADD R1, R2, R3
       ADD R4, R1, R5
       SUB R5, R1, R6
       XOR R7, R8, R1
     [Pipeline diagram, instruction vs. cycle: each instruction passes through stages F R X M W; labels mark where R1 is computed and where R1 is used.]

  3. Memory Data Hazards
     Instruction sequence:
       LW R1, R2
       ADD R4, R1, R5
       SUB R7, R8, R9
     [Pipeline diagram, instruction vs. cycle: each instruction passes through stages F R X M W; labels mark where R1 is loaded and where R1 is used.]
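The two sequences above can be checked with a small stall counter. This is a toy sketch, not the lecture's own model: it assumes an in-order F R X M W pipeline, a register file that can be written in W and read in R during the same cycle, ALU results that become forwardable at the end of X, and load results that become forwardable at the end of M. The names `Instr` and `count_stalls` are hypothetical.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: Tuple[str, ...]


def count_stalls(program, bypass):
    """Count stall cycles in a toy in-order F R X M W pipeline."""
    ready = {}   # register -> earliest cycle in which a consumer may occupy R
    prev_r = 0   # cycle in which the previous instruction occupied R
    stalls = 0
    for ins in program:
        earliest = prev_r + 1  # R cycle if there were no hazard
        needed = max((ready.get(s, 0) for s in ins.srcs), default=0)
        r = max(earliest, needed)
        stalls += r - earliest
        if not bypass:
            # Assumption: value is written in W (R + 3) and readable in R that same cycle.
            ready[ins.dest] = r + 3
        elif ins.op == "LW":
            # Assumption: load data is forwarded from the end of M to the consumer's X.
            ready[ins.dest] = r + 2
        else:
            # Assumption: ALU result is forwarded from the end of X to the consumer's X.
            ready[ins.dest] = r + 1
        prev_r = r
    return stalls


alu_seq = [
    Instr("ADD", "R1", ("R2", "R3")),
    Instr("ADD", "R4", ("R1", "R5")),
    Instr("SUB", "R5", ("R1", "R6")),
    Instr("XOR", "R7", ("R8", "R1")),
]
load_seq = [
    Instr("LW", "R1", ("R2",)),
    Instr("ADD", "R4", ("R1", "R5")),
    Instr("SUB", "R7", ("R8", "R9")),
]

print(count_stalls(alu_seq, bypass=False), count_stalls(alu_seq, bypass=True))
print(count_stalls(load_seq, bypass=True))
```

Under these assumptions, bypassing removes the stalls in the ALU sequence, while the load followed immediately by its consumer still costs one cycle, which is the case the next slide addresses.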

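  4. Instruction Scheduling (Load Delay Slots)
     Instruction sequence:
       LW R1, R2
       SUB R7, R8, R9
       ADD R4, R1, R5
     [Pipeline diagram, instruction vs. cycle: each instruction passes through stages F R X M W; labels mark where R1 is loaded and where R1 is used by the ADD.]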
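The reordering on this slide can be sketched as a tiny scheduling pass. This is a minimal illustration under stated assumptions: only register dependences are modeled (no stores, branches, or memory aliasing), every instruction writes exactly one register, and the names `Instr`, `can_hoist`, and `fill_load_delay_slots` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: Tuple[str, ...]


def can_hoist(cand: Instr, skipped: List[Instr]) -> bool:
    """True if cand may move above every instruction in skipped."""
    for s in skipped:
        if s.dest in cand.srcs:   # RAW: cand reads a value that s produces
            return False
        if cand.dest in s.srcs:   # WAR: cand would overwrite a value s still reads
            return False
        if cand.dest == s.dest:   # WAW: final register value would change
            return False
    return True


def fill_load_delay_slots(program: List[Instr]) -> List[Instr]:
    """Move an independent later instruction between a load and its first user."""
    prog = list(program)
    i = 0
    while i + 1 < len(prog):
        load, user = prog[i], prog[i + 1]
        if load.op == "LW" and load.dest in user.srcs:
            for j in range(i + 2, len(prog)):
                cand = prog[j]
                # The filler must not itself need the loaded value.
                if load.dest not in cand.srcs and can_hoist(cand, prog[i + 1:j]):
                    prog.insert(i + 1, prog.pop(j))
                    break
        i += 1
    return prog


before = [
    Instr("LW", "R1", ("R2",)),
    Instr("ADD", "R4", ("R1", "R5")),
    Instr("SUB", "R7", ("R8", "R9")),
]
after = fill_load_delay_slots(before)
print([ins.op for ins in after])
```

Applied to the sequence from the previous slide, the pass moves the independent SUB between the LW and the ADD, giving the order shown here. If no independent instruction exists, the sequence is left unchanged and the stall remains.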

  5. Control Hazards (Unconditional Branch)
     Instruction sequence:
           JR R25
           ...
       XX: ADD ...
     [Pipeline diagram, instruction vs. cycle: two rows of stages F R X M W, labeled "Branch Test and Destination" and "Need Destination Here".]

  6. Reducing Control Hazards
     Instruction sequence:
           JR R25
           ...
       XX: ADD ...
     [Pipeline diagram, instruction vs. cycle: two rows of stages F R X M W, labeled "Branch Test and Destination" and "Need Destination Here".]
     • Move test logic into R stage

  7. Branch Delay Slots - A Familiar Example
     .LL5:
         sll  %o1,2,%g3         // i*4 bytes
         ld   [%o0],%g2         // load A[i]
         add  %o1,1,%o1         // i++
         ld   [%o2+%g3],%g3     // load B[i]
         cmp  %o1,99            // at end?
         add  %g2,%g3,%g2       // A[i]+B[i]
         st   %g2,[%o0]         // store A[i]
         ble  .LL5              // branch back
         add  %o0,4,%o0         // A++
         retl                   // return from pgm
         sub  %sp,-912,%sp      // deallocate stack space

  8. Branch Delay Slots
     • Since we need to have a dead cycle anyway, let's put a useful instruction there
     • Advantage:
       • Do more useful work
     • Disadvantage:
       • Exposes microarchitecture to ISA
     Delay slot left empty:
       ADD R2,R3,R4
       BNEZ R5,_loop
       NOP
     Delay slot filled:
       BNEZ R5,_loop
       ADD R2,R3,R4
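The transformation on this slide can be sketched as a simple pass that moves the instruction before a branch into an empty delay slot. It is a minimal illustration, assuming a single delay slot, only register dependences, and BNEZ as the only branch opcode; `Instr`, `BRANCH_OPS`, and `fill_branch_delay_slot` are hypothetical names.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


BRANCH_OPS = {"BNEZ"}  # assumption: the only branch opcode in this sketch


def fill_branch_delay_slot(program: List[Instr]) -> List[Instr]:
    """Replace 'X; branch; NOP' with 'branch; X' when the branch does not read X's result."""
    prog = list(program)
    i = 1
    while i + 1 < len(prog):
        before, branch, slot = prog[i - 1], prog[i], prog[i + 1]
        # An instruction already sitting in an earlier branch's slot must stay put.
        in_earlier_slot = i >= 2 and prog[i - 2].op in BRANCH_OPS
        if (branch.op in BRANCH_OPS and slot.op == "NOP"
                and before.op not in BRANCH_OPS and before.op != "NOP"
                and not in_earlier_slot
                and before.dest not in branch.srcs):
            prog[i - 1:i + 2] = [branch, before]
        i += 1
    return prog


unfilled = [
    Instr("ADD", "R2", ("R3", "R4")),
    Instr("BNEZ", None, ("R5",)),
    Instr("NOP", None, ()),
]
filled = fill_branch_delay_slot(unfilled)
print([ins.op for ins in filled])
```

The move is legal here because the delay-slot instruction executes whether or not the branch is taken, and the branch condition (R5) does not depend on the ADD. If the ADD wrote R5 instead, the pass would leave the NOP in place.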

  9. Control Hazards
     Conservatively, the pipeline waits until the branch target is computed before fetching the next instruction.
     Alternatively, we can speculate which direction and to what address the branch will go. Need to confirm speculation and back up later.
     [Pipeline diagrams: rows of stages F R X M W for the two approaches.]

  10. Predict Not Taken

  11. Example Speculative Conditional Branch
      Instruction sequence:
        BNEZ R1, LOOP
        ADD R2, R3, R4
        SUB R5, R6, R7
      [Datapath figure: IP and IP + 4 logic, I-Mem, Reg File, D-Mem, and the IR pipeline registers between stages.]

  12. Speculative Conditional Branch (Diagram)
      Instruction sequence:
        BNEZ R1, LOOP
        ADD R2, R3, R4
        SUB R5, R6, R7
      [Pipeline diagram, instruction vs. cycle: each instruction passes through stages F R X M W; labels read "Speculate Not Taken", "Condition and Dest Available Here", and "Confirm or Branch".]
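The cost of the two approaches to control hazards can be compared with a back-of-the-envelope CPI estimate. This is a sketch under stated assumptions: a base CPI of 1, a fixed penalty paid on every branch when the pipeline always waits, and the same penalty paid only on taken branches when it speculates not taken. The function names and the workload numbers (20% branches, 60% taken, 2-cycle penalty) are hypothetical and do not come from the lecture.

```python
def cpi_stall_on_branch(branch_fraction, penalty_cycles, base_cpi=1.0):
    """Every branch waits penalty_cycles before the next fetch."""
    return base_cpi + branch_fraction * penalty_cycles


def cpi_predict_not_taken(branch_fraction, taken_fraction, penalty_cycles, base_cpi=1.0):
    """Only taken branches (mispredictions) pay penalty_cycles."""
    return base_cpi + branch_fraction * taken_fraction * penalty_cycles


# Hypothetical workload: 20% of instructions are branches, 60% of them taken,
# and a wrong guess squashes 2 fetched instructions.
stall_cpi = cpi_stall_on_branch(0.20, 2)
spec_cpi = cpi_predict_not_taken(0.20, 0.60, 2)
print(f"always stall: {stall_cpi:.2f}, predict not taken: {spec_cpi:.2f}")
```

With these made-up numbers the estimate is about 1.40 for always stalling and about 1.24 for predicting not taken; the gap grows as fewer branches are taken.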

  13. R4000 Pipeline
      Stages:
        IF (I$ start)
        IS (I$ finish)
        RF (decode/opfetch)
        EX (ALU)
        DS (D$ start)
        DF (D$ finish)
        TC (tag check)
        WB (write back)
      [Datapath figure labels: Instruction Memory, Reg, Reg, Data Memory.]
      • How long is load delay?
      • How long is branch delay?
      • How many comparators are needed to implement the forwarding decisions?
      • What instruction sequences will still cause stalls?

  14. How Do We Speed up the Pipeline?
      • Pipeline too long → more ALUs (exploit ILP)
      • WAR/WAW hazards → register renaming
      • Undetermined dependencies at compile time → dynamic scheduling
      • Object code compatibility
      • Simplify compiler
      • Too many branches → better branch prediction
      • Or use predication to eliminate branches
      • Unknown dependencies (control/data) → speculate
      • Explicitly parallel architectures (EPIC)
      Register renaming example:
        ADD R1,R2,R3    →    ADD R1,R2,R3
        SUB R1,R4,R5    →    SUB R1',R4,R5
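The register renaming example on this slide (the second write to R1 becoming R1') can be sketched with a small renamer. This is a simplified illustration that assumes an unlimited supply of physical registers named P0, P1, ... and never frees them; real hardware uses a finite free list. The function name `rename_registers` and the tuple layout `(op, dest, srcs)` are hypothetical.

```python
from itertools import count


def rename_registers(program):
    """Give every destination a fresh physical register; sources use the latest mapping."""
    fresh = (f"P{n}" for n in count())  # assumption: unlimited physical registers
    current = {}                        # architectural register -> physical register
    renamed = []
    for op, dest, srcs in program:
        new_srcs = []
        for s in srcs:
            if s not in current:        # first sighting of a live-in register
                current[s] = next(fresh)
            new_srcs.append(current[s])
        current[dest] = next(fresh)     # a fresh name removes WAR/WAW conflicts on dest
        renamed.append((op, current[dest], tuple(new_srcs)))
    return renamed


program = [
    ("ADD", "R1", ("R2", "R3")),
    ("SUB", "R1", ("R4", "R5")),
    ("XOR", "R6", ("R1", "R2")),
]
for ins in rename_registers(program):
    print(ins)
```

After renaming, the ADD and SUB write different physical registers, so the WAW hazard between them disappears, while the XOR still reads the SUB's result, so the true (RAW) dependence is preserved.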

  15. Summary
      • Today:
        • Reducing impact of data hazards
        • Control Hazards
        • How to speed up pipelines
      • Next Time:
        • ILP
        • Dynamic Scheduling
