
Pipeline Control And Pipeline Hazards 15 March 2016


mholden

Presentation Transcript


  1. CDA 3101 Spring 2016: Introduction to Computer Organization. Pipeline Control and Pipeline Hazards, 15 March 2016

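  2. Control Signals
     [Figure: pipelined datapath with control signals. Components: PC, Instr. Mem, Regs, ALU, Data Mem, Sign extend, Shift left 2, two Add units, ALU Control, multiplexers, and the pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB. Control signals: PCSrc, Branch, RegWrite, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, MemtoReg. Also labeled: the ALU Zero output and the instruction fields rt[20-16] and rd[15-11].]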

  3. ALU Control Input

  4. Control Lines

  5. Control Implementation
     • Pipelining leaves the meaning of the 9 control lines unchanged
     • Set control lines (to defined values) in each stage for each instruction
     • Extend pipeline registers to include control information
     • Nothing to control during IF and ID
     • Create control information during ID
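The bullets above can be sketched in a few lines of Python. This is only an illustration, not the course's implementation: the function names (`id_stage`, `ex_stage`, `mem_stage`) are hypothetical, the control values follow the standard textbook MIPS table rather than anything in this transcript, and don't-care entries are arbitrarily set to 0. ALUOp is a 2-bit field, which is how three EX entries, three M entries, and two WB entries add up to 9 control lines.

```python
# Control values per instruction class, grouped by the stage that uses them.
# Assumption: standard textbook MIPS values; don't-cares are written as 0.
CONTROL_TABLE = {
    "R-format": {"EX": {"RegDst": 1, "ALUOp": 0b10, "ALUSrc": 0},
                 "M": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
                 "WB": {"RegWrite": 1, "MemtoReg": 0}},
    "lw":       {"EX": {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1},
                 "M": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
                 "WB": {"RegWrite": 1, "MemtoReg": 1}},
    "sw":       {"EX": {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1},
                 "M": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
                 "WB": {"RegWrite": 0, "MemtoReg": 0}},
    "beq":      {"EX": {"RegDst": 0, "ALUOp": 0b01, "ALUSrc": 0},
                 "M": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
                 "WB": {"RegWrite": 0, "MemtoReg": 0}},
}


def id_stage(kind):
    """Create all control information during ID; it is latched into ID/EX."""
    entry = CONTROL_TABLE[kind]
    return {group: dict(values) for group, values in entry.items()}


def ex_stage(id_ex):
    """EX consumes its own group; M and WB are passed on to EX/MEM."""
    return {"M": id_ex["M"], "WB": id_ex["WB"]}


def mem_stage(ex_mem):
    """MEM consumes its own group; only WB is passed on to MEM/WB."""
    return {"WB": ex_mem["WB"]}


id_ex = id_stage("lw")
ex_mem = ex_stage(id_ex)
mem_wb = mem_stage(ex_mem)
print(sorted(id_ex), sorted(ex_mem), sorted(mem_wb))
```

Each pipeline register carries only the groups that later stages still need, so the control field shrinks from stage to stage.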

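  6. Generation/Propagation of Control
     [Figure: the Control unit decodes the instruction held in IF/ID and produces three groups of control values: EX, M, and WB. ID/EX carries EX, M, and WB; EX/MEM carries M and WB; MEM/WB carries WB.]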

  7. [Figure: the pipelined datapath with control signals (PCSrc, Branch, RegWrite, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, MemtoReg), now showing the Control unit and the EX, M, and WB control groups carried in the ID/EX, EX/MEM, and MEM/WB pipeline registers.]

  8. Example
        lw  $10, 20($1)
        sub $11, $2, $3
        and $12, $4, $5
        or  $13, $6, $7
        add $14, $8, $9

  9. Cycle 1

  10. Cycle 2

  11. Cycle 3

  12. Cycle 4

  13. Cycle 5

  14. Cycle 6
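The cycle-by-cycle walk above can be reproduced with a short sketch. It assumes an ideal five-stage pipeline (IF, ID, EX, MEM, WB) with no stalls, where instruction i enters IF in cycle i + 1; the function name `occupancy` is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

PROGRAM = [
    "lw $10, 20($1)",
    "sub $11, $2, $3",
    "and $12, $4, $5",
    "or $13, $6, $7",
    "add $14, $8, $9",
]


def occupancy(program, cycle):
    """Return {stage: instruction} for a 1-based clock cycle, assuming no stalls."""
    result = {}
    for i, instr in enumerate(program):
        stage_index = cycle - 1 - i  # instruction i is fetched in cycle i + 1
        if 0 <= stage_index < len(STAGES):
            result[STAGES[stage_index]] = instr
    return result


for cycle in range(1, 7):
    print(f"Cycle {cycle}: {occupancy(PROGRAM, cycle)}")
```

Under these assumptions the pipeline is full for the first time in cycle 5, when lw is in WB and add is in IF.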

  15. Limits to Pipelining
     • Hazards prevent the next instruction from executing during its designated clock cycle
     • Structural hazards
         • HW cannot support this combination of instructions
         • Ex: Single person to fold and put clothes away
     • Control hazards
         • Branches stall the pipeline until the branch outcome is known, leaving “bubbles” in the pipeline
     • Data hazards
         • Instruction depends on result of prior instruction
         • Ex: Missing sock
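To make the data-hazard bullet concrete, here is a minimal sketch that flags read-after-write dependences between nearby instructions. The representation (tuples of name, destination, sources), the function name `raw_hazards`, and the `window` of 2 are all assumptions: a window of 2 corresponds to a five-stage pipeline without forwarding whose register file can write and read a register in the same cycle.

```python
def raw_hazards(program, window=2):
    """Return (producer_index, consumer_index, register) for each instruction
    that reads a register written by one of the `window` preceding instructions.

    program: list of (name, dest, sources); dest is None if nothing is written.
    """
    hazards = []
    for j, (_, _, sources) in enumerate(program):
        for i in range(max(0, j - window), j):
            dest = program[i][1]
            if dest is not None and dest in sources:
                hazards.append((i, j, dest))
    return hazards


# The five-instruction example from slide 8: no instruction reads an earlier result.
independent = [
    ("lw", "$10", ("$1",)),
    ("sub", "$11", ("$2", "$3")),
    ("and", "$12", ("$4", "$5")),
    ("or", "$13", ("$6", "$7")),
    ("add", "$14", ("$8", "$9")),
]

# A hypothetical dependent sequence: everything after sub reads $2.
dependent = [
    ("sub", "$2", ("$1", "$3")),
    ("and", "$12", ("$2", "$5")),
    ("or", "$13", ("$6", "$2")),
    ("add", "$14", ("$2", "$2")),
]

print(raw_hazards(independent))
print(raw_hazards(dependent))
```

In the dependent sequence only the two instructions directly after sub are flagged; the add is three slots behind and, under the stated assumptions, reads the register after it has been written.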

  16. Pipeline Hazards (Example)
     [Figure: laundry pipeline timeline from 6 PM to 2 AM in 30-minute steps; task order shows bags A through F, with a bubble.]
     • Bag A: Control puts a 90-minute bubble in the pipeline between dryer and folder (done 9pm)
     • Bag D: Cannot complete until 10:30pm (one folder available)
     • Jim’s green socks: one in [bag label missing], the other in [bag label missing]
     • [bag label missing] depends on [bag label missing]; stall since folder busy

  17. Structural Hazard 1: Single Memory
     [Figure: pipeline diagram, time (clock cycles) across and instruction order down: Load, Instr 1, Instr 2, Instr 3, Instr 4, each passing through I$, Reg, ALU, D$, Reg.]
     • IM = DM => the same memory is read twice in one clock cycle

  18. Structural Hazard 2: Register File
     [Figure: pipeline diagram, time (clock cycles) across and instruction order down: Load, Instr 1, Instr 2, Instr 3, Instr 4, each passing through I$, Reg, ALU, D$, Reg.]
     • Attempt to read and write registers simultaneously

  19. Structural Hazards: Solutions
     • Structural hazard 1: single memory
         • Two memories? Infeasible and inefficient => two Level 1 caches (instruction and data)
     • Structural hazard 2: register file
         • Register access takes less than ½ of the ALU stage time => use the following convention:
         • Always Write during first half of each cycle
         • Always Read during second half of each cycle
         • Both Read and Write can be performed during the same clock cycle (with a small delay between them)
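The write-then-read convention can be modeled in a few lines. This is a behavioral sketch only; the class name `RegisterFile` and its `cycle` method are hypothetical, and timing is reduced to the order of two steps inside one call.

```python
class RegisterFile:
    """32 MIPS registers with a write-first-half / read-second-half convention."""

    def __init__(self):
        self.regs = [0] * 32

    def cycle(self, write=None, reads=()):
        """Model one clock cycle.

        write: optional (register_number, value) from the instruction in WB.
        reads: register numbers requested by the instruction in ID.
        """
        # First half of the cycle: perform the write.
        if write is not None:
            reg, value = write
            if reg != 0:  # $0 is hard-wired to zero in MIPS
                self.regs[reg] = value
        # Second half of the cycle: perform the reads, which see the new value.
        return [self.regs[r] for r in reads]


rf = RegisterFile()
# WB writes $10 while ID reads $10 and $1 in the same cycle.
values = rf.cycle(write=(10, 42), reads=(10, 1))
print(values)
```

Because the write lands before the read, an instruction in ID sees the value written by the instruction in WB during the same cycle, so no stall is needed for that case.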

  20. Control Hazard: Branch Instr. (1/2)
     • Branch decision-making hardware in ALU stage
         • Two more instructions after the branch will always be fetched, whether or not the branch is taken
     • Desired functionality of a branch
         • If we do not take the branch, don’t waste any time and continue executing normally
         • If we take the branch, don’t execute any instructions after the branch, just go to the desired label

  21. Control Hazard: Branch Instr. (2/2)
     • Initial Solution: Stall until decision is made
         • Insert “no-op” instructions: those that accomplish nothing, just take time
         • Drawback: branches take 3 clock cycles each (assuming comparator is put in ALU stage)
     • Better Solution: Move comparator to Stage 2
         • Benefit: since branch is complete in Stage 2, only one unnecessary instruction is fetched
         • Therefore, only one no-op is needed
         • This means that branches are idle in Stages 3, 4 and 5.

  22. Control Hazard: Better Sol’n.
     • Move comparator up to Stage 2
     • Benefit: since branch is complete in Stage 2, only one unnecessary instruction is fetched, so only one no-op is needed
     • This means that branches are idle in Stages 3, 4 and 5.
     [Figure: pipeline diagram, time (clock cycles) across and instruction order down: Add, Beq, Load, with one bubble.]
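The cost difference between the two stall schemes can be estimated with simple arithmetic. The sketch below assumes a five-stage pipeline with no hazards other than branches; the function name `total_cycles` and the workload of 100 instructions with 20 branches are made up for illustration.

```python
def total_cycles(n_instructions, n_branches, bubbles_per_branch, n_stages=5):
    """Cycles to run a program when every branch inserts a fixed number of bubbles.

    The first instruction needs n_stages cycles; each later instruction and
    each bubble adds one more cycle.
    """
    return (n_stages - 1) + n_instructions + n_branches * bubbles_per_branch


# Comparator in the ALU stage: 2 bubbles, so each branch costs 3 cycles.
alu_stage = total_cycles(100, 20, bubbles_per_branch=2)
# Comparator moved to Stage 2: 1 bubble, so each branch costs 2 cycles.
stage_2 = total_cycles(100, 20, bubbles_per_branch=1)

print(alu_stage, stage_2)
```

For this hypothetical workload, moving the comparator to Stage 2 removes one bubble per branch, or 20 cycles in total.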

  23. Best: Delayed Branches (1/2)
     • If we take the branch, none of the instructions after the branch get executed by accident
     • New definition: whether or not we take the branch, the single instruction immediately following the branch always gets executed (its position is called the branch-delay slot)

  24. Best: Delayed Branches (2/2)
     • Notes on Branch-Delay Slot
         • Worst-Case Scenario: can always use a no-op
         • Better Case: can find an instruction preceding the branch which can be placed in the branch-delay slot without affecting flow of the program
             • Re-ordering instructions is a common speedup technique (done in the compiler)
             • Compiler must be smart in order to find instructions to do this
             • Usually can find such an instruction at least 50% of the time - REAL STUFF!!

  25. Nondelayed vs. Delayed
        Nondelayed Branch          Delayed Branch
          or  $8, $9, $10            add $1, $2, $3
          add $1, $2, $3             sub $4, $5, $6
          sub $4, $5, $6             beq $1, $4, Exit
          beq $1, $4, Exit           or  $8, $9, $10
          xor $10, $1, $11           xor $10, $1, $11
          . . .                      . . .
        Exit:                      Exit:
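A toy interpreter can check that the two orderings above compute the same register values. Everything here is an assumption made for illustration: the tuple encoding of instructions, the function name `run`, the register contents, and treating the Exit label as the index just past the last instruction (the elided ". . ." lines are left out). It models architectural behavior only, not pipeline timing.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub,
       "or": operator.or_, "xor": operator.xor}


def run(program, regs, delayed):
    """Execute a list of (op, a, b, c) tuples and return the final registers.

    ALU ops: a = destination, b and c = sources.
    beq: a and b = registers to compare, c = target index.
    With delayed=True, the instruction after a taken branch still executes.
    """
    regs = dict(regs)
    pc = 0
    pending = None  # target of a taken delayed branch, applied after the delay slot
    while pc < len(program):
        op, a, b, c = program[pc]
        next_pc = pc + 1
        if pending is not None:
            # This instruction sits in the delay slot: jump once it is done.
            next_pc, pending = pending, None
        if op == "beq":
            if regs.get(a, 0) == regs.get(b, 0):
                if delayed:
                    pending = c
                else:
                    next_pc = c
        else:
            regs[a] = OPS[op](regs.get(b, 0), regs.get(c, 0))
        pc = next_pc
    return regs


EXIT = 5  # index just past the last instruction stands in for the Exit label

nondelayed = [("or", "$8", "$9", "$10"), ("add", "$1", "$2", "$3"),
              ("sub", "$4", "$5", "$6"), ("beq", "$1", "$4", EXIT),
              ("xor", "$10", "$1", "$11")]

delayed = [("add", "$1", "$2", "$3"), ("sub", "$4", "$5", "$6"),
           ("beq", "$1", "$4", EXIT), ("or", "$8", "$9", "$10"),
           ("xor", "$10", "$1", "$11")]

# Hypothetical register contents: the first makes beq taken, the second not taken.
taken = {"$2": 5, "$3": 1, "$5": 9, "$6": 3, "$9": 12, "$10": 10, "$11": 7}
not_taken = dict(taken, **{"$6": 2})

for start in (taken, not_taken):
    print(run(nondelayed, start, delayed=False) == run(delayed, start, delayed=True))
```

The or instruction is safe to move into the delay slot because it neither writes a register that beq compares ($1, $4) nor depends on anything the branch changes.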

  26. Conclusions (1/2)
     • Optimal Pipeline
         • Each stage is executing part of an instruction each cycle.
         • One instruction finishes during each clock cycle.
         • On average, instructions execute far more quickly
     • What makes this work?
         • Similarities between instructions
         • Each stage takes about the same amount of time as all others
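The "optimal pipeline" claim can be put into numbers. The sketch assumes k equal-length stages, one cycle per stage, and no hazards; the function names are hypothetical.

```python
def unpipelined_cycles(n, k):
    """Each of n instructions occupies all k stages before the next one starts."""
    return n * k


def pipelined_cycles(n, k):
    """The first instruction takes k cycles; each later one finishes a cycle apart."""
    return k + (n - 1)


def speedup(n, k):
    """Ratio of unpipelined to pipelined cycle counts."""
    return unpipelined_cycles(n, k) / pipelined_cycles(n, k)


print(pipelined_cycles(5, 5))        # five instructions on five stages
print(round(speedup(1000, 5), 2))    # long runs approach a speedup of k
```

The speedup approaches the number of stages only for long instruction streams and only when the stages are balanced, which is why the equal-stage-time bullet matters.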

  27. Conclusions (2/2)
     • Pipelining is a Big Idea: a widely used concept
     • What makes it less than perfect?
         • Structural hazards: need more HW resources
         • Control hazards: delayed branch
         • Data hazards: an instruction depends on a previous one
     • Next Topic: Pipeline Performance Issues
     • Wednesday: EXAM #2

  28. Anticipate the Weekend
