
ECE3055 Computer Architecture and Operating Systems Lecture 5 Datapath



Presentation Transcript


  1. ECE3055 Computer Architecture and Operating Systems, Lecture 5: Datapath. Prof. Hsien-Hsin Sean Lee, School of Electrical and Computer Engineering, Georgia Institute of Technology

  2. The Processor: Datapath and Control
     • We're ready to look at an implementation of the MIPS
     • Simplified to contain only:
        • memory-reference instructions: lw, sw
        • arithmetic-logical instructions: add, sub, and, or, slt
        • control flow instructions: beq, j
     • Generic Implementation:
        • use the program counter (PC) to supply instruction address
        • get the instruction from memory
        • read registers
        • use the instruction to decide exactly what to do
     • All instructions use the ALU after reading the registers. Why? (memory-reference? arithmetic? control flow?)

  3. More Implementation Details
     • Abstract / Simplified View
     • Two types of functional units:
        • Elements that operate on data values (combinational)
        • Elements that contain state (sequential)

  4. State Elements
     • Unclocked vs. Clocked
     • Clocks used in synchronous logic
        • when should an element that contains state be updated?
     [Figure: clock waveform labelling the rising edge, the falling edge, and the cycle time]

  5. An unclocked state element
     • The set-reset latch
        • output depends on present inputs and also on past inputs
     [Figure: set-reset latch with inputs R and S and outputs Q and QN; its truth-table rows are labelled No Change, Reset, Set, Undefined]
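To make the four truth-table cases concrete, here is a minimal behavioural sketch. It assumes an active-high (NOR-style) latch, which the slide does not state, and the function name `sr_latch` is made up for illustration.

```python
def sr_latch(s, r, q):
    """Next value of Q for an active-high SR latch (assumed NOR-style).

    s, r: current Set and Reset inputs (0 or 1)
    q:    the value currently stored, i.e. the effect of past inputs
    """
    if s and r:
        return None   # undefined: Set and Reset asserted together
    if s:
        return 1      # set
    if r:
        return 0      # reset
    return q          # no change: output depends on past inputs only


q = 0
trace = []
for s, r in [(1, 0), (0, 0), (0, 1), (0, 0)]:
    q = sr_latch(s, r, q)
    trace.append(q)
print(trace)
```

The second and fourth steps apply the same inputs (0, 0) but give different outputs, which is what "depends on past inputs" means.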

  6. Latches and Flip-flops
     • Output is equal to the stored value inside the element (don't need to ask for permission to look at the value)
     • Change of state (value) is based on the clock
     • Latches: state changes whenever the inputs change and the clock is asserted
        • "asserted" means "logically true", which could mean electrically low
     • Flip-flop: state changes only on a clock edge (edge-triggered methodology)
     • A clocking methodology defines when signals can be read and written; we wouldn't want to read a signal at the same time it was being written

  7. D-latch
     • Two inputs:
        • the data value to be stored (D)
        • the clock signal (C) indicating when to read & store D
     • Two outputs:
        • the value of the internal state (Q) and its complement

  8. 10T D Latch w/ Transmission Gates [Figure: latch schematic with inputs D, C, and En and output Q]

  9. 10T D Latch w/ Transmission Gates: Writing Data, C=1 [Figure: latch schematic with the value D propagating to Q]

  10. 10T D Latch w/ Transmission Gates: Writing Data, C=0 [Figure: latch schematic with D_new at the input while D remains on the internal nodes and Q]

  11. Problem of Transparency [Figure: transparent D-latch with C = 1; the output is labelled oscillating and unstable]

  12. D flip-flop • Output changes only on the clock edge
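The difference between a transparent latch and an edge-triggered flip-flop can be sketched behaviourally as follows. The class names are hypothetical, and the rising edge is assumed as the active edge (a falling-edge design works the same way with the test inverted).

```python
class DLatch:
    """Level-sensitive: Q follows D for as long as C is asserted."""

    def __init__(self):
        self.q = 0

    def step(self, c, d):
        if c:                # transparent while the clock is asserted
            self.q = d
        return self.q


class DFlipFlop:
    """Edge-triggered: Q samples D only on a 0 -> 1 transition of C (assumed)."""

    def __init__(self):
        self.q = 0
        self._prev_c = 0

    def step(self, c, d):
        if c and not self._prev_c:   # rising edge detected
            self.q = d
        self._prev_c = c
        return self.q


# (C, D) samples over time; D changes while C is still high at the third step.
waveform = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]
latch, ff = DLatch(), DFlipFlop()
latch_q = [latch.step(c, d) for c, d in waveform]
ff_q = [ff.step(c, d) for c, d in waveform]
print(latch_q)
print(ff_q)
```

At the third sample the latch output follows the changed D immediately, while the flip-flop holds its value until the next rising edge.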

  13. Our Implementation
      • An edge-triggered methodology
      • Typical execution:
         • read contents of some state elements
         • send values through some combinational logic
         • write results to one or more state elements

  14. Register File • Built using D flip-flops

  15. Register File • Note: we still use the real clock to determine when to write
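A rough behavioural model of such a register file, assuming 32 registers of 32 bits, two read ports, and one write port gated by a write-enable signal. The class and method names are illustrative only; reads are combinational, and writes happen only when `clock_edge` is called.

```python
class RegisterFile:
    """Sketch of a 32 x 32-bit register file (names are hypothetical)."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # Reads are combinational: no clock is needed to look at a value.
        return self.regs[read_reg1], self.regs[read_reg2]

    def clock_edge(self, reg_write, write_reg, write_data):
        # The write takes effect only on the clock edge AND when the
        # write-enable signal is asserted.
        if reg_write and write_reg != 0:   # MIPS register $0 always reads as zero
            self.regs[write_reg] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(reg_write=True, write_reg=8, write_data=42)
rf.clock_edge(reg_write=False, write_reg=9, write_data=7)   # ignored: write disabled
print(rf.read(8, 9))
```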

  16. Simple Implementation
      • Include the functional units we need for each instruction
      [Figure: a. Registers, with 5-bit register-number inputs Read register 1, Read register 2, and Write register, data input Write data, data outputs Read data 1 and Read data 2, and control input RegWrite. b. ALU, with a 3-bit ALU control input and outputs Zero and ALU result]

  17. Building the Datapath • Use multiplexers to stitch them together

  18. R-Type Instructions (e.g. add $2, $3, $4; not JR/JALR) [Figure: single-cycle datapath with PC, instruction memory, registers, sign extend, ALU, ALU control, data memory, and the branch adder with shift left 2; control signals PCSrc, RegDst, RegWrite, ALUSrc, ALUOp, MemRead, MemWrite, MemtoReg]

  19. I-Type Instructions (e.g. lw $4, 1000($15)) [Figure: the same single-cycle datapath as on the previous slide]

  20. I-Type Instruction for Branches (e.g. beq $4, $5, Label7) [Figure: the same single-cycle datapath as on the previous slide]

  21. Control
      • Selecting the operations to perform (ALU, read/write, etc.)
      • Controlling the flow of data (multiplexer inputs)
      • Information comes from the 32 bits of the instruction
      • Example: add $8, $17, $18
        Instruction format:
           op     | rs    | rt    | rd    | shamt | funct
           000000 | 10001 | 10010 | 01000 | 00000 | 100000
      • ALU's operation based on instruction type and function code
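As a quick check of the field layout, the sketch below splits the 32-bit word from the add $8, $17, $18 example into its six R-format fields. The helper name `decode_r_type` is made up; the bit positions are the standard MIPS R-format ones.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-format instruction word into its fields."""
    return {
        "op":    (word >> 26) & 0x3F,   # bits 31-26
        "rs":    (word >> 21) & 0x1F,   # bits 25-21
        "rt":    (word >> 16) & 0x1F,   # bits 20-16
        "rd":    (word >> 11) & 0x1F,   # bits 15-11
        "shamt": (word >> 6) & 0x1F,    # bits 10-6
        "funct": word & 0x3F,           # bits 5-0
    }


# The bit pattern from the slide, field by field.
word = int("000000" "10001" "10010" "01000" "00000" "100000", 2)
fields = decode_r_type(word)
print(hex(word), fields)
```

The decoded values are rs = 17, rt = 18, rd = 8, matching the source and destination registers of the example.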

  22. Control
      • e.g., what should the ALU do with this instruction?
      • Example: lw $1, 100($2)
           op | rs | rt | 16-bit offset
           35 | 2  | 1  | 100
      • ALU control input:
           000  AND
           001  OR
           010  add
           110  subtract
           111  set-on-less-than
      • Why is the code for subtract 110 and not 011? What do you need for the slt instruction?
      [Figure: ALU with a 3-bit ALU control input and outputs Zero and ALU result]
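A behavioural sketch of an ALU driven by these five control codes might look like the following. It assumes 32-bit operands and a signed comparison for slt. The function names are hypothetical, and the code models the result only, not the gate-level design.

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control, a, b):
    """Return (result, zero) for the 3-bit ALU control codes on the slide."""
    if control == 0b000:
        result = a & b
    elif control == 0b001:
        result = a | b
    elif control == 0b010:
        result = (a + b) & MASK
    elif control == 0b110:
        result = (a - b) & MASK            # subtract
    elif control == 0b111:
        result = int(to_signed(a) < to_signed(b))   # set-on-less-than
    else:
        raise ValueError(f"unused ALU control code: {control:03b}")
    return result, result == 0             # Zero output is high when result is 0


print(alu(0b010, 100, 2))       # address calculation as in lw $1, 100($2)
print(alu(0b110, 5, 5))         # equal operands give zero = True, as beq needs
print(alu(0b111, 0xFFFFFFFF, 1))
```

One hint toward the question on the slide: in the textbook ALU design the top control bit inverts the second operand, so subtract and slt share that bit with the low two bits selecting the adder or the "less" input.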

  23. Control the ALU
      • Must describe hardware to compute the 3-bit ALU control input
         • given the instruction type (ALUOp, generated from decoding inst[31:26]): 00 = lw, sw; 01 = beq; 10 = arithmetic (incl. slt)
         • and the function code for arithmetic (funct = inst[5:0])
      • Describe it using a truth table (can turn into gates):

           ALUOp1 ALUOp0 | F5 F4 F3 F2 F1 F0 | ALU control | Instruction (operation)
           0      0      | X  X  X  X  X  X  | 010         | lw/sw (add)
           X      1      | X  X  X  X  X  X  | 110         | beq (sub)
           1      X      | X  X  0  0  0  0  | 010         | arith (add)
           1      X      | X  X  0  0  1  0  | 110         | arith (sub)
           1      X      | X  X  0  1  0  0  | 000         | arith (and)
           1      X      | X  X  0  1  0  1  | 001         | arith (or)
           1      X      | X  X  1  0  1  0  | 111         | arith (slt)

      [Figure: ALU control block taking ALUOp and the funct field and producing the 3-bit ALU control input of the ALU]
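The truth table can be turned into a small function as a sanity check. This is only a sketch: the name `alu_control` is invented, and the order in which the don't-care rows are tested is an assumption that matters only for the ALUOp value 11, which the main control never produces.

```python
# Low four funct bits (F3..F0) -> ALU control, from the arithmetic rows.
FUNCT_TO_ALU = {
    0b0000: 0b010,   # add
    0b0010: 0b110,   # sub
    0b0100: 0b000,   # and
    0b0101: 0b001,   # or
    0b1010: 0b111,   # slt
}


def alu_control(alu_op, funct):
    """Map 2-bit ALUOp and 6-bit funct to the 3-bit ALU control input."""
    alu_op1 = (alu_op >> 1) & 1
    alu_op0 = alu_op & 1
    if alu_op0 == 1:
        return 0b110                    # row "X 1": beq, subtract
    if alu_op1 == 0:
        return 0b010                    # row "0 0": lw/sw, add
    low4 = funct & 0xF                  # F5 and F4 are don't-cares
    if low4 not in FUNCT_TO_ALU:
        raise ValueError(f"funct {funct:06b} is not in the truth table")
    return FUNCT_TO_ALU[low4]


print(format(alu_control(0b10, 0b100000), "03b"))   # add
print(format(alu_control(0b10, 0b101010), "03b"))   # slt
```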

  24. ALU Control • Simple combinational logic (truth tables)

  25. [Table and figure: control signal values per instruction, and the single-cycle datapath with its control unit]

           Instruction | RegDst | ALUSrc | MemtoReg | RegWrite | MemRead | MemWrite | Branch | ALUOp1 | ALUOp0
           R-format    | 1      | 0      | 0        | 1        | 0       | 0        | 0      | 1      | 0
           lw          | 0      | 1      | 1        | 1        | 1       | 0        | 0      | 0      | 0
           sw          | X      | 1      | X        | 0        | 0       | 1        | 0      | 0      | 0
           beq         | X      | 0      | X        | 0        | 0       | 0        | 1      | 0      | 1

      [Figure: single-cycle datapath in which a Control unit decodes Instruction [31–26] into RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite; annotated "Use rt not rd"]
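The control table can be written down as a lookup keyed by opcode. In this sketch the opcode for lw (35) comes from the earlier lw example; the opcodes for R-format (0), sw (43), and beq (4) are the standard MIPS values and do not appear on the slides. `None` stands for a don't-care (X), and the function name is illustrative.

```python
X = None  # don't care

SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "ALUOp1", "ALUOp0")

# opcode (inst[31:26]) -> signal values, in the column order of the table
CONTROL = {
    0:  (1, 0, 0, 1, 0, 0, 0, 1, 0),   # R-format
    35: (0, 1, 1, 1, 1, 0, 0, 0, 0),   # lw
    43: (X, 1, X, 0, 0, 1, 0, 0, 0),   # sw
    4:  (X, 0, X, 0, 0, 0, 1, 0, 1),   # beq
}


def main_control(opcode):
    """Return the control signals for one opcode as a name -> value dict."""
    if opcode not in CONTROL:
        raise ValueError(f"opcode {opcode} is not in the simplified subset")
    return dict(zip(SIGNALS, CONTROL[opcode]))


lw_signals = main_control(35)
print(lw_signals)
```

For lw, RegDst = 0 selects the rt field as the destination register, which appears to be what the "Use rt not rd" annotation refers to.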

  26. Control Unit Signals: to harness the datapath [Figure: control unit decoding Inst[31:26] into the control signals]

  27. Our Simple Control Structure
      • All of the logic is combinational
      • We wait for everything to settle down, and the right thing to be done
         • ALU might not produce “right answer” right away
         • we use write signals along with clock to determine when to write
      • Cycle time determined by length of the longest path
      • We are ignoring some details like setup and hold times

  28. Single Cycle Implementation • Calculate cycle time assuming negligible delays except: • memory (2ns), ALU and adders (2ns), register file access (1ns)
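One way to work through this calculation is sketched below. The delays are the ones on the slide; which units each instruction class passes through, and the treatment of register read and register write as separate 1 ns accesses, are assumptions based on the usual single-cycle datapath.

```python
# Delays from the slide, in nanoseconds.
DELAY_NS = {
    "instr_mem": 2,   # memory
    "reg_read": 1,    # register file access (read)
    "alu": 2,         # ALU and adders
    "data_mem": 2,    # memory
    "reg_write": 1,   # register file access (write)
}

# Assumed path through the functional units for each instruction class.
PATHS = {
    "R-format": ["instr_mem", "reg_read", "alu", "reg_write"],
    "lw":       ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "sw":       ["instr_mem", "reg_read", "alu", "data_mem"],
    "beq":      ["instr_mem", "reg_read", "alu"],
    "j":        ["instr_mem"],
}

latency_ns = {name: sum(DELAY_NS[unit] for unit in path)
              for name, path in PATHS.items()}

# In a single-cycle design every instruction takes one cycle, so the
# clock period must cover the slowest instruction.
cycle_time_ns = max(latency_ns.values())
print(latency_ns)
print(cycle_time_ns)
```

Under these assumptions lw sets the cycle time, so every other instruction wastes part of each cycle.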

  29. Where we are headed
      • Single Cycle Problems:
         • what if we had a more complicated instruction like floating point?
         • wasteful of area
      • One Solution:
         • use a “smaller” cycle time
         • have different instructions take different numbers of cycles
         • a “multicycle” datapath:
