
b1110 Multi-Cycle CPUs

b1110 Multi-Cycle CPUs. ENGR xD52 Eric VanWyk Fall 2012. Acknowledgements. Mark L. Chang lecture notes for Computer Architecture (Olin ENGR3410) Microchip Technology Inc (Datasheet) A. Sahu , Indian Institute of Technology. Today. Recall Single-Cycle CPUs Single-Cycle Shortcomings



  1. b1110 Multi-Cycle CPUs • ENGR xD52 • Eric VanWyk • Fall 2012

  2. Acknowledgements • Mark L. Chang lecture notes for Computer Architecture (Olin ENGR3410) • Microchip Technology Inc (Datasheet) • A. Sahu, Indian Institute of Technology

  3. Today • Recall Single-Cycle CPUs • Single-Cycle Shortcomings • Multi-Cycle CPUs

  4. Execution Overview (diagram: Instruction Fetch → Instruction Decode → Operand Fetch → Execute → Result Store → Next Instruction) • Fetch instruction from memory • Decode instruction into actions/controls • Fetch/Decode operands • Compute result value or status • Push result(s) to storage • Determine next instruction

  5. Execution Overview (diagram: Instruction Fetch → Instruction Decode → Operand Fetch → Execute → Result Store → Next Instruction) • Fetch instruction from memory • Decode instruction into actions/controls • Fetch/Decode operands • Compute result value or status • Push result(s) to storage • Determine next instruction • Reference Lecture b1001

  6. Processor Overview • Overall Dataflow • PC fetches instructions • Instructions select operand registers, ALU immediate values • ALU computes values • Load/Store addresses computed in ALU • Result goes to register file or Data memory

  7. Information Rippling • Each Clock Cycle “ripples” left to right • Elements emit garbage values until they stabilize • Elements on the Left (leading edge of cycle) • Spend most of the time “bored” (stable) • Under-utilized • Elements on the Right (lagging edge of cycle) • Spend most of the time twitching as things settle • Spend unnecessary dynamic power

  8. Performance of Single-Cycle Machine • Clock speed is set by the slowest instruction • Arithmetic & Logic: PC → Instr. Memory → Reg Read → mux → ALU → mux → Reg Setup • Load: PC → Instr. Memory → Reg Read → mux → ALU → Data Memory → mux → Reg Setup • Store: PC → Instr. Memory → Reg Read → mux → ALU → Data Memory • Branch: PC → Instr. Memory → Reg Read → mux → ALU → Adder → mux → PCsetup • Jump: PC → Instr. Memory → mux → PCsetup
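
The "slowest instruction sets the clock" rule can be sketched numerically. This is a minimal sketch: the component delays are made-up illustrative values, not figures from the lecture, and the assignment of each component chain to an instruction class is one reading of the slide's diagram.

```python
# Hypothetical component delays in nanoseconds (illustrative only).
DELAY_NS = {
    "PC": 1, "Instr. Memory": 10, "Reg Read": 5, "mux": 1, "ALU": 8,
    "Adder": 4, "Data Memory": 10, "Reg Setup": 2, "PCsetup": 1,
}

# Assumed chain of components that each instruction class must pass through.
PATHS = {
    "Arithmetic & Logic": ["PC", "Instr. Memory", "Reg Read", "mux", "ALU",
                           "mux", "Reg Setup"],
    "Load": ["PC", "Instr. Memory", "Reg Read", "mux", "ALU", "Data Memory",
             "mux", "Reg Setup"],
    "Store": ["PC", "Instr. Memory", "Reg Read", "mux", "ALU", "Data Memory"],
    "Branch": ["PC", "Instr. Memory", "Reg Read", "mux", "ALU", "Adder",
               "mux", "PCsetup"],
    "Jump": ["PC", "Instr. Memory", "mux", "PCsetup"],
}


def path_delay(path):
    """Sum the delays of every component along one instruction's path."""
    return sum(DELAY_NS[component] for component in path)


def single_cycle_period(paths):
    """A single-cycle clock must be at least as long as the slowest path."""
    return max(path_delay(path) for path in paths.values())


for name, path in PATHS.items():
    print(name, path_delay(path), "ns")
print("clock period:", single_cycle_period(PATHS), "ns")
```

With these assumed delays the Load path is the longest, so every instruction, including the short Jump, pays for Load's delay.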

  9. Reducing Cycle Time (figure: one block of Acyclic Combinational Logic between two storage elements, then the same logic cut into blocks (A) and (B) with a storage element inserted between them) • Cut combinational dependency graph and insert register / latch • Do same work in N fast cycles, rather than one slow one
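
A rough sketch of what the cut buys, and what it costs. The delays and the per-register overhead below are hypothetical numbers chosen for illustration.

```python
def cycle_time(stage_delays_ns, register_overhead_ns):
    """The clock period is set by the slowest stage plus register overhead."""
    return max(stage_delays_ns) + register_overhead_ns


def total_latency(stage_delays_ns, register_overhead_ns):
    """One pass through all stages takes one clock period per stage."""
    period = cycle_time(stage_delays_ns, register_overhead_ns)
    return len(stage_delays_ns) * period


# Hypothetical: 30 ns of combinational logic, 1 ns of register overhead.
OVERHEAD_NS = 1
uncut = [30]           # one slow cycle
balanced = [15, 15]    # cut in the middle
unbalanced = [25, 5]   # cut in a poor place

for name, stages in [("uncut", uncut), ("balanced", balanced),
                     ("unbalanced", unbalanced)]:
    print(name, cycle_time(stages, OVERHEAD_NS), "ns/cycle,",
          total_latency(stages, OVERHEAD_NS), "ns total")
```

The balanced cut nearly halves the cycle time for almost the same total latency, while the unbalanced cut leaves the clock nearly as slow and roughly doubles the total time.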

  10. Goals • Free up fast instructions from slow clock curse • Load Word is Super Slow • Make better use of available resources • Reuse components (2 Memories, 2 or 3 Adders) • Free up space to speed up components • One big fast ALU, not multiple slow ALU/adders

  11. Multi-Cycle CPUs in the Wild • Common in small embedded spaces • PIC16, PIC18 • 4 bit micros (watches) • Marketing will try to confuse you • Advertise MHz • Hide CPI, Instructions per second, etc. • Datasheet: http://ww1.microchip.com/downloads/en/DeviceDoc/41213D.pdf
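
To see why an advertised MHz figure alone says little, here is a small sketch relating clock rate and CPI to instructions per second. It assumes four oscillator clocks per instruction cycle, as on classic PIC16 parts; the 8 MHz single-cycle machine is a hypothetical comparison point.

```python
def instructions_per_second(clock_hz, cycles_per_instruction):
    """Throughput is the clock rate divided by cycles per instruction (CPI)."""
    return clock_hz / cycles_per_instruction


# Assumption: 20 MHz oscillator, four clocks per instruction (PIC16-style).
pic_like = instructions_per_second(20_000_000, 4)

# Hypothetical machine with a slower clock but CPI of 1.
single_cycle = instructions_per_second(8_000_000, 1)

print("20 MHz, CPI 4:", pic_like, "instructions/s")
print(" 8 MHz, CPI 1:", single_cycle, "instructions/s")
```

Under these assumptions the part with the lower advertised clock executes more instructions per second.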

  12. Preview of White Boards to Come • We will go to the white boards “later”. • You will create the schematic necessary to run the RTL design I’m about to give you. • I’m “cheating” and giving you parts of MY answer, to make your reinvention smoother. • There is nothing sacred about MY answer. • You’re usually on the hook for the whole shebang. • If it fulfils the contract and is small/fast, awesome.

  13. Strategy • Enumerate all the stuff we have to do • Per Instruction • Highlight Common Features • Break tasks into N phases • Balance work done in each phase

  14. “Typical” Phases • IF: Instruction Fetch • ID: Instruction Decode (& register fetch) • EX: Execute • MEM: Memory access (read or write) • WB: Write Back to Register File • Other Architectures make different divisions!

  15. New Registers • Instruction Register (IR) • Instruction fetched from Memory • Data Register (DR) • Data fetched from Memory • Operands (A, B) • Fetched from Register File • Result (Res) • Result of the ALU calculation

  16. Phases: Load Word • IF: Instruction Register = Memory[PC] PC=PC+4 • ID: A = RegFile[rs] B = RegFile[rt] • EX: Result = A + sign extended immediate • MEM: DataReg = Mem[Result] • WB: RegFile[rt] = DataReg
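
The Load Word phases can be traced in a few lines of Python. This is a minimal sketch, assuming the standard MIPS I-type layout (rs in bits 25:21 as the base register, rt in bits 20:16 as the destination, a 16-bit immediate in bits 15:0). The `state` dictionary and the function names are hypothetical, not part of the lecture.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def load_word(state):
    """Run the five LW phases on state = {"pc", "mem", "regs"}."""
    # IF: fetch the instruction and advance the PC.
    ir = state["mem"][state["pc"]]
    state["pc"] += 4
    # ID: read both register operands (B is fetched but unused by LW).
    rs = (ir >> 21) & 0x1F
    rt = (ir >> 16) & 0x1F
    a = state["regs"][rs]
    b = state["regs"][rt]
    # EX: compute the effective address.
    result = (a + sign_extend16(ir)) & 0xFFFFFFFF
    # MEM: read the data word into the data register.
    data_reg = state["mem"][result]
    # WB: write the loaded word to the destination register.
    state["regs"][rt] = data_reg
    return state


# lw $8, 8($9) encodes as 0x8D280008 (opcode 0x23, rs=9, rt=8, imm=8).
state = {"pc": 0, "mem": {0x0: 0x8D280008, 0x108: 0xDEADBEEF}, "regs": [0] * 32}
state["regs"][9] = 0x100
load_word(state)
print(hex(state["regs"][8]), state["pc"])
```

Each comment block corresponds to one clock cycle, and the local variables `ir`, `a`, `b`, `result`, and `data_reg` stand in for the IR, A, B, Res, and DR registers that carry values between cycles.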

  17. Multi Cycle w/ Controls (datapath diagram) • Components: PC, Memory (Addr, Din, Dout, WrEn), IR (Rs, Rt, Rd, Imm16), Registers (Aa, Ab, Aw, Da, Db, Dw, WrEn), SignExtnd, <<2, Concat, constant 4, A, B, ALU, RES, MDR • Control signals: PCSrc, MemIn, ALUOp, ALUSrcA, ALUSrcB, PC_WE, IR_WE, Mem_WE, Reg_WE, Dst, RegIn

  18. Desk Work • Time your Multicycle design from Monday • Do symbolically first, then substitute real numbers • Remember parallel paths!

  19. Phases: ADD • IF: Instruction Register = Memory[PC] PC=PC+4 • ID: A = RegFile[rs] B = RegFile[rt] • EX: Result = A + B • MEM: • WB: RegFile[rd] = Result

  20. Phases: Store Word • IF: Instruction Register = Memory[PC] PC=PC+4 • ID: A = RegFile[rs] B = RegFile[rt] • EX: Result = A + sign extended immediate • MEM: Mem[Result] = B • WB:

  21. Phases: Branch if Equal • IF: Instruction Register = Memory[PC] PC=PC+4 • ID: A = RegFile[rs] B = RegFile[rt] Res = PC + (sign extended immediate << 2) • EX: if(A==B) PC = Res • MEM: • WB:
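
A sketch of the branch phases, assuming the usual MIPS convention that the 16-bit immediate counts words and is therefore shifted left by two before being added to the already-incremented PC. The function names are hypothetical.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_if_equal(pc, ir, regs):
    """Return the PC after a BEQ whose instruction word ir sits at address pc."""
    # IF: the PC is incremented while the instruction is fetched.
    pc = (pc + 4) & 0xFFFFFFFF
    # ID: read operands and speculatively compute the branch target.
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]
    res = (pc + (sign_extend16(ir) << 2)) & 0xFFFFFFFF
    # EX: commit the target only if the operands match.
    if a == b:
        pc = res
    return pc


regs = [0] * 32
regs[1] = 7
regs[2] = 7
# beq $1, $2, +3 encodes as 0x10220003 (opcode 4, rs=1, rt=2, imm=3).
print(hex(branch_if_equal(0x100, 0x10220003, regs)))
```

Computing the target during ID costs nothing extra because the ALU is otherwise idle in that phase; if the branch is not taken, the value in Res is simply ignored.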

  22. Phases: Jump • IF: Instruction Register = Memory[PC] PC=PC+4 • ID: PC = {PC[31:28], IR[25:0], b00} (concatenation) • EX: • MEM: • WB:
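
The jump target is a bit concatenation rather than an addition. A minimal sketch with a hypothetical helper name, assuming 32-bit addresses and that PC has already been incremented during IF:

```python
def jump_target(pc_plus_4, ir):
    """Concatenate PC[31:28], IR[25:0], and two zero bits."""
    upper = pc_plus_4 & 0xF0000000      # PC[31:28] stays in place
    index = (ir & 0x03FFFFFF) << 2      # IR[25:0] followed by b00
    return upper | index


# j with a 26-bit target field of 0x40 encodes as 0x08000040 (opcode 2).
print(hex(jump_target(0x00000004, 0x08000040)))
print(hex(jump_target(0x10000004, 0x08000040)))
```

The same instruction word lands in a different 256 MB region depending on the upper four bits of the PC, which is why those bits come from the PC and not from the instruction.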

  23. Example Control Diagram

  24. Let's Make It • Create a Multi-Cycle CPU that can do the instructions on prior pages • Jump, Branch, R-type, I-type, LW, SW • Show everything except the actual decode logic • Reference Lecture b1001, but put everything on one “page” • Sketch a Schematic for your Multi Cycle CPU • ALU, Register File, Unified Instruction/Data Memory • Sign Extender, (Optional) Shift by Two • IR, DR, A, B, Res Registers • OMG MUXES EVERYWHERE • Create a control chart (to show the actual decode logic) • Show each cycle of each instruction • Mux selects, ALU Control Lines, Register Enables • Use “X” for “Don’t Care” • We will informally present for the last 15 minutes
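
As a starting point for the control chart, the write enables can be read directly off the per-instruction RTL. The sketch below covers only the four enables named in the datapath diagram (PC_WE, IR_WE, Mem_WE, Reg_WE). Mux selects and ALU control lines depend on your own schematic, so they are left out. The table layout and the `"Zero"` placeholder (meaning "driven by the A==B comparison") are assumptions for illustration, not the lecture's answer.

```python
PHASES = ["IF", "ID", "EX", "MEM", "WB"]
ENABLES = ["PC_WE", "IR_WE", "Mem_WE", "Reg_WE"]

FETCH = {"PC_WE": 1, "IR_WE": 1}  # every instruction starts the same way

# Only enables that are asserted are listed; all others default to 0.
ASSERTED = {
    "LW":   {"IF": FETCH, "WB": {"Reg_WE": 1}},
    "ADD":  {"IF": FETCH, "WB": {"Reg_WE": 1}},
    "SW":   {"IF": FETCH, "MEM": {"Mem_WE": 1}},
    "BEQ":  {"IF": FETCH, "EX": {"PC_WE": "Zero"}},
    "JUMP": {"IF": FETCH, "ID": {"PC_WE": 1}},
}

# Last phase in which each instruction does any work.
LAST_PHASE = {"LW": "WB", "ADD": "WB", "SW": "MEM", "BEQ": "EX", "JUMP": "ID"}


def control_rows(instr):
    """Return one (phase, enables) row per phase up to the last useful one."""
    rows = []
    for phase in PHASES:
        row = {name: 0 for name in ENABLES}
        row.update(ASSERTED[instr].get(phase, {}))
        rows.append((phase, row))
        if phase == LAST_PHASE[instr]:
            break
    return rows


for instr in ASSERTED:
    for phase, row in control_rows(instr):
        print(instr, phase, row)
```

ADD is shown passing through an idle MEM phase here; a controller that skips empty phases would drop that row.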

  25. Summary • Split Single Cycle into multiple cycles • Use variable number of cycles per instruction • No More Harrison Bergeron-ing • Most Instructions become Faster • Longest Instruction gets Longer • From unbalanced phases • Costs: Registers, control logic
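
The "most instructions become faster, longest instruction gets longer" trade-off can be sketched with a few numbers. Cycle counts below count only the non-empty phases from the per-instruction slides, which assumes the controller skips empty phases. The instruction mix and both clock periods are hypothetical.

```python
# Cycles per instruction, counting only phases that do work.
CYCLES = {"LW": 5, "ADD": 4, "SW": 4, "BEQ": 3, "JUMP": 2}

# Hypothetical instruction mix (fractions of executed instructions).
MIX = {"LW": 0.25, "ADD": 0.45, "SW": 0.10, "BEQ": 0.15, "JUMP": 0.05}

# Hypothetical clock periods in ns. The multi-cycle period is a bit more than
# 40 / 5 because phases are not perfectly balanced and registers add overhead.
SINGLE_CYCLE_PERIOD_NS = 40
MULTI_CYCLE_PERIOD_NS = 9


def average_cpi(cycles, mix):
    """Weighted average cycles per instruction for a given mix."""
    return sum(cycles[name] * fraction for name, fraction in mix.items())


def average_time_ns(cycles, mix, period_ns):
    """Average time per instruction on the multi-cycle machine."""
    return average_cpi(cycles, mix) * period_ns


for name, n in CYCLES.items():
    print(name, n * MULTI_CYCLE_PERIOD_NS, "ns vs", SINGLE_CYCLE_PERIOD_NS, "ns")
print("average:", average_time_ns(CYCLES, MIX, MULTI_CYCLE_PERIOD_NS), "ns")
```

With these assumed numbers, Load Word takes longer than it did on the single-cycle machine, every other instruction is faster, and the average time per instruction drops.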

  26. Preview of things to Come • Lab 2: The ALU • How to control a Multi-Cycle CPU • Timing Concerns and Explicit Balancing • Modern CPUs: Pipelining
