
Chapter 5 - The Processor

Chapter 5 - The Processor. Machine performance factors: instruction count, clock cycle time, clock cycles per instruction (CPI). Both clock cycle time and CPI are determined by the processor implementation.

linnea

Presentation Transcript


  1. Chapter 5 - The Processor • Machine performance factors • Instruction count, clock cycle time, clock cycles per instruction (CPI) • Both clock cycle time and CPI are determined by the processor implementation • We will construct a datapath and a control unit for 2 different processor implementations of the ‘core’ instructions • Memory reference: lw/sw • Arithmetic-logical: add/sub/and/or/slt • Control: beq/j
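
The three performance factors on this slide combine in the usual way: execution time = instruction count x CPI x clock cycle time. A minimal sketch follows; the function name and the numbers are hypothetical and chosen only for illustration.

```python
def cpu_time(instruction_count, cpi, clock_cycle_time_s):
    """Execution time in seconds: instructions x cycles/instruction x seconds/cycle."""
    return instruction_count * cpi * clock_cycle_time_s


# Hypothetical figures: a single-cycle design has CPI = 1 but a long cycle,
# a multicycle design has a higher CPI but a shorter cycle.
single_cycle = cpu_time(1_000_000, cpi=1, clock_cycle_time_s=8e-9)
multi_cycle = cpu_time(1_000_000, cpi=4, clock_cycle_time_s=2e-9)
print(single_cycle, multi_cycle)
```

Both CPI and cycle time are set by the implementation, so the two designs trade one against the other.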

  2. Implementation Overview • Consider a core subset of MIPS instructions: • Integer arithmetic-logical instructions • Memory-reference instructions • Branch instructions • The good news is that much of the work is the same across different instructions • For every instruction • Use the PC to address memory and fetch the instruction • Read one or two registers, using instruction fields to choose the registers

  3. Differing Instructions • After the previous 2 steps, instructions diverge • All instruction classes (except jump) use the ALU next • Arith-log: to execute the operation • Mem-ref: to calculate the effective address • Branches: for the comparison • After using the ALU • Arith-log: write the ALU result to a register • Mem-ref: access memory to complete a store or to retrieve the word being loaded • Branch: may need to change the next instruction address based on the comparison

  4. High-level view • Two types of functional units: • elements that operate on data values (combinational) • elements that contain state (sequential)

  5. Clocking methodology • Defines when signals can be read and when they can be written • Assume an edge-triggered clock • Clock cycles between high and low • Clock period: time for one full cycle

  6. MIPS subset implementation • Develop 2 implementations • Single long clock cycle for each instruction (simple) • Multiple clock cycles per instruction (complex) • Input / Output • Nearly all elements have 32-bit-wide inputs/outputs • Buses: signals wider than 1 bit (thick lines) • Control signals vs data signals • Notation: control in colour

  7. Building Blocks • Instruction Memory: a place to store program instructions • Program Counter (PC): holds the address of the current instruction • Adder: to increment the PC to the next instruction location

  8. The common bit • Instruction execution • Fetch instruction from memory • Increment PC to next instruction (PC += 4)
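
The fetch step shared by all instructions can be sketched as below. The dictionary used as instruction memory and the function name are assumptions for illustration; the two sample words encode add $1,$2,$3 and lw $1, 8($2).

```python
def fetch(instruction_memory, pc):
    """Read the instruction word at address pc and return it with PC + 4."""
    instruction = instruction_memory[pc]
    return instruction, pc + 4  # every instruction is 4 bytes long


# Hypothetical program: byte address -> 32-bit instruction word.
instruction_memory = {
    0: 0x00430820,  # add $1, $2, $3
    4: 0x8C410008,  # lw  $1, 8($2)
}

instruction, next_pc = fetch(instruction_memory, 0)
print(hex(instruction), next_pc)
```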

  9. R-format • add, sub, slt, and, or • E.g. add $1,$2,$3 ($1 = $2 + $3) • Need a fourth element: Register file • Contains the register state of the machine • A register can be read or written by specifying its number • 2 read ‘ports’ and 1 write ‘port’ • 32 registers => 5-bit register number • Fifth element: ALU • 3-bit operation signal

  10. R-type elements

  11. R-format execution • Only two elements required • Read 2 registers • Perform ALU operation on the contents of the registers • Write the result • [Figure: Registers block (Read register 1, Read register 2, Write register, Write data, RegWrite; outputs Read data 1, Read data 2) connected to the ALU (3-bit ALU operation input; outputs Zero and ALU result)]
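
The read, operate, write sequence can be sketched with a list standing in for the register file. The operation names passed to `alu` and the helper names are assumptions for illustration; the real ALU is driven by a 3-bit code rather than a string.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def alu(operation, a, b):
    """The five operations needed by the core R-format instructions."""
    if operation == "add":
        return (a + b) & MASK32
    if operation == "sub":
        return (a - b) & MASK32
    if operation == "and":
        return a & b
    if operation == "or":
        return a | b
    if operation == "slt":
        return 1 if to_signed(a) < to_signed(b) else 0
    raise ValueError(f"unsupported operation: {operation}")


def execute_r_type(registers, operation, rd, rs, rt):
    """Read two registers, run the ALU, write the result to rd."""
    result = alu(operation, registers[rs], registers[rt])
    registers[rd] = result  # happens only when RegWrite is asserted


registers = [0] * 32
registers[2], registers[3] = 5, 7
execute_r_type(registers, "add", rd=1, rs=2, rt=3)  # add $1, $2, $3
print(registers[1])
```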

  12. Load and Store Operations • lw $1, offset_value($2) • sw $1, offset_value($2) • Address found by adding the offset to the contents of $2 • Besides the previous elements, need • Sixth element: Data Memory Unit • State element with inputs (read address, write address, write data) and a ‘read data’ output • Seventh element: Sign Extension Unit • Memory addresses are all 32 bits, so the ‘offset’ is sign-extended from 16 to 32 bits

  13. Sign extension • Consider the 16-bit version of 2 • 0000 0000 0000 0010 • Sign extend by copying the most significant bit into the upper bits of the new 32-bit word • 0000 0000 0000 0000 0000 0000 0000 0010 • Consider the 16-bit version of -2 (take 2, flip 1->0 and 0->1, then add 1) • 1111 1111 1111 1110 • -> 1111 1111 1111 1111 1111 1111 1111 1110 • One of the ‘magic’ reasons for using 2’s complement
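
A minimal sketch of the sign extension unit, applied to the two bit patterns shown on this slide. The function name is hypothetical.

```python
def sign_extend_16_to_32(value16):
    """Copy bit 15 of a 16-bit pattern into bits 16..31 of a 32-bit word."""
    value16 &= 0xFFFF
    if value16 & 0x8000:  # most significant bit set: negative value
        return value16 | 0xFFFF0000
    return value16


for pattern in (0b0000_0000_0000_0010, 0b1111_1111_1111_1110):
    print(f"{pattern:016b} -> {sign_extend_16_to_32(pattern):032b}")
```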

  14. Sixth and Seventh logic elements

  15. Executing load and store • Address in memory is sign-extend(offset) + contents of $2 • Store: the value from $1 is written to this location • Load: the value at this location is put into $1
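
The address calculation and the two memory operations can be sketched as follows. Data memory is modelled as a dictionary keyed by byte address, which is an assumption for illustration, as are the function names.

```python
def sign_extend(offset16):
    """Interpret a 16-bit field as a signed integer."""
    offset16 &= 0xFFFF
    return offset16 - 0x10000 if offset16 & 0x8000 else offset16


def effective_address(base, offset16):
    """ALU step: base register contents plus sign-extended offset."""
    return (base + sign_extend(offset16)) & 0xFFFFFFFF


def store_word(registers, data_memory, rt, rs, offset16):
    """sw rt, offset(rs): write register rt to memory."""
    data_memory[effective_address(registers[rs], offset16)] = registers[rt]


def load_word(registers, data_memory, rt, rs, offset16):
    """lw rt, offset(rs): read memory into register rt (unwritten words read as 0)."""
    registers[rt] = data_memory.get(effective_address(registers[rs], offset16), 0)


registers = [0] * 32
data_memory = {}
registers[1], registers[2] = 42, 100
store_word(registers, data_memory, rt=1, rs=2, offset16=8)  # sw $1, 8($2)
load_word(registers, data_memory, rt=3, rs=2, offset16=8)   # lw $3, 8($2)
print(data_memory, registers[3])
```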

  16. Branch instruction • beq $1, $2, offset • Need to compare the contents of $1 and $2 • If they are equal, we need to calculate a new value for the PC using the offset • The offset is relative to the instruction following the branch (PC + 4) • So we need to add it to the already incremented PC • The offset is a word offset, not a byte offset!

  17. Word offset • If the offset were a byte offset, the last two bits would always be ‘00’, as instructions take 4 bytes of memory: • 0, 4, 8, 12, 16, 20 etc. • 00000, 00100, 01000, 01100, 10000, 10100 etc. • This is wasteful • By using a word offset, the range is extended by a factor of four

  18. Executing branch • If ($1 == $2) PC = (PC + 4) + (sign-extend(offset) << 2)
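
A minimal sketch of the branch decision, assuming the PC has already been incremented during fetch, so the word offset is added to PC + 4. Function names are hypothetical.

```python
def sign_extend(offset16):
    """Interpret a 16-bit field as a signed integer."""
    offset16 &= 0xFFFF
    return offset16 - 0x10000 if offset16 & 0x8000 else offset16


def next_pc(pc, value_a, value_b, offset16):
    """beq: take the branch target if the two register values are equal."""
    pc_plus_4 = pc + 4
    if value_a == value_b:
        # Shift left by 2 to convert the word offset to a byte offset.
        return (pc_plus_4 + (sign_extend(offset16) << 2)) & 0xFFFFFFFF
    return pc_plus_4


print(next_pc(100, 5, 5, 3))       # taken: skips 3 instructions past PC + 4
print(next_pc(100, 5, 6, 3))       # not taken
print(next_pc(100, 1, 1, 0xFFFF))  # offset -1: branches back to itself
```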

  19. Putting it all together - a simple implementation • We know what elements we need, but we need control (mysterious orange lines) • If creating a single datapath • Execute everything in one cycle • No datapath resource can be used more than once per instruction, so some elements are duplicated • Elements common to different instructions can be shared, which implies a multiplexor • Selector for multiple inputs to the same element port • [Figure: multiplexor (Mux) with labels A, B, C and S]
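
A two-input multiplexor is small enough to sketch directly. The function name and the example of choosing the second ALU input are assumptions for illustration.

```python
def mux2(select, input0, input1):
    """Pass input0 when select is 0 and input1 when select is 1."""
    return input1 if select else input0


# Example: choosing the second ALU operand.
register_value = 7   # used by arithmetic-logical instructions
extended_offset = 8  # used by loads and stores
print(mux2(0, register_value, extended_offset))
print(mux2(1, register_value, extended_offset))
```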

  20. Combined path • Key differences between arith-log and mem-ref: Second ALU input & Result register input

  21. Adding branch path • Use adder to compute target address • Another Mux for PC

  22. Control - the ALU • 5 of the 8 possible operation codes are used • Need to generate the 3-bit input code to the ALU for each instruction type • 3 instruction types imply a 2-bit control (ALUOp)
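
The two-level ALU control can be sketched as a small lookup. The slides do not list the bit encodings, so the values below are an assumption: they are the ones commonly used in textbook MIPS datapaths, together with the standard MIPS funct codes.

```python
# Assumed 3-bit ALU operation codes (5 of the 8 possible codes).
ALU_AND, ALU_OR, ALU_ADD, ALU_SUB, ALU_SLT = 0b000, 0b001, 0b010, 0b110, 0b111

# Standard MIPS funct field values for the core R-format instructions.
FUNCT_TO_ALU = {
    0b100000: ALU_ADD,  # add
    0b100010: ALU_SUB,  # sub
    0b100100: ALU_AND,  # and
    0b100101: ALU_OR,   # or
    0b101010: ALU_SLT,  # slt
}


def alu_control(alu_op, funct):
    """Map the 2-bit ALUOp (plus funct for R-format) to the 3-bit ALU code."""
    if alu_op == 0b00:  # lw/sw: add base and offset
        return ALU_ADD
    if alu_op == 0b01:  # beq: subtract and test for zero
        return ALU_SUB
    if alu_op == 0b10:  # R-format: operation chosen by the funct field
        return FUNCT_TO_ALU[funct]
    raise ValueError(f"unused ALUOp value: {alu_op:02b}")


print(f"{alu_control(0b10, 0b101010):03b}")
```

Because the output depends only on the current inputs, this is purely combinational logic.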

  23. ALU control for instruction types

  24. Main control • ALU control relatively easy (not temporal) • PLA / Simple custom controller • To define the rest of the control circuit • Identify control lines and instruction components • Before we do that, we need to look at the instruction types to understand data bus requirements

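  25. Instruction analysis
      Bits:  31-26  25-21  20-16  15-11  10-6   5-0
      R:     op     rs     rt     rd     shamt  funct
      LS:    op     rs     rt*    address (bits 15-0)
      B:     op     rs     rt     address offset (bits 15-0)
      rs is the base register for LS. * Target register of a load: bits 20-16 here, but bits 15-11 (rd) in R-format. This implies a Mux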
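
The field boundaries above translate directly into shifts and masks. This is a minimal sketch; the function name and the dictionary layout are assumptions for illustration.

```python
def decode(instruction):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op": (instruction >> 26) & 0x3F,     # bits 31-26
        "rs": (instruction >> 21) & 0x1F,     # bits 25-21
        "rt": (instruction >> 16) & 0x1F,     # bits 20-16
        "rd": (instruction >> 11) & 0x1F,     # bits 15-11 (R-format)
        "shamt": (instruction >> 6) & 0x1F,   # bits 10-6  (R-format)
        "funct": instruction & 0x3F,          # bits 5-0   (R-format)
        "address": instruction & 0xFFFF,      # bits 15-0  (LS and B)
    }


print(decode(0x00430820))  # add $1, $2, $3
print(decode(0x8C410008))  # lw  $1, 8($2)
```

All fields are extracted for every instruction; the control signals decide which of them are actually used.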

  26. What does that look like?

  27. What do the orange bits do? • RegDest • Selects which instruction field supplies the destination register number • RegWrite • Enables writing a register in the register file • ALUSrc • Source of the second ALU operand: a register or part of the instruction • PCSrc • Source of the next PC: the increment (PC + 4) or the branch target • MemRead / MemWrite • Enable reading from / writing to data memory • MemtoReg • Source of the data written to the register file: the ALU result or data memory

  28. Building the control unit • All but one of the 7 lines can be set using only the op-code bits • PCSrc is determined by the Zero output of the ALU as well as the op-code (needs an AND gate) • Besides these 7, there are 2 lines for ALUOp • To set them, all we need are the 6 bits of the op-code
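
A sketch of the main control as a lookup on the 6-bit op-code. The slides do not give the signal values, so the table is an assumption based on the standard MIPS op-codes and the usual textbook settings; `Branch` is a hypothetical name for the op-code-derived signal that is ANDed with the ALU Zero output to form PCSrc, and `None` marks a don't-care.

```python
CONTROL_TABLE = {
    0b000000: dict(RegDest=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                   MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),        # R-format
    0b100011: dict(RegDest=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),        # lw
    0b101011: dict(RegDest=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),        # sw
    0b000100: dict(RegDest=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),        # beq
}


def main_control(opcode):
    """Return the control line settings for one op-code."""
    return CONTROL_TABLE[opcode]


def pc_src(branch, alu_zero):
    """The AND gate: branch only if this is a beq and the ALU result is zero."""
    return branch & alu_zero


signals = main_control(0b000100)
print(pc_src(signals["Branch"], alu_zero=1))
```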

  29. Bunch up - inserting the control unit

  30. Truth table

  31. Sample R-type execution • Instruction fetched and PC incremented • $2 and $3 are read from register file • ALU operates on the data • The result from the ALU is written to register file
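
The four steps of this slide can be traced in a compact sketch for add $1,$2,$3. The function name and memory model are assumptions for illustration, and only add and sub are handled to keep the example short.

```python
def run_r_type(instruction_memory, registers, pc):
    """Execute one R-format instruction and return the new PC."""
    # 1. Fetch the instruction and increment the PC.
    instruction = instruction_memory[pc]
    pc += 4
    # 2. Read the two source registers named by the rs and rt fields.
    rs = (instruction >> 21) & 0x1F
    rt = (instruction >> 16) & 0x1F
    rd = (instruction >> 11) & 0x1F
    funct = instruction & 0x3F
    a, b = registers[rs], registers[rt]
    # 3. The ALU operates on the data (add and sub only in this sketch).
    if funct == 0b100000:
        result = (a + b) & 0xFFFFFFFF
    elif funct == 0b100010:
        result = (a - b) & 0xFFFFFFFF
    else:
        raise NotImplementedError(f"funct {funct:06b} not handled here")
    # 4. Write the ALU result to the destination register.
    registers[rd] = result
    return pc


registers = [0] * 32
registers[2], registers[3] = 5, 7
pc = run_r_type({0: 0x00430820}, registers, 0)  # add $1, $2, $3
print(registers[1], pc)
```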
