Instruction sets

marnie

Presentation Transcript


  1. Instruction sets
     • Computer architecture taxonomy.
     • Assembly language.
     Principles of Embedded Computing System Design

  2. von Neumann architecture (P.37)
     • Memory holds both data and instructions.
     • The central processing unit (CPU) fetches instructions from memory.
     • Separating the CPU from memory is what distinguishes a programmable computer.
     • CPU registers help out: program counter (PC), instruction register (IR), general-purpose registers, etc.

  3. CPU + memory
     [Figure: the CPU sends the PC value (200) to memory on the address lines; memory returns the instruction stored at address 200, ADD r5,r1,r3, on the data lines, and the CPU holds it in the IR.]
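
The fetch step on this slide can be sketched in C. This is a minimal illustration, not a model of any real CPU: the `Instr` struct, the `OP_ADD`/`OP_HALT` opcodes, and the `Machine` layout are assumptions made for this example. Only the PC, the IR, and the `ADD r5,r1,r3` at address 200 come from the slide. For brevity the memory array holds only instructions here, whereas a von Neumann memory would hold data as well.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcodes and instruction layout (not a real encoding). */
enum { OP_HALT = 0, OP_ADD = 1 };

typedef struct {
    uint8_t op, rd, rs1, rs2;
} Instr;

#define MEM_WORDS 256

typedef struct {
    uint32_t pc;                 /* program counter: address of the next instruction */
    Instr    ir;                 /* instruction register: instruction being executed */
    int32_t  r[16];              /* general-purpose registers r0..r15 */
    Instr    memory[MEM_WORDS];  /* one instruction per address */
} Machine;

/* Repeat fetch/execute until a HALT is fetched; returns the number of fetches. */
int run(Machine *m)
{
    int cycles = 0;
    for (;;) {
        assert(m->pc < MEM_WORDS);
        m->ir = m->memory[m->pc];   /* fetch: memory[PC] -> IR */
        m->pc++;                    /* PC now names the next instruction */
        cycles++;
        switch (m->ir.op) {         /* execute whatever the IR holds */
        case OP_ADD:
            m->r[m->ir.rd & 15] = m->r[m->ir.rs1 & 15] + m->r[m->ir.rs2 & 15];
            break;
        case OP_HALT:
        default:
            return cycles;
        }
    }
}
```

Placing `{OP_ADD, 5, 1, 3}` at address 200, a halt at 201, and starting with `pc = 200` reproduces the situation in the figure: after the first fetch the IR holds the ADD and the PC has moved on.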

  4. Harvard architecture (P.38)
     [Figure: the CPU connects to a data memory through one address/data path and to a separate program memory through another; the PC supplies the program-memory address and the instruction is returned to the CPU.]

  5. von Neumann vs. Harvard
     • Harvard can’t use self-modifying code.
     • Harvard allows two simultaneous memory fetches (one instruction, one data).
     • Most DSPs use Harvard architecture for streaming data:
        • greater memory bandwidth;
        • more predictable bandwidth.

  6. RISC vs. CISC (P.38)
     • Complex instruction set computer (CISC):
        • many addressing modes;
        • many operations.
     • Reduced instruction set computer (RISC):
        • load/store;
        • pipelinable instructions.

  7. Instruction set characteristics
     • Fixed vs. variable length.
     • Addressing modes.
     • Number of operands.
     • Types of operands.

  8. Programming model
     • Programming model: registers visible to the programmer.
     • Some registers are not visible (IR).

  9. Multiple implementations
     • Successful architectures have several implementations:
        • varying clock speeds;
        • different bus widths;
        • different cache sizes;
        • etc.

  10. Assembly language (P.39)
     • One-to-one with instructions (more or less).
     • Basic features:
        • One instruction per line.
        • Labels provide names for addresses (usually in first column).
        • Instructions often start in later columns.
        • Comments run to end of line.

  11. ARM assembly language example
     label1  ADR r4,c
             LDR r0,[r4]    ; a comment
             ADR r4,d
             LDR r1,[r4]
             SUB r0,r0,r1   ; another comment
             B label1

  12. Pseudo-ops (P.40)
     • Some assembler directives don’t correspond directly to instructions:
        • Define current address.
        • Reserve storage.
        • Constants.

  13. Example
     ARM:
         BIGBLOCK % 10
     SHARC:
         .global BIGBLOCK
         .var BIGBLOCK[10]=0,0,0,0,0,0,0,0,0,0;

  14. Instruction Set Architecture
     • The ISA provides the level of abstraction shared by the software and the hardware.

  15. Crafting an ISA
     • Designing an ISA is both an art and a science.
     • ISA design deals in an extremely scarce resource: instruction bits!
     • Some things we want out of our ISA:
        • completeness
        • orthogonality
        • regularity and simplicity
        • compactness
        • ease of programming
        • ease of implementation

  16. Key ISA Decisions
     • Operations
        • how many?
        • what kinds?
     • Operands
        • how many?
        • location
        • types
        • how to specify?
     • Instruction format
        • how does the computer know what 0001 0100 1101 1111 means?
        • size
        • how many formats?
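
One way to see how a machine "knows" what 0001 0100 1101 1111 means is to split the word into fixed bit fields. The sketch below assumes a made-up 16-bit format with four 4-bit fields (opcode, destination register, two source registers); the slide does not define a format, so the layout and the `Fields`, `decode`, and `encode` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-bit format: [15:12] opcode, [11:8] rd, [7:4] rs1, [3:0] rs2. */
typedef struct {
    unsigned opcode, rd, rs1, rs2;
} Fields;

/* Pull each field out of the instruction word with a shift and a mask. */
Fields decode(uint16_t word)
{
    Fields f;
    f.opcode = (word >> 12) & 0xFu;
    f.rd     = (word >> 8)  & 0xFu;
    f.rs1    = (word >> 4)  & 0xFu;
    f.rs2    =  word        & 0xFu;
    return f;
}

/* Inverse of decode: pack the four fields back into one word. */
uint16_t encode(Fields f)
{
    assert(f.opcode < 16 && f.rd < 16 && f.rs1 < 16 && f.rs2 < 16);
    return (uint16_t)((f.opcode << 12) | (f.rd << 8) | (f.rs1 << 4) | f.rs2);
}
```

Under this assumed layout, the slide's bit pattern (0x14DF) decodes to opcode 1, rd 4, rs1 13, rs2 15. A different format would assign the same bits a different meaning, which is why the number and shape of formats is a key ISA decision.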

  17. Operand Location
     • Can classify machines into 3 types:
        • Accumulator
        • Stack
        • Registers
     • Two types of register machines:
        • register-memory
           • most operands can be registers or memory
        • load-store
           • most operations (e.g., arithmetic) are only between registers
           • explicit load and store instructions to move data between registers and memory

  18. How Many Operands?
     • Accumulator:      1 address   add A          acc <- acc + mem[A]
     • Stack:            0 address   add            tos <- tos + next
     • Register-Memory:  2 address   add Ra B       Ra <- Ra + mem[B]
                         3 address   add Ra Rb C    Ra <- Rb + mem[C]
     • Load/Store:       3 address   add Ra Rb Rc   Ra <- Rb + Rc
                                     load Ra Rb     Ra <- mem[Rb]
                                     store Ra Rb    mem[Rb] <- Ra

  19. Accumulator Architectures
     • One explicit operand per instruction
        • A <- A op M
        • A <- A op *M
        • *M <- A

  20. Stack Architectures
     • No explicit operands in ALU instructions; one in push/pop
     • A = B + C * D
         push b
         push c
         push d
         mul
         add
         pop a
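
The push/mul/add/pop sequence on this slide can be traced with a small stack model in C. This is only a sketch: the `Stack` type, the fixed depth of 16, and the `stk_` function names are assumptions for illustration, not part of any real stack machine.

```c
#include <assert.h>

#define STACK_MAX 16

typedef struct {
    int data[STACK_MAX];
    int top;                /* number of values currently on the stack */
} Stack;

/* push/pop carry the one explicit operand the slide mentions. */
void stk_push(Stack *s, int v) { assert(s->top < STACK_MAX); s->data[s->top++] = v; }
int  stk_pop(Stack *s)         { assert(s->top > 0); return s->data[--s->top]; }

/* Zero-address ALU operations: both operands come off the stack,
   and the result goes back on top. */
void stk_mul(Stack *s) { int y = stk_pop(s); int x = stk_pop(s); stk_push(s, x * y); }
void stk_add(Stack *s) { int y = stk_pop(s); int x = stk_pop(s); stk_push(s, x + y); }

/* A = B + C * D, following the slide's instruction order. */
int eval_b_plus_c_times_d(int b, int c, int d)
{
    Stack s = { {0}, 0 };
    stk_push(&s, b);        /* stack: b         */
    stk_push(&s, c);        /* stack: b, c      */
    stk_push(&s, d);        /* stack: b, c, d   */
    stk_mul(&s);            /* stack: b, c*d    */
    stk_add(&s);            /* stack: b + c*d   */
    return stk_pop(&s);     /* pop a            */
}
```

Note that operator precedence is handled entirely by the order of the pushes and operations; the ALU instructions themselves name no operands.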

  21. Register-Set based Architectures
     • ALU instructions take no memory addresses (load/store architecture); typically 3-operand ALU ops
     • C = A + B
         LOAD  R1 <- A
         LOAD  R2 <- B
         ADD   R3 <- R1 + R2
         STORE C <- R3
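
For comparison with the stack sequence, the load/store version of C = A + B can be modeled the same way. The sketch assumes a toy machine with four registers and three memory cells; the `LoadStoreMachine` type, the `ls_` helpers, and the address constants are hypothetical names chosen for this example.

```c
#include <assert.h>

enum { A_ADDR = 0, B_ADDR = 1, C_ADDR = 2, MEM_SIZE = 3, NUM_REGS = 4 };

typedef struct {
    int reg[NUM_REGS];   /* R0..R3 */
    int mem[MEM_SIZE];   /* data memory holding A, B, C */
} LoadStoreMachine;

/* Only load and store touch memory. */
void ls_load(LoadStoreMachine *m, int rd, int addr)
{
    assert(rd < NUM_REGS && addr < MEM_SIZE);
    m->reg[rd] = m->mem[addr];
}

void ls_store(LoadStoreMachine *m, int addr, int rs)
{
    assert(rs < NUM_REGS && addr < MEM_SIZE);
    m->mem[addr] = m->reg[rs];
}

/* The ALU operation names three registers and no memory address. */
void ls_add(LoadStoreMachine *m, int rd, int ra, int rb)
{
    assert(rd < NUM_REGS && ra < NUM_REGS && rb < NUM_REGS);
    m->reg[rd] = m->reg[ra] + m->reg[rb];
}

void compute_c(LoadStoreMachine *m)
{
    ls_load (m, 1, A_ADDR);   /* LOAD  R1 <- A       */
    ls_load (m, 2, B_ADDR);   /* LOAD  R2 <- B       */
    ls_add  (m, 3, 1, 2);     /* ADD   R3 <- R1 + R2 */
    ls_store(m, C_ADDR, 3);   /* STORE C  <- R3      */
}
```

The sum stays in R3 after the store, so a later use of C in the same routine would not need another load.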

  22. Addressing Modes
     • Register direct    Add R4, R3
     • Immediate          Add R4, #3
     • Displacement       Add R4, 100(R1)
     • Indirect           Add R4, (R1)
     • Indexed            Add R3, (R1 + R2)
     • Direct             Add R1, (1001)
     • Memory indirect    Add R1, @(R3)
     • Autoincrement      Add R1, (R2)+
     • Autodecrement      Add R1, -(R2)
     • Scaled             Add R1, 100(R2)[R3]
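
Several of these modes differ only in how the effective address is formed. The sketch below computes that address for five of them; the `AddrMode` enum and the `effective_address` signature are assumptions for illustration. Register direct and immediate are left out because they reference no memory, and memory indirect, autoincrement, and autodecrement are left out because they need a memory read or a register update in addition to the address calculation.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    MODE_DISPLACEMENT,   /* 100(R1)      -> disp + base                 */
    MODE_INDIRECT,       /* (R1)         -> base                        */
    MODE_INDEXED,        /* (R1 + R2)    -> base + index                */
    MODE_DIRECT,         /* (1001)       -> disp                        */
    MODE_SCALED          /* 100(R2)[R3]  -> disp + base + index * scale */
} AddrMode;

/* base and index are register contents, disp is the constant in the
   instruction, scale is the operand size in bytes (used by MODE_SCALED only). */
uint32_t effective_address(AddrMode mode, uint32_t base, uint32_t index,
                           uint32_t disp, uint32_t scale)
{
    switch (mode) {
    case MODE_DISPLACEMENT: return disp + base;
    case MODE_INDIRECT:     return base;
    case MODE_INDEXED:      return base + index;
    case MODE_DIRECT:       return disp;
    case MODE_SCALED:       return disp + base + index * scale;
    }
    assert(0 && "unknown addressing mode");
    return 0;
}
```

For example, with R2 = 2000, R3 = 3, and 4-byte operands, `Add R1, 100(R2)[R3]` would reference address 100 + 2000 + 3 * 4 = 2112.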

  23. Addressing Mode Utilization

  24. Encoding of Instruction Set

  25. Our Desired ISA
     • Load-store register architecture
     • Addressing modes
        • immediate (8-16 bits, i.e., 256-65,536 values)
        • displacement (12-16 bits, i.e., 4K-64K values)
        • register deferred (register indirect)
     • Support a reasonable number of operations
     • Don’t use condition codes
     • Fixed instruction encoding/length for performance
     • Regularity (several general-purpose registers)
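
The value counts quoted for the immediate and displacement fields follow from the field width: an n-bit field encodes 2^n distinct values. The helpers below check that arithmetic and test whether a constant fits a field. Treating the field as signed two's complement is an assumption (the slide does not say whether the fields are signed), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct values an n-bit field can encode (1 <= bits <= 31). */
uint32_t field_values(unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    return (uint32_t)1 << bits;
}

/* Nonzero if value fits in a signed two's-complement field of the given
   width (assumed interpretation; an unsigned field would span 0..2^n - 1). */
int fits_signed(int32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    int32_t lo = -((int32_t)1 << (bits - 1));
    int32_t hi =  ((int32_t)1 << (bits - 1)) - 1;
    return value >= lo && value <= hi;
}
```

Under the signed interpretation, an 8-bit immediate covers -128..127 and a 16-bit immediate covers -32,768..32,767; constants outside the range would have to be built with more than one instruction or loaded from memory.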

  26. MIPS Instruction Format

  27. ARM IS

  28. Compiler/ISA Interaction
     • Compiler is primary customer of ISA
     • Features the compiler doesn’t use are wasted
     • Register allocation is a huge contributor to performance
     • Compiler-writer’s job is made easier when ISA has
        • regularity
        • primitives, not solutions
        • simple trade-offs
     • Compiler wants
        • simplicity over power

  29. Program Usage of Addressing Modes

  30. A simple loop
     int A[100], B[100], C;

     int main(void)
     {
         int i;
         C = 10;                       /* the variable is C, not c */
         for (i = 0; i < 100; i++)
             A[i] = B[i] + C;
         return 0;
     }

  31. Unoptimized code
     C=10;
         li    r14, 10
         sw    r14, C
     for (i=0; i<100; i++)
         sw    r0, 4(sp)
     $33:
     A[i] = B[i] + C
         lw    r14, 4(sp)
         mul   r15, r14, 4
         lw    r24, B(r15)
         lw    r25, C
         addu  r8, r24, r25
         lw    r16, 4(sp)
         mul   r17, r16, 4
         sw    r8, A(r17)
         lw    r9, 4(sp)
         addu  r10, r9, 1
         sw    r10, 4(sp)
         blt   r10, 100, $33
         j     $31
     • 12 instructions per iteration

  32. Optimized code
     C=10;
         li    r14, 10
         sw    r14, C
     for (i=0; i<100; i++)
         la    r3, A
         la    r4, B
         la    r6, B+400
     $33:
     A[i] = B[i] + C
         lw    r14, 0(r4)
         addu  r15, r14, 10
         sw    r15, 0(r3)
         addu  r3, r3, 4
         addu  r4, r4, 4
         bltu  r4, r6, $33
         j     $31
     • 6 instructions per iteration
     • 4 fewer loads due to code motion, register allocation, constant propagation
     • 2 fewer multiplies due to induction variable elimination, strength reduction
     • Can you do better by hand?
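
The transformations named on this slide can also be written out at the C level. The sketch below shows the loop in its indexed form and in a pointer form that mirrors the optimized assembly. The function names are hypothetical, and the array length is fixed at 100 as on the slide. Whether a given compiler produces better code from one form than the other is not something this example claims.

```c
#include <assert.h>

#define N 100

/* Indexed form, as written on the "simple loop" slide. */
void loop_indexed(int *a, const int *b, int c)
{
    for (int i = 0; i < N; i++)
        a[i] = b[i] + c;
}

/* Pointer form mirroring the optimized assembly:
   - the index i is replaced by two walking pointers
     (induction variable elimination, strength reduction);
   - C is replaced by the constant 10 (constant propagation);
   - the loop test sits at the bottom, so it assumes N >= 1. */
void loop_pointers(int *a, const int *b)
{
    const int *end = b + N;     /* la r6, B+400 (with 4-byte ints) */
    do {
        *a = *b + 10;           /* lw, addu ..., 10, sw            */
        a++;                    /* addu r3, r3, 4                  */
        b++;                    /* addu r4, r4, 4                  */
    } while (b < end);          /* bltu r4, r6, $33                */
}
```

Both functions store the same values when `c` is 10, which is the property the optimizer has to preserve while removing the per-iteration loads and multiplies.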
