
CPE 471


nani

Presentation Transcript


  1. CPE 471 Assemblers

  2. Assembly Instructions • Four parts to an instruction: • Label(s) • Operation • Operands • Comments • We are using a flexible format • Instruction is still on one line (except label) • Spaces and tabs can appear between “tokens” • Alternative: fixed format • Label in column 1; Op in column 16; Operands in col 24, etc. CPE 471

  3. Assembly Instructions • Label definition: • Symbolic name for an instruction's address • Clarifies branching to an address and references to data addresses • Often severely restricted in length • Starts anywhere before the operation • Contains letters (a-zA-Z), digits, ‘$’, ‘_’, ‘.’

  4. Assembly Instructions • Operation field • Mnemonic for an instruction (ADD, BRA) or • Mnemonic for a pseudo-instruction (DC, EQU) • Operand field • Addresses and registers used by the instruction • In other words, what to add, where to branch • Comment field: no effect on the assembler • Used for documentation.
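A minimal sketch of splitting one flexible-format line into the four parts described above. The slides do not say which character starts a comment, so `;` is an assumption here, and `MNEMONICS` is an illustrative subset used only to decide whether the first token is a label or an operation.

```python
# Illustrative subset of mnemonics and pseudo-ops; the real tables are larger.
MNEMONICS = {"LD", "ST", "HLT", "ADD", "BRA", "DC", "DS", "EQU", "ORIG", "END"}

def parse_line(line, comment_char=";"):
    """Split a source line into label, operation, operands, and comment.

    comment_char is an assumption; the slides do not name the delimiter.
    """
    code, _, comment = line.partition(comment_char)
    tokens = code.split()  # spaces and tabs may separate tokens
    label = None
    # A leading token that is not a known mnemonic is treated as a label.
    if tokens and tokens[0].upper() not in MNEMONICS:
        label = tokens.pop(0)
    operation = tokens.pop(0).upper() if tokens else None
    # Rejoin the rest so "R1, #A" and "R1,#A" give the same operands.
    operands = [op.strip() for op in " ".join(tokens).split(",")] if tokens else []
    return {
        "label": label,
        "operation": operation,
        "operands": operands,
        "comment": comment.strip(),
    }
```

For example, `parse_line("Begin LD R0,N ; load N")` yields the label `Begin`, the operation `LD`, and the operands `R0` and `N`. A label spelled like a mnemonic would be misread by this sketch.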

  5. What's an Assembler • Translation: • Source program translated into target • Source and target define levels • Source is not directly executed (what is that?) • Target or object file is executed or translated later • Assembly language means the source is a symbolic representation of machine language • When the source is higher level the translator is usually called a compiler CPE 471

  6. Advantages of Assembly • Compared to machine code • Easier to remember (HLT vs 26577) (symbolic) • Similarly for addresses in program (symbolic) • BGT loop1 vs • BGT 6210 • Over high-level language • Access to full capabilities of the machine • testing overflow flag, test & set (atomic), registers • Speed CPE 471

  7. Advantages of Assembly • Often a holdover from when machines were expensive and people were cheap • Systems programming often done in a language like C • Old myth: "if a program will be used a lot, it should (for efficiency) be written in assembly" • Hard to write (10 lines/day independent of language) • Hard to read: high maintenance, high turnover • Reality: good compilers, fast machines

  8. Modern Approach • Write in high level language • Analyze to find where time spent • Invariably, it's a tiny part of the code • Improve that tiny part (perhaps with assembly) • Problem oriented language allows high level insights • Algorithmic insights save tremendously • Assembly programmer immersed in bit-twiddling (penny wise, pound foolish) CPE 471

  9. Types of Assemblers • Assemblers can be classified by the number of passes (scans of the source file): • One-pass assembler: some restrictions are imposed on forward referencing, such as eliminating forward references to data. • Two-pass assembler: the only restriction on forward referencing is in symbol definition, i.e., all assembler directives (e.g. EQU and ORG) that define symbols can only use symbols that are previously defined. • Multi-pass assembler: restrictions are made on the level of nesting in forward referencing.

  10. Assembler Tasks • Parse assembly instructions • Check for syntax • Tokenize string • Assign addresses to instructions • Maintain location counter • LC = eventual location in memory of this instruction • Generate machine code • Evaluate mnemonic instructions (replace w/opcode) • Evaluate operand sub-field (replace symbols with value)

  11. Assembler Tasks • Concatenate to form instructions • Process pseudo-ops • generate header record • evaluate DC, DS, … etc • Write output object file • Nothing here seems all that hard! CPE 471

  12. Example: First Attempt • Read each input line and generate machine code • Associate symbol with location counter • Lookup mnemonic and get opcode • Generate instruction • Example: CPE 471

  13. Example

          Test    ORIG  $0100
          A       EQU   16
          Begin   LD    R0,N
                  LD    R1, #A
                  ST    R0, ANS
                  HLT
          N       DC    13
          ANS     DS    1
                  END   Begin

  14. Data Structures • Location counter (LocCounter) • Search a table (OpTable) for mnemonic • Get opcode • Prepare to handle arguments (group like instructions together?) • Translate arguments • Lookup/add symbol names (SymTable) and replace with location • Watch out for relative offsets CPE 471

  15. Location Counter • Eventual address of instruction • Initialized with 0 • Increment with each instruction (see OpTable). Always two for our machine CPE 471

  16. 2-pass Assembler • Solution: 2-pass assembler • Pass #1: identify and define all labels • As location of each label is determined, save in SymTable • Requires some knowledge of instructions • Pass #2: generate object code • Whenever a label is used, replace with value saved in table CPE 471

  17. Pass I: • Determine length of machine instructions. • Keep track of Location Counter (LC). • Generate a table of symbols for pass 2. • Process some pseudo operations. • Remember literals. CPE 471
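The pass 1 bookkeeping can be sketched as follows, using the example program from slide 13. Several things here are assumptions for illustration: every instruction and every DC/DS cell takes 2 address units (slide 15 says "always two for our machine"), `$` marks a hex number, the label on ORIG takes the origin address, and lines arrive already split into `(label, op, operand)` tuples.

```python
INSTRUCTION_SIZE = 2  # slide 15: "always two for our machine"
WORD_SIZE = 2         # assumption: each DS cell is also 2 address units

def parse_number(text):
    """Assumed convention: a leading '$' marks hex, otherwise decimal."""
    return int(text[1:], 16) if text.startswith("$") else int(text)

def pass1(lines):
    """Build the symbol table from (label, op, operand) tuples."""
    symtab = {}
    lc = 0  # location counter, initialized with 0
    for label, op, operand in lines:
        if op == "ORIG":
            lc = parse_number(operand)
        # EQU gives an explicit value; every other label gets the LC.
        value = parse_number(operand) if op == "EQU" else lc
        if label is not None:
            if label in symtab:
                raise ValueError(f"duplicate symbol: {label}")
            symtab[label] = value
        if op == "END":
            break
        if op == "DS":
            lc += parse_number(operand) * WORD_SIZE
        elif op not in ("ORIG", "EQU"):
            lc += INSTRUCTION_SIZE  # instructions and DC advance the LC
    return symtab

PROGRAM = [
    ("Test", "ORIG", "$0100"),
    ("A", "EQU", "16"),
    ("Begin", "LD", "R0,N"),
    (None, "LD", "R1,#A"),
    (None, "ST", "R0,ANS"),
    (None, "HLT", None),
    ("N", "DC", "13"),
    ("ANS", "DS", "1"),
    (None, "END", "Begin"),
]
SYMTAB = pass1(PROGRAM)
```

Under these assumptions `Begin` lands at `$0100`, `N` at `$0108`, and `ANS` at `$010A`, while `A` keeps its explicit value 16.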

  18. Pass II • Look up value of symbols. • Generate machine code for instructions. • Generate data for DS, DC, and literals. • Process pseudo operations. CPE 471

  19. Pass 1 & 2 Communication • Scan text file twice • Save symbol locations first pass, then plug in 2nd • Simpler • Disadvantage: slow • 6,800 instructions medium size file • 52,000 instructions large file CPE 471

  20. Big Picture: Labs 1 & 2 Assembly File Assembler Lab 2 Object File Disassembler Lab 1 Linker Lab 3 Executable File CPE 471

  21. Table Driven Software • Many times software is very repetitive • Use functions! • Many times the information processed is repetitive • Use loops • Many times the information to write the code is repetitive but static: • Use tables

  22. Table Driven Software • Easier to modify: add new entries, change existing ones, well centralized • Code is easier, eventually, to understand • Works if there are not many exceptions CPE 471

  23. Machine Op Table • This table is static (unchanging) • For our machine: • All opcodes are 6 bits • Instruction size is one or two words. • Formats do differ

  24. Machine Op Table • Variable length instructions: • "Branch relative": PC <- PC + operand • Near: operand is 9 bits, far operand is 16 bits. • Varying formats • Fixed format makes parsing simple CPE 471

  25. Machine Op Table • Fields might include: • Name: “add” • Type: ALU • opcode: 000000 • Size: 2 or 4 • One entry for each instruction CPE 471
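One way to hold such a table is a static dictionary keyed by mnemonic. In this sketch only the `add` row comes from the slide; the other rows, the `kind` names, and the helper `lookup_op` are hypothetical placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OpEntry:
    name: str    # mnemonic, e.g. "add"
    kind: str    # instruction type, e.g. "ALU"
    opcode: int  # 6-bit opcode
    size: int    # instruction size

# Only "add" is taken from the slide; the other rows are made up for illustration.
OP_TABLE = {
    entry.name: entry
    for entry in (
        OpEntry("add", "ALU", 0b000000, 2),
        OpEntry("ld", "MEM", 0b000001, 2),      # hypothetical
        OpEntry("bra", "BRANCH", 0b000010, 2),  # hypothetical
    )
}

def lookup_op(mnemonic):
    """Return the table entry for a mnemonic, ignoring case."""
    entry = OP_TABLE.get(mnemonic.lower())
    if entry is None:
        raise KeyError(f"unknown mnemonic: {mnemonic}")
    return entry
```

Because the table is static, adding an instruction means adding one row rather than writing new code.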

  26. Symbol Table • Pass 1: • Each symbol is defined. Every time a new symbol is seen (i.e., in a label), put it in the symbol table • If already there, error

  27. Symbol Table (Pass #1) • Each symbol gives a value • Explicit: A EQU 16 • Easy: just put operand in table • Implicit: N DC 20 • Must know address of instruction • Therefore, keep track of addresses as program is scanned • Use location counter (LC) CPE 471

  28. Symbol Table (Pass #2) • Symbols in operand replaced with value • Look up symbol in symbol table • If there, replace with value • Else, ERROR CPE 471
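The pass 2 lookup can be sketched in a few lines. The function name `resolve_operand` and the number conventions (decimal digits, `$` for hex) are assumptions for illustration.

```python
def resolve_operand(operand, symtab):
    """Replace a symbolic operand with its value from the symbol table."""
    if operand.isdigit():
        return int(operand)            # already a decimal constant
    if operand.startswith("$"):
        return int(operand[1:], 16)    # assumed hex notation
    try:
        return symtab[operand]         # if there, replace with value
    except KeyError:
        raise ValueError(f"undefined symbol: {operand}") from None
```

With `symtab = {"N": 0x108}`, `resolve_operand("N", symtab)` gives `0x108`, and an unknown name raises an error instead of producing a wrong address.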

  29. Literals • Implicit allocation & initialization of memory • Allows us to put the value itself right in the instruction • Prefix with “=” • Example: • LD D3,=#16 • This means: • allocate storage at end of program • initialize this storage with the value #16 • use this address in the instruction

  30. Literal Table • Pass #1 • literals are identified and placed in the table • “name”, “value”, and “size” fields updated • duplicates can be eliminated • To complete pass #1: • literals are added to the end of the program • “address” field can now be calculated • Pass #2: • literals in instructions are replaced with the • __________ field from the Literal Table • what if the literal is not in the table? CPE 471
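A sketch of the pass 1 side of the literal table: collect literals, drop duplicates, then assign addresses once the end of the program is known. The 2-unit literal size and the function names are assumptions.

```python
LITERAL_SIZE = 2  # assumption: each literal occupies 2 address units

def collect_literals(operands):
    """Pass 1: record each distinct '=' literal with its value and size."""
    table = {}
    for operand in operands:
        if operand.startswith("="):
            value = int(operand[1:].lstrip("#"))
            # setdefault keeps the first entry, so duplicates are eliminated.
            table.setdefault(
                operand, {"value": value, "size": LITERAL_SIZE, "address": None}
            )
    return table

def place_literals(table, end_of_program):
    """End of pass 1: put literals after the program and fill in addresses."""
    address = end_of_program
    for entry in table.values():
        entry["address"] = address
        address += entry["size"]
    return address  # first free address after the literal pool
```

Collecting from `["R0", "=#16", "N", "=#16", "=#7"]` leaves two entries, and placing them at `$010C` assigns `$010C` and `$010E`.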

  31. Pseudo Operations • Unlike operations, typically do not have machine instruction (opcode) equivalents • Give information to the assembler itself. Another term: assembler directive • Not intended to implement higher level control structures (if-then, loops) • Uses: segment definition, symbol definition, memory initialization, storage allocation • Low level bookkeeping easily done by machine

  32. Pseudo-op Table • Also a static table • Some lengths are 0, some 1, others?

  33. Object File • Header record: contains information on program length, symbols that other modules may be looking for, and the name of this module. • Format:

          Column 1      H for header
          Columns 2-5   Program length in HEX + a space
          Columns 6-80  List of entry names and locations: entry name (followed by a space), entry location in HEX (4 columns + a space)

      • The first entry should be the first ORG.

  34. Object File II • Text (type T): contains the hex equivalent of the machine instruction. • Format:

          Column 1      T for text
          Columns 2-5   Address in HEX where this text should be placed in memory
          Columns 6-80  Instructions and/or data in HEX

      • Each word (byte) should be followed by an allocation character to determine the type: • S (Absolute: does not need modification), R (Relative: needs relocation), and X (External).

  35. Object File III • End (type E): indicates the end of the module and where the execution would begin. • Format:

          Column 1      E for END
          Columns 2-5   Execution start address in HEX
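A sketch of emitting text and end records in the column layout above. It assumes 16-bit words written as four uppercase hex digits, each followed by its allocation character. The header record is left out because its exact spacing is not fully specified here.

```python
def text_record(address, words):
    """Build a T record: 'T', 4 hex digits of address, then word+type pairs.

    words is a list of (value, kind) where kind is 'S', 'R', or 'X'.
    Four hex digits per word is an assumption (16-bit words).
    """
    for _, kind in words:
        if kind not in ("S", "R", "X"):
            raise ValueError(f"bad allocation character: {kind}")
    body = "".join(f"{value:04X}{kind}" for value, kind in words)
    record = f"T{address:04X}{body}"
    if len(record) > 80:
        raise ValueError("text record exceeds 80 columns")
    return record

def end_record(start_address):
    """Build an E record: 'E' followed by the execution start address."""
    return f"E{start_address:04X}"
```

For instance, `text_record(0x0100, [(0x1234, "S"), (0x0108, "R")])` returns `T01001234S0108R`, and `end_record(0x0100)` returns `E0100`.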

  36. Information Flow Pass 1/2 CPE 471

  37. Two-Pass Assembler: Limitations • Q: Does our 2-pass approach solve all forward-reference problems? • A: no! Something is still broken…

  38. Forward Reference Restriction • To avoid this trouble, impose a restriction: • ------------- • What about DS? Is a forward symbol allowed as the operand? Consider: • X EQU Y • Y EQU 0 • Y EQU 0 • X EQU Y CPE 471
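The two orderings above behave differently if EQU values are computed as each line is read in pass 1. This sketch assumes an EQU operand is either a decimal number or a single symbol; `define_equates` is a hypothetical helper.

```python
def define_equates(lines):
    """Evaluate (name, operand) EQU lines in order, as pass 1 would."""
    symtab = {}
    for name, operand in lines:
        if operand.isdigit():
            symtab[name] = int(operand)
        elif operand in symtab:
            symtab[name] = symtab[operand]  # operand was defined earlier
        else:
            # The operand's value is not known yet at this point in pass 1.
            raise ValueError(f"{name}: forward reference to {operand}")
    return symtab
```

`define_equates([("Y", "0"), ("X", "Y")])` succeeds with both symbols equal to 0, while `define_equates([("X", "Y"), ("Y", "0")])` fails because `Y` has no value when `X` is defined.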

  39. Containers • What does this mean • We can't just generate a machine instruction from each assembly instruction -- must save info • We need to start using some data structures! • Table means any data structure or container CPE 471

  40. Absolute Programs • Programmer decides a priori where program will reside • e.g., Prog ORIG $3176 • But memory is shared • concurrent users, concurrent jobs/processes • several programs loaded simultaneously CPE 471

  41. Absolute Programs: Limitation CPE 471

  42. Absolute Programs: Limitation II • Would like the loading to be flexible • decide at load time where it goes! (not at ____________ time) • this decision is made by • __________________ • What the programmer wants: “find a free slot in memory that is big enough to fit this program” CPE 471

  43. Motivating Relocation - Example

          Prog    ORIG  0
          X       DC.W  Y
          Start   LD.L  D1,X
                  ST.L  D1,Y
                  HLT
          Y       DS.W  1
                  END   Start

      • In Memory, this program appears as:

  44. Motivating Relocation - Example

          * One slight change
          Prog    ORIG  $0100
          X       DC.W  Y
          Start   LD.L  R1,X
                  ST.L  R1,Y
                  HLT
          Y       DS.L  1
                  END   Start

      • In Memory, this program appears as:

  45. Relocation • The loader must update some (parts of) text records, but not all • after load address has been determined • The assembler does 2 things: • assemble with a load address • tell the loader which parts will need to be updated CPE 471

  46. Modification Records • One approach: define a new record type that identifies the address to be modified • We could add the following record to the object file: M address e.g. M 0004 • Also need to indicate size of quantity being relocated: • Two sizes: 9 bit pgoffset and 16 bit full word • M0000_16 • M0001_9 • M0002_9 • One disadvantage of this approach: • - CPE 471
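A sketch of how a loader might apply records written like `M0000_16` and `M0001_9`. It assumes the program image is a list of 16-bit words indexed from the start of the program, that the record address is a hex index into that list, and that relocation adds the load address to the low 9 or 16 bits of the word. The 9-bit treatment in particular is a guess at how a pgoffset field would be patched.

```python
def apply_modifications(memory, records, load_address):
    """Relocate the fields named by M records; returns the patched image."""
    for record in records:
        address_text, size_text = record[1:].split("_")
        index = int(address_text, 16)  # which word to patch
        bits = int(size_text)          # 9 (pgoffset) or 16 (full word)
        mask = (1 << bits) - 1
        word = memory[index]
        # Add the load address to the field only, wrapping within its width.
        field = ((word & mask) + load_address) & mask
        memory[index] = (word & ~mask & 0xFFFF) | field
    return memory
```

Words without a matching record are left untouched, which is how absolute values survive loading.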

  47. Alternative: Bit Masks • Use 1 bit / memory cell (or byte) • bit value is 0 means no relocation necessary • bit value is 1 means relocation necessary • Size of relocation data independent of number of records needing modification • but more densely packed (1 bit / text record) • Hard to read (debug, grade,…) CPE 471
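The bit-mask alternative can be sketched with one flag per word. The sketch assumes 16-bit words and full-word relocation; `relocate_with_mask` is a hypothetical name.

```python
def relocate_with_mask(words, mask_bits, load_address):
    """Add load_address to every word whose mask bit is 1."""
    if len(words) != len(mask_bits):
        raise ValueError("need exactly one mask bit per word")
    return [
        (word + load_address) & 0xFFFF if bit else word
        for word, bit in zip(words, mask_bits)
    ]
```

With words `[0x0004, 0x1234, 0x0006]`, mask `[1, 0, 1]`, and load address `$0100`, only the first and third words change.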

  48. Kinds of Data • Our machine has two flavors of data: • relative (to the load address) • absolute • • The first must be modified, the second not • • Let’s look at how these kinds arise… CPE 471

  49. Symbols • Some are relative: • e.g., • Some are absolute: • e.g., • Symbol Table CPE 471

  50. Searching • We must associate a name with a value. • Example: A symbol table is a collection of <key, value> pairs: • Search is given a key, return the corresponding value. • Very important for assemblers. Every line has an instruction (look up in MOT or POT). Lots of symbols (or literals) used. 50% of time searching tables. CPE 471
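Two common ways to do this search, sketched under assumptions: a binary search over a sorted static table (suited to the MOT, which never changes) and a hash-based dictionary for the symbol table, which grows during pass 1. The opcode values are placeholders.

```python
import bisect

# Static table kept sorted by key so it can be binary searched.
MOT_ENTRIES = sorted(
    [("LD", 1), ("ADD", 0), ("ST", 4), ("BRA", 2), ("HLT", 3)]  # placeholder opcodes
)
MOT_KEYS = [name for name, _ in MOT_ENTRIES]

def search_static(key):
    """Binary search: return the value for key, or None if absent."""
    i = bisect.bisect_left(MOT_KEYS, key)
    if i < len(MOT_KEYS) and MOT_KEYS[i] == key:
        return MOT_ENTRIES[i][1]
    return None

# Growing table: a dict gives average constant-time insert and lookup.
symbol_table = {}
symbol_table["N"] = 0x108
symbol_table["ANS"] = 0x10A
```

`search_static("LD")` returns 1 and `search_static("NOP")` returns `None`; `symbol_table.get("N")` returns `0x108`.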
