
Compiler Construction

Compiler Construction. Vana Doufexi, vdoufexi@cs.northwestern.edu, office #317 @ CS dept. Administrative info: the class webpage, http://www.cs.northwestern.edu/academics/courses/322, contains news, staff information, lecture notes & other handouts, homeworks & manuals, policies, grades.


Presentation Transcript


  1. Compiler Construction • Vana Doufexi • vdoufexi@cs.northwestern.edu • office #317 @ CS dept

  2. Administrative info • class webpage • http://www.cs.northwestern.edu/academics/courses/322 • contains: • news • staff information • lecture notes & other handouts • homeworks & manuals • policies, grades • newsgroup portal • useful links

  3. What is a compiler • A program that reads a program written in some language and translates it into a program written in some other language • Modula-2 to C • Java to bytecodes • COOL to MIPS code

  4. Why study compilers? • Application of a wide range of theoretical techniques • Good SW engineering experience • Better understand languages

  5. Features of compilers • Correctness • preserve the meaning of the code • Speed of target code • vs. speed of compilation? • Good use of resources (size, power) • Good error reporting/handling

  6. Compiler structure • Use intermediate representation (IR) • Why? • [Figure: source code → Front End → IR → Back End → target code]

  7. Compiler Structure • Front end • Recognize legal/illegal programs • report/handle errors • Generate IR • The process can be automated • Back end • Translate IR into target code • instruction selection • register allocation • instruction scheduling • many of these problems are NP-complete, so use approximations

  8. Compiler Structure • Optimization: Middle stage • goals • improve running time of generated code • improve space, power consumption, etc. • how? • perform a number of transformations on the IR • multiple passes • important: preserve meaning of code
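
As an illustration of one meaning-preserving transformation on the IR, here is a minimal sketch of constant folding. The IR shape is an assumption made for this example only: an expression is an int, a variable name (str), or a tuple `(op, left, right)`; the function name `fold` is hypothetical.

```python
def fold(node):
    """Replace operations on two integer constants with their computed value."""
    if not isinstance(node, tuple):
        return node  # a constant or a variable name: nothing to fold
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        if op == "+":
            return left + right
        if op == "*":
            return left * right
    return (op, left, right)  # not foldable: keep the operation as is


print(fold(("+", "x", ("*", 2, 3))))  # ('+', 'x', 6)
```

A real optimizer would run several such passes over the IR; this sketch shows only one of them.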

  9. The Front End • Scanning (a.k.a. lexical analysis) • recognize "words" • Parsing (a.k.a. syntax analysis) • check syntax • Semantic analysis • examine meaning (e.g. type checking) • Other issues: • symbol table (to keep track of identifiers) • error detection/reporting/recovery

  10. The Scanner • Its job: • given a character stream, recognize words (tokens) • e.g. x = 1 becomes IDENTIFIER EQUAL INTEGER • collect identifier information • e.g. IDENTIFIER corresponds to a lexeme (the actual word x) and its type (acquired from the declaration of x). • ignore white space and comments • report errors • Good news • the process can be automated
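
The scanner's job described above can be sketched with Python's `re` module. This is a hand-rolled illustration, not the output of a scanner generator; the token names follow the slide, while the `#` comment syntax and the `scan` function are assumptions made for the example.

```python
import re

# Token patterns, tried in order. SKIP covers white space and '#' comments
# (the comment syntax is an assumption); ERROR catches any other character.
TOKEN_SPEC = [
    ("INTEGER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),
    ("EQUAL", r"="),
    ("SKIP", r"\s+|#[^\n]*"),
    ("ERROR", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def scan(text):
    """Turn a character stream into a list of (token kind, lexeme) pairs."""
    tokens = []
    for m in MASTER.finditer(text):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "SKIP":
            continue  # ignore white space and comments
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {lexeme!r} at offset {m.start()}")
        tokens.append((kind, lexeme))
    return tokens


print(scan("x = 1  # set x"))
# [('IDENTIFIER', 'x'), ('EQUAL', '='), ('INTEGER', '1')]
```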

  11. The Parser • Its job: • Check and verify syntax based on specified syntax rules • e.g. IDENTIFIER LPAREN RPAREN make up an EXPRESSION. • Coming soon: how context-free grammars specify syntax • Report errors • Build IR • often a syntax tree • Good news • the process can be automated
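
To make the parser's job concrete, here is a minimal recursive-descent sketch that checks syntax and builds a syntax tree. The grammar is an assumption chosen for the example (`expr → term (PLUS term)*`, `term → INTEGER | IDENTIFIER | LPAREN expr RPAREN`), and tokens are assumed to arrive as `(kind, lexeme)` pairs from a scanner.

```python
def parse(tokens):
    """Parse a token list into a nested-tuple syntax tree or raise SyntaxError."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else "EOF"

    def expect(kind):
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind}, found {peek()} at token {pos}")
        lexeme = tokens[pos][1]
        pos += 1
        return lexeme

    def term():
        if peek() == "INTEGER":
            return ("int", int(expect("INTEGER")))
        if peek() == "IDENTIFIER":
            return ("id", expect("IDENTIFIER"))
        expect("LPAREN")
        node = expr()
        expect("RPAREN")
        return node

    def expr():
        node = term()
        while peek() == "PLUS":
            expect("PLUS")
            node = ("+", node, term())  # left-associative addition
        return node

    tree = expr()
    if peek() != "EOF":
        raise SyntaxError(f"unexpected {peek()} at token {pos}")
    return tree


tokens = [("IDENTIFIER", "x"), ("PLUS", "+"), ("LPAREN", "("),
          ("INTEGER", "1"), ("PLUS", "+"), ("IDENTIFIER", "y"), ("RPAREN", ")")]
print(parse(tokens))  # ('+', ('id', 'x'), ('+', ('int', 1), ('id', 'y')))
```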

  12. Semantic analysis • Its job: • Check the meaning of the program • e.g. In x=y, is y defined before being used? Are x and y declared? • e.g. In x=y, are the types of x and y such that you can assign one to the other? • Meaning may depend on context • Report errors
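
The two checks on `x=y` mentioned above can be sketched with a symbol table. The statement encoding (`("declare", name, type)` and `("assign", target, source)`) and the rule that types must match exactly are simplifying assumptions for this example.

```python
def check(statements):
    """Return a list of semantic error messages for a list of statements."""
    symbols = {}  # symbol table: identifier name -> declared type
    errors = []
    for stmt in statements:
        if stmt[0] == "declare":
            _, name, typ = stmt
            if name in symbols:
                errors.append(f"{name} redeclared")
            symbols[name] = typ
        elif stmt[0] == "assign":
            _, target, source = stmt
            missing = [n for n in (target, source) if n not in symbols]
            for n in missing:
                errors.append(f"{n} used before declaration")
            # Only compare types when both sides are known.
            if not missing and symbols[target] != symbols[source]:
                errors.append(
                    f"cannot assign {symbols[source]} to {symbols[target]} "
                    f"in {target} = {source}"
                )
    return errors


program = [
    ("declare", "x", "int"),
    ("declare", "y", "string"),
    ("assign", "x", "y"),
    ("assign", "x", "z"),
]
print(check(program))
# ['cannot assign string to int in x = y', 'z used before declaration']
```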

  13. IRs • Graphical • e.g. parse tree, DAG • Linear • e.g. three-address code • Hybrid • e.g. linear for blocks of straight-line code, a graph to connect blocks • Low-level or high-level
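
To show how a graphical IR relates to a linear one, the following sketch lowers an expression tree to three-address code. The tree encoding (nested tuples `(op, left, right)` with strings as leaves) and the temporary names `t1`, `t2`, ... are assumptions for the example.

```python
import itertools


def to_tac(tree):
    """Flatten an expression tree into three-address instructions."""
    code = []
    counter = itertools.count(1)

    def emit(node):
        if not isinstance(node, tuple):
            return str(node)  # leaf: a variable name or constant
        op, left, right = node
        l, r = emit(left), emit(right)
        temp = f"t{next(counter)}"  # fresh temporary for this result
        code.append(f"{temp} = {l} {op} {r}")
        return temp

    result = emit(tree)
    return code, result


code, result = to_tac(("+", "a", ("*", "b", "c")))
print(code)    # ['t1 = b * c', 't2 = a + t1']
print(result)  # t2
```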

  14. The scanning process • Main goal: recognize words • How? by recognizing patterns • e.g. an identifier is a sequence of letters or digits that starts with a letter. • Lexical patterns form a regular language • Regular languages are described using regular expressions (REs) • Can we create an automatic RE recognizer? • Yes! (Hold that thought)
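
The identifier pattern above can be written directly as a regular expression. This sketch uses Python's `re` module and assumes "letters" and "digits" mean ASCII letters and digits; the helper name `is_identifier` is hypothetical.

```python
import re

# A letter followed by any number of letters or digits.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_identifier(word):
    """True if the whole word matches the identifier pattern."""
    return IDENTIFIER.fullmatch(word) is not None


print(is_identifier("x1"))  # True
print(is_identifier("1x"))  # False
```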

  15. The scanning process • Definition: Regular expressions (over alphabet Σ) • ε is an RE denoting {ε} • If a ∈ Σ, then a is an RE denoting {a} • If r and s are REs, then • (r) is an RE denoting L(r) • r|s is an RE denoting L(r) ∪ L(s) • rs is an RE denoting L(r)L(s) • r* is an RE denoting the Kleene closure of L(r) • Property: REs are closed under many operations • This allows us to build complex REs.

  16. The scanning process • Definition: Deterministic Finite Automaton • a five-tuple (Σ, S, δ, s0, F) where • Σ is the alphabet • S is the set of states • δ is the transition function (S × Σ → S) • s0 is the starting state • F is the set of final states (F ⊆ S) • Notation: • Use a transition diagram to describe a DFA • DFAs are equivalent to REs • Hey! We just came up with a recognizer!
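
A DFA can be run with a simple table lookup. The sketch below encodes a DFA for identifiers (a letter followed by letters or digits). The state names, the use of character classes as the alphabet, and the convention that a missing table entry means "reject" (an implicit dead state) are assumptions for this example.

```python
def char_class(ch):
    """Map a character to an alphabet symbol of the DFA."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"


# Transition function as a table: (state, symbol) -> next state.
DELTA = {
    ("start", "letter"): "ident",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
}
START = "start"
FINAL = {"ident"}


def accepts(text):
    """Run the DFA over text; accept if it ends in a final state."""
    state = START
    for ch in text:
        state = DELTA.get((state, char_class(ch)))
        if state is None:  # no transition defined: implicit dead state
            return False
    return state in FINAL


print(accepts("x1"))  # True
print(accepts("1x"))  # False
```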

  17. The scanning process • Goal: automate the process • Idea: • Start with an RE • Build a DFA • How? • We can build a non-deterministic finite automaton (Thompson's construction) • Convert that to a deterministic one (Subset construction) • Minimize the DFA (Hopcroft's algorithm) • Implement it • Existing scanner generator: flex
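
The first two steps of this pipeline can be sketched as follows: an NFA with ε-moves built in the style of Thompson's construction, then converted with the subset construction. The RE encoding (nested tuples `("char", c)`, `("cat", r, s)`, `("alt", r, s)`, `("star", r)`) and all function names are assumptions for the example; minimization is left out.

```python
import itertools


def thompson(regex):
    """Build an NFA; returns (start, accept, edges). A None label is an epsilon move."""
    ids = itertools.count()
    edges = {}  # state -> list of (symbol or None, target state)

    def new_state():
        s = next(ids)
        edges[s] = []
        return s

    def build(node):
        start, accept = new_state(), new_state()
        kind = node[0]
        if kind == "char":
            edges[start].append((node[1], accept))
        elif kind == "cat":
            s1, a1 = build(node[1])
            s2, a2 = build(node[2])
            edges[start].append((None, s1))
            edges[a1].append((None, s2))
            edges[a2].append((None, accept))
        elif kind == "alt":
            for sub in node[1:]:
                s, a = build(sub)
                edges[start].append((None, s))
                edges[a].append((None, accept))
        elif kind == "star":
            s, a = build(node[1])
            edges[start] += [(None, s), (None, accept)]
            edges[a] += [(None, s), (None, accept)]
        else:
            raise ValueError(f"unknown node kind {kind!r}")
        return start, accept

    start, accept = build(regex)
    return start, accept, edges


def closure(states, edges):
    """All NFA states reachable from `states` using only epsilon moves."""
    stack, seen = list(states), set(states)
    while stack:
        for sym, tgt in edges[stack.pop()]:
            if sym is None and tgt not in seen:
                seen.add(tgt)
                stack.append(tgt)
    return frozenset(seen)


def subset_construction(start, accept, edges):
    """Convert the NFA to a DFA whose states are sets of NFA states."""
    d_start = closure({start}, edges)
    table, worklist = {}, [d_start]
    while worklist:
        current = worklist.pop()
        if current in table:
            continue
        moves = {}
        for s in current:
            for sym, tgt in edges[s]:
                if sym is not None:
                    moves.setdefault(sym, set()).add(tgt)
        table[current] = {sym: closure(t, edges) for sym, t in moves.items()}
        worklist.extend(table[current].values())
    finals = {d for d in table if accept in d}
    return d_start, table, finals


def matches(dfa, text):
    """Run the DFA; a missing transition means reject."""
    state, table, finals = dfa
    for ch in text:
        state = table[state].get(ch)
        if state is None:
            return False
    return state in finals


a, b = ("char", "a"), ("char", "b")
regex = ("cat", ("cat", ("cat", ("star", ("alt", a, b)), a), b), b)  # (a|b)*abb
dfa = subset_construction(*thompson(regex))
print(matches(dfa, "babb"), matches(dfa, "ab"))  # True False
```

In practice a scanner generator such as flex performs these steps from a specification file, so the sketch is only meant to show what happens inside.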
