More yacc

katoka

Presentation Transcript


  1. More yacc

  2. What is yacc • Tool to produce a parser given a grammar • YACC (Yet Another Compiler Compiler) is a program that compiles an LALR(1) grammar and produces the source code of a syntactic analyzer (parser) for the language generated by that grammar • Input is a grammar (rules) and actions to take upon recognizing a rule • Output is a C program and optionally a header file of tokens

  3. Works with lex • Lex is a scanner generator • Input is a description of patterns and actions • Output is a C program containing a function yylex() which, when called, matches patterns against the input and performs the corresponding actions • Typically, the generated scanner performs lexical analysis and produces tokens for the (YACC-generated) parser

  4. Structure of a YACC File • Has the same three-part structure as Lex • The parts are separated by %% lines • The three parts even play the same roles: • definition section • rules section • code section (copied directly into the generated program)

  5. Definition Section • Declare the tokens used in the grammar and the types of the values used on the stack here • Tokens that are single-quoted characters like '=' or '+' need not be declared. • Literal C code can be included in a block in this section using %{ … %}

  6. Declaring Tokens • The tokens that are used in the grammar must be declared • Include lines like the ones below in the definition section:
       %token CHARSTRING INT IDENTIFIER
       %token LPAREN RPAREN
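
A rough idea of what these declarations turn into on the C side: each declared token gets a numeric code in the generated header. The sketch below is hand-written, not real YACC output. The specific numbers and the helper `token_name` are assumptions for illustration. The codes sit above 255 so that they cannot collide with single-character tokens such as '+', which are represented by their own character code.

```c
#include <assert.h>

/* Hand-written stand-in for the token codes a YACC-generated header provides.
   The exact values are an assumption; what matters is that they exceed 255. */
enum token_code {
    CHARSTRING = 257,
    INT,
    IDENTIFIER,
    LPAREN,
    RPAREN
};

/* Hypothetical helper: map a token code to a printable name. */
const char *token_name(int code)
{
    assert(code >= 0);   /* token codes are never negative in this sketch */
    switch (code) {
    case CHARSTRING: return "CHARSTRING";
    case INT:        return "INT";
    case IDENTIFIER: return "IDENTIFIER";
    case LPAREN:     return "LPAREN";
    case RPAREN:     return "RPAREN";
    default:
        /* Codes 1..255 are single characters used directly as tokens. */
        return (code > 0 && code < 256) ? "single character" : "unknown";
    }
}
```

Calling `token_name('+')` lands in the single-character branch, which is why such tokens need no %token line.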

  7. The Rules Section • The rules of the grammar are placed here. • Here is an example of the basic syntax, first as a grammar rule and then as the YACC grammar definition:
       Expr → INTEGER + INTEGER | INTEGER - INTEGER
       expr : INTEGER '+' INTEGER {action}
            | INTEGER '-' INTEGER {action}
            ;

  8. YACC Actions • Similar to Lex, actions can be defined that are performed whenever a production is applied to the stream of tokens. • An action is usually written right after the production it belongs to. • Since every symbol in the grammar has a corresponding value, actions need a way to access those values. • Accessing the YACC stack is the way to do this.

  9. Accessing the Stack • Since YACC generates an LR parser, it pushes the symbols that it reads, along with their values, on a stack until it is ready to reduce. • To access these values in an action, write a dollar sign followed by a number: $n is the value of the n-th symbol on the right-hand side of the production.

  10. Accessing the Stack • $$ refers to the value of the left nonterminal:
       expr : INTEGER '+' INTEGER {$$ = $1 + $3;}
            | INTEGER '-' INTEGER {$$ = $1 - $3;}
            ;
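
The stack access described above can be imitated by hand. The following is only a toy sketch, not what YACC actually generates: the names `value_stack`, `push`, `reduce_add` and `eval_add` are made up for illustration, and a real parser also keeps a parallel stack of states. It shows how $1 and $3 correspond to positions below the top of the stack, and how $$ replaces the right-hand side after the reduction.

```c
#include <assert.h>

#define STACK_MAX 64

/* Toy value stack (assumption: int values only, no state stack). */
static int value_stack[STACK_MAX];
static int top = 0;                 /* number of values currently stacked */

static void push(int v)
{
    assert(top < STACK_MAX);
    value_stack[top++] = v;
}

/* Reduce by  expr : INTEGER '+' INTEGER  {$$ = $1 + $3;}
   The three right-hand-side values are the top three stack entries. */
static void reduce_add(void)
{
    assert(top >= 3);
    int v1 = value_stack[top - 3];  /* $1: first INTEGER  */
    int v3 = value_stack[top - 1];  /* $3: second INTEGER */
    top -= 3;                       /* pop the right-hand side */
    push(v1 + v3);                  /* push $$, the value of expr */
}

/* Hypothetical driver: shift the three symbols of "a + b", then reduce. */
int eval_add(int a, int b)
{
    top = 0;
    push(a);        /* INTEGER with value a */
    push(0);        /* '+', its value is never read */
    push(b);        /* INTEGER with value b */
    reduce_add();
    return value_stack[top - 1];
}
```

After `eval_add(2, 5)` the stack holds a single entry, the value of expr.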

  11. Tokens and values come from lex • (Diagram: the YACC parser, yyparse, calls the LEX scanner, yylex)
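
The calling relationship on this slide can be sketched in plain C. This is an illustration under stated assumptions, not generated code: the token code 257, the canned token arrays and the function `sum_integers` (a stand-in for yyparse) are all hypothetical. The real interface it mirrors is that the parser calls yylex() repeatedly, each call returns one token code and leaves the token's value in yylval, and a return value of 0 signals end of input.

```c
#include <assert.h>

enum { INTEGER = 257 };     /* assumed token code, as a generated header would supply */

int yylval;                 /* value of the most recently returned token */

/* Canned tokens standing for the text "4 + 9"; a real yylex scans characters. */
static const int canned_tokens[] = { INTEGER, '+', INTEGER, 0 };
static const int canned_values[] = { 4, 0, 9, 0 };
static int next_token = 0;

int yylex(void)
{
    assert(next_token >= 0 && next_token < 4);
    int tok = canned_tokens[next_token];
    yylval = canned_values[next_token];
    if (tok != 0)
        next_token++;       /* stay on the end marker once it is reached */
    return tok;
}

/* Hypothetical stand-in for yyparse: pull tokens until yylex returns 0,
   adding up the values of the INTEGER tokens along the way. */
int sum_integers(void)
{
    int tok, sum = 0;
    next_token = 0;
    while ((tok = yylex()) != 0) {
        if (tok == INTEGER)
            sum += yylval;
    }
    return sum;
}
```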

  12. Revisiting Lex • The Lex file has to be modified in two main places to work with the YACC parser. • In the definition section, include this statement: #include "y.tab.h" • That is the header file of tokens that YACC creates when the parser is generated (with the -d option). • The actions for the rules need to be changed too.

  13. Revisiting Lex Actions • For tokens with a value, assign that value to yylval. YACC can read the value from that variable. • Include a return statement for the token name (this is the same name that is declared at the top of the YACC file).
       if            {return IF;}
       [1-9][0-9]*   {yylval = atoi(yytext); return INTEGER;}
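
To make the two actions above concrete, here is a hand-written scanner that behaves roughly like those two Lex rules. It is a sketch, not Lex output: the token codes, the `input` buffer and the helper `set_input` are assumptions for illustration, and it skips only spaces. Characters that match neither rule are returned as their own character code, which is an assumption of this sketch rather than Lex's default behavior.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum { IF = 257, INTEGER };       /* assumed codes; normally from y.tab.h */

int yylval;                       /* shared with the parser */
static const char *input = "";    /* hypothetical input buffer */

void set_input(const char *text)
{
    assert(text != NULL);
    input = text;
}

int yylex(void)
{
    while (*input == ' ')         /* skip blanks between tokens */
        input++;
    if (*input == '\0')
        return 0;                 /* 0 tells the parser the input has ended */

    /* if {return IF;} */
    if (strncmp(input, "if", 2) == 0) {
        input += 2;
        return IF;
    }

    /* [1-9][0-9]* {yylval = atoi(yytext); return INTEGER;} */
    if (*input >= '1' && *input <= '9') {
        yylval = atoi(input);     /* atoi stops at the first non-digit */
        while (isdigit((unsigned char)*input))
            input++;
        return INTEGER;
    }

    return (unsigned char)*input++;   /* any other character is its own token */
}
```

With the input "if 42", successive calls return IF, then INTEGER with yylval set to 42, then 0.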

  14. The %union Declaration • Different tokens have different data types. • An INTEGER is an int, a FLOAT is a float, a CHARACTERSTRING is a char *, and an IDENTIFIER is a pointer to that identifier's entry in the symbol table. • The %union declaration lets the parser apply the right data type to the right token.

  15. The %union Declaration • YACC Definition Section:
       %union { int intValue; float floatValue; }
       %token <intValue> INTEGER
       %token <floatValue> FLOAT
     • Lex Rules Section:
       … {yylval.intValue = atoi(yytext); return INTEGER;}
       … {yylval.floatValue = atof(yytext); return FLOAT;}
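
On the C side, the %union above corresponds roughly to a C union type for yylval. The sketch below writes that union by hand. The token codes and the helper `scan_number`, which mimics the two Lex actions by looking for a decimal point, are hypothetical and only there to make the example self-contained.

```c
#include <assert.h>
#include <stdlib.h>

/* Roughly what  %union { int intValue; float floatValue; }  becomes in C. */
typedef union {
    int   intValue;
    float floatValue;
} YYSTYPE;

enum { INTEGER = 257, FLOAT };    /* assumed token codes */

YYSTYPE yylval;                   /* one variable, member chosen per token */

/* Hypothetical helper mimicking the two Lex actions: store the value in the
   union member that matches the token type, then return the token code. */
int scan_number(const char *yytext)
{
    assert(yytext != NULL);
    for (const char *p = yytext; *p != '\0'; p++) {
        if (*p == '.') {
            yylval.floatValue = (float)atof(yytext);
            return FLOAT;
        }
    }
    yylval.intValue = atoi(yytext);
    return INTEGER;
}
```

Because the members share storage, only the member matching the returned token is meaningful, which is what the <intValue> and <floatValue> tags on the %token lines tell the parser.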
