
Chapter 3 Context-Free Grammars and Parsing

Chapter 3 Context-Free Grammars and Parsing. Gang S. Liu, College of Computer Science & Technology, Harbin Engineering University. Introduction. Parsing is the task of determining the syntax, or structure, of a program. It is also called syntax analysis.


Presentation Transcript


  1. Chapter 3 Context-Free Grammars and Parsing
     Gang S. Liu
     College of Computer Science & Technology, Harbin Engineering University
     Samuel2005@126.com

  2. Introduction
     • Parsing is the task of determining the syntax, or structure, of a program.
     • It is also called syntax analysis.
     • The syntax of a programming language is usually given by the grammar rules of a context-free grammar.
     • The rules of a context-free grammar are recursive.
     • The data structures that represent syntactic structure are also recursive: a parse tree or syntax tree.
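As a rough illustration of why a syntax tree is a recursive data structure, the sketch below defines a tree whose interior nodes contain further trees. The class names Num and BinOp and the helper show are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    """Leaf node: a number token."""
    value: int


@dataclass
class BinOp:
    """Interior node: an operator with two subtrees (the recursive part)."""
    op: str
    left: "Tree"
    right: "Tree"


Tree = Union[Num, BinOp]


def show(tree):
    """Render a tree as a fully parenthesized string by recursing over it."""
    if isinstance(tree, Num):
        return str(tree.value)
    return f"({show(tree.left)} {tree.op} {show(tree.right)})"


# Hypothetical tree for the expression (34 - 3) * 42.
example = BinOp("*", BinOp("-", Num(34), Num(3)), Num(42))
```

Because BinOp holds two Tree values, any function over the tree, such as show, naturally recurses in the same shape.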

  3. The Parsing Process
     • Usually, the sequence of tokens is not an explicit input parameter. Instead, the parser calls a scanner procedure such as getToken to fetch the next token from the input as it is needed during parsing.
     [Diagram: sequence of tokens → parser → syntax tree]
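The on-demand interaction between parser and scanner might look like the following minimal sketch. Only the name getToken comes from the slide (written get_token here); the Scanner class, the token kinds, and the token_kinds driver are assumptions made for illustration, and the ASCII hyphen stands in for the minus sign.

```python
class Scanner:
    """Hands out one token per call; nothing is tokenized in advance."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def get_token(self):
        """Return the next (kind, lexeme) pair, or ("EOF", "") at the end."""
        # Skip whitespace between tokens.
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        if self.pos == len(self.text):
            return ("EOF", "")
        ch = self.text[self.pos]
        if ch.isdigit():
            start = self.pos
            while self.pos < len(self.text) and self.text[self.pos].isdigit():
                self.pos += 1
            return ("number", self.text[start:self.pos])
        if ch in "+-*()":
            self.pos += 1
            return (ch, ch)
        raise SyntaxError(f"unexpected character {ch!r}")


def token_kinds(text):
    """Stand-in for a parser: pulls tokens one at a time until EOF."""
    scanner = Scanner(text)
    kinds = []
    while True:
        kind, _ = scanner.get_token()
        if kind == "EOF":
            return kinds
        kinds.append(kind)
```

A real parser would consume each token as it builds the syntax tree; token_kinds only collects the token kinds to show the calling pattern.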

  4. Context-Free Grammars
     • A context-free grammar is a specification for the syntactic structure of a programming language.
     • A context-free grammar involves recursive rules.
     • Example: integer arithmetic expressions with addition, subtraction, and multiplication operations
         exp → exp op exp | (exp) | number
         op → + | - | *

  5. BNF
         exp → exp op exp | (exp) | number
         op → + | - | *
     • Names are written in italics.
     • | is the metasymbol for choice.
     • Concatenation is used as a standard operation.
     • There is no metasymbol for repetition.
     • → is used to express the definitions of names.
     • Regular expressions are used as components.
     • The notation was developed by John Backus and adapted by Peter Naur.
     • Grammar rules in this form are said to be in Backus-Naur Form, or BNF.

  6. Context-Free Grammar Rules
     • Grammar rules are defined over an alphabet, or set of symbols.
     • The symbols are usually tokens representing strings of characters.
     • A context-free grammar rule consists of:
         • a name for a structure,
         • the metasymbol →,
         • a string of symbols, each of which is either a symbol from the alphabet, a name for a structure, or the metasymbol |.
         exp → exp op exp | (exp) | number
         op → + | - | *

  7. Context-Free Grammar Rules (cont)
     • The rule defines the structure whose name is to the left of the arrow.
     • The structure is defined to consist of one of the choices on the right-hand side, which are separated by vertical bars.
         exp → exp op exp | (exp) | number
         op → + | - | *

  8. Legal String?
     • (34 - 3) * 42 corresponds to the legal string of tokens (number - number) * number
     • (34 - 3 * 2 is not a legal expression: the left parenthesis is never closed.
         exp → exp op exp | (exp) | number
         op → + | - | *
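One way to check legality mechanically is sketched below. It is an assumption-laden illustration, not the chapter's method: the rule exp → exp op exp is left-recursive, so instead of translating it directly, the sketch uses an equivalent description of the same language (an operand, followed by zero or more op-operand pairs, where an operand is a number or a parenthesized exp). The function name is_legal is hypothetical.

```python
def is_legal(tokens):
    """Return True if the list of token kinds forms a legal exp."""
    pos = 0

    def operand():
        # operand: number | ( exp )
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "number":
            pos += 1
            return True
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1
            if not exp():
                return False
            if pos < len(tokens) and tokens[pos] == ")":
                pos += 1
                return True
        return False

    def exp():
        # exp: operand, then any number of (op operand) pairs
        nonlocal pos
        if not operand():
            return False
        while pos < len(tokens) and tokens[pos] in ("+", "-", "*"):
            pos += 1
            if not operand():
                return False
        return True

    # Legal only if an exp was found and every token was consumed.
    return exp() and pos == len(tokens)


legal = ["(", "number", "-", "number", ")", "*", "number"]   # (34 - 3) * 42
illegal = ["(", "number", "-", "number", "*", "number"]      # (34 - 3 * 2
```

Here is_legal(legal) succeeds, while is_legal(illegal) fails because no right parenthesis follows the inner expression.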

  9. Derivations
     • Grammar rules determine the legal strings of tokens by means of derivations.
     • A derivation is a sequence of replacements of structure names by choices on the right-hand sides of grammar rules.
     • A derivation begins with a single structure name and ends with a string of token symbols.
         exp → exp op exp | (exp) | number
         op → + | - | *

  10. Example of Derivation
         exp → exp op exp | (exp) | number
         op → + | - | *
      Derivation for (34 - 3) * 42:
         exp => exp op exp
             => exp op number
             => exp * number
             => (exp) * number
             => (exp op exp) * number
             => (exp op number) * number
             => (exp - number) * number
             => (number - number) * number
      The derived token string (number - number) * number corresponds to the input (34 - 3) * 42.
      • Grammar rules define structures, using →.
      • Derivation steps construct strings by replacement, using =>.
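The derivation on this slide can be replayed step by step, as in the sketch below. The dictionary encoding of the grammar, the derive function, and the (index, choice) step format are assumptions chosen for illustration; each step names the position of the structure name to replace and the right-hand-side choice to replace it with.

```python
# Each structure name maps to its list of choices; a choice is a list of symbols.
GRAMMAR = {
    "exp": [["exp", "op", "exp"], ["(", "exp", ")"], ["number"]],
    "op": [["+"], ["-"], ["*"]],
}


def derive(start, steps):
    """Apply replacement steps to [start]; return every intermediate string."""
    form = [start]
    history = [" ".join(form)]
    for index, choice in steps:
        name = form[index]
        if name not in GRAMMAR:
            raise ValueError(f"{name!r} is not a structure name")
        if choice not in GRAMMAR[name]:
            raise ValueError(f"{choice} is not a choice for {name!r}")
        form[index:index + 1] = choice  # replace one name by one choice
        history.append(" ".join(form))
    return history


# The eight replacement steps shown on the slide, in the same order.
STEPS = [
    (0, ["exp", "op", "exp"]),
    (2, ["number"]),
    (1, ["*"]),
    (0, ["(", "exp", ")"]),
    (1, ["exp", "op", "exp"]),
    (3, ["number"]),
    (2, ["-"]),
    (1, ["number"]),
]

history = derive("exp", STEPS)
```

The list history holds the starting name plus one entry per step, and derive rejects any step that does not correspond to a grammar rule.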

  11. Terminology
     • The set of all strings of token symbols obtained by derivations from the start symbol is the language defined by the grammar.
     • Grammar rules are called productions because they produce the legal strings of the language via derivations.
     • The structure name defined by the first rule is called the start symbol.
     • Structure names are called nonterminals: they must be replaced, so they do not terminate a derivation.
     • Symbols in the alphabet are called terminals: they terminate a derivation.
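These terms can be made concrete with a small sketch that reads them off a grammar stored as data. The list-of-pairs encoding and the function names are assumptions for illustration; the grammar is the expression grammar used throughout the chapter.

```python
# Productions in the order they are written; each pairs a structure name
# with its list of choices.
PRODUCTIONS = [
    ("exp", [["exp", "op", "exp"], ["(", "exp", ")"], ["number"]]),
    ("op", [["+"], ["-"], ["*"]]),
]


def start_symbol(productions):
    """The structure name defined by the first rule."""
    return productions[0][0]


def nonterminals(productions):
    """Every name that appears on the left of an arrow."""
    return {name for name, _ in productions}


def terminals(productions):
    """Every right-hand-side symbol that is not a structure name."""
    names = nonterminals(productions)
    return {
        symbol
        for _, choices in productions
        for choice in choices
        for symbol in choice
        if symbol not in names
    }
```

For this grammar the start symbol is exp, the nonterminals are exp and op, and the terminals are the two parentheses, number, and the three operators.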

  12. Example 3.1
     • Let G be a grammar defined by the rule E → ( E ) | a
     • This grammar has one nonterminal, E, and three terminals: (, ), and a.
     • This grammar generates the language L(G) = { a, (a), ((a)), (((a))), … }
     • Derivation for ((a)):
         E => (E)
           => ((E))
           => ((a))
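Because the rule E → ( E ) | a is recursive but not left-recursive, it translates almost directly into a recursive membership test. The sketch below is illustrative only; the function names in_language and first_strings are hypothetical.

```python
def in_language(s):
    """True if s can be derived from E using E -> ( E ) | a."""
    if s == "a":                       # choice: a
        return True
    # choice: ( E ), so strip one pair of parentheses and recurse
    return len(s) >= 3 and s[0] == "(" and s[-1] == ")" and in_language(s[1:-1])


def first_strings(n):
    """The n shortest strings of L(G): a wrapped in k pairs of parentheses."""
    return ["(" * k + "a" + ")" * k for k in range(n)]
```

Each recursive call mirrors one derivation step E => (E), and the base case mirrors the final step E => a.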

  13. Example 2.3
     • Σ = {a, b}
     • Consider the set of strings consisting of a single b surrounded by the same number of a’s: S = {b, aba, aabaa, aaabaaa, …}
     • The regular expression a*ba* does not work: it does not require the same number of a’s on each side of the b.
     • This set of strings cannot be described by any regular expression.
     • This can be proved using a famous theorem called the pumping lemma.
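The gap between a*ba* and the set S can be seen with a quick experiment. The sketch below uses Python's standard re module; the helper in_S is a hypothetical direct check of membership in S, written without regular expressions. It shows one counterexample and is not a proof of the general claim.

```python
import re

# The candidate regular expression from the slide.
LOOSE = re.compile(r"a*ba*")


def in_S(s):
    """True if s is a single b surrounded by equally many a's on each side."""
    n = len(s) // 2
    return s == "a" * n + "b" + "a" * n


def accepted_by_regex(s):
    """True if the whole string matches a*ba*."""
    return LOOSE.fullmatch(s) is not None
```

Every string in S matches a*ba*, but so does a string such as aab, which is not in S because the counts of a’s on the two sides differ.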
