CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 20– Parsing)

Pushpak Bhattacharyya, CSE Dept., IIT Bombay. 28th Feb, 2011.


Presentation Transcript


  1. CS460/626: Natural Language Processing/Speech, NLP and the Web (Lecture 20: Parsing) • Pushpak Bhattacharyya, CSE Dept., IIT Bombay • 28th Feb, 2011

  2. Need for Parsing • Sentences are linear structures, on the face of it • Is that the right view? • Is there a hierarchy (a tree) hidden behind the linear structure? • Is there a principle in branching? • What are the constituents, and when should a constituent give rise to children? • What is the hierarchy-building principle?

  3. Deeper trees needed for capturing sentence structure • Flat tree: [NP The [AP big] book [PP of poems] [PP with the blue cover]] • This won't do! • [The big book of poems with the blue cover] is on the table.

  4. PPs are at the same level: flat with respect to the head word “book” • No distinction in terms of dominance or c-command • [NP The [AP big] book [PP of poems] [PP with the blue cover]] • [The big book of poems with the blue cover] is on the table.

  5. “Constituency test of Replacement” runs into problems • One-replacement: • I bought the big [book of poems with the blue cover], not the small [one] • Here one-replacement targets “book of poems with the blue cover” • Another one-replacement: • I bought the big [book of poems] with the blue cover, not the small [one] with the red cover • Here one-replacement targets “book of poems”

  6. More deeply embedded structure • [NP The [N'1 [AP big] [N'2 [N'3 [N book] [PP of poems]] [PP with the blue cover]]]]
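The embedded structure can be written down as a nested data structure, which makes it easy to check which strings the intermediate N' nodes dominate. The following is a minimal sketch, assuming a tuple encoding `(label, child, ...)` with plain strings as leaves; the names `TREE`, `words`, and `constituents` are illustrative, and the PPs are left internally flat for brevity.

```python
# Tree for "the big book of poems with the blue cover".
# Each node is (label, child, ...); leaves are plain strings.
# The N' numbering follows the slide; the flat PPs are a simplification.
TREE = ("NP",
        "the",
        ("N'1",
         ("AP", "big"),
         ("N'2",
          ("N'3", ("N", "book"), ("PP", "of", "poems")),
          ("PP", "with", "the", "blue", "cover"))))


def words(node):
    """Return the list of words dominated by a node."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node[1:]:
        out.extend(words(child))
    return out


def constituents(node, prefix="N'"):
    """Yield (label, phrase) for every node whose label starts with prefix."""
    if isinstance(node, str):
        return
    if node[0].startswith(prefix):
        yield node[0], " ".join(words(node))
    for child in node[1:]:
        yield from constituents(child, prefix)


for label, phrase in constituents(TREE):
    print(label, "->", phrase)
```

In this encoding N'2 and N'3 dominate exactly the two strings that one-replacement targets on the previous slide, which the flat tree cannot distinguish.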

  7. To target N'1 • I want [NP this [N' big book of poems with the red cover]] and not [NP that [N' one]]

  8. Other languages • English: [NP The [AP big] book [PP of poems] [PP with the blue cover]] • Hindi: NP with AP “badii”, PP “niil jilda vaalii”, PP “kavita kii” and head “kitaab” • [niil jilda vaalii kavita kii kitaab]

  9. Other languages (contd.) • English: [NP The [AP big] book [PP of poems] [PP with the blue cover]] • Bengali: NP with AP “motaa”, PP “niil malaat deovaa”, PP “kavitar”, head “bai” and “ti” • [niil malaat deovaa kavitar bai ti]

  10. Grammar and Parsing Algorithms

  11. A simplified grammar • S -> NP VP • NP -> DT N | N • VP -> V ADV | V

  12. A segment of English Grammar • S' -> (C) S • S -> {NP/S'} VP • VP -> (AP+) (VAUX) V (AP+) ({NP/S'}) (AP+) (PP+) (AP+) • NP -> (D) (AP+) N (PP+) • PP -> P NP • AP -> (AP) A

  13. Example Sentence • People laugh • Positions: 1 People 2 laugh 3 • Lexicon: People - N, V; Laugh - N, V • The lexicon entry indicates that both Noun and Verb are possible for the word “People”
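The position numbering and the ambiguous lexicon can be sketched in a few lines. This is an illustrative sketch only: the names `LEXICON`, `positions`, and `tag_sequences` are assumptions, not part of the lecture.

```python
from itertools import product

# Lexicon from the slide: each word may be a noun (N) or a verb (V).
LEXICON = {"people": ["N", "V"], "laugh": ["N", "V"]}


def positions(sentence):
    """Return (word, start, end) triples, numbering the gaps between words from 1."""
    tokens = sentence.lower().split()
    return [(word, i + 1, i + 2) for i, word in enumerate(tokens)]


def tag_sequences(sentence):
    """Return every tag sequence the lexicon allows for the sentence."""
    tokens = sentence.lower().split()
    return list(product(*(LEXICON[word] for word in tokens)))


print(positions("People laugh"))
print(tag_sequences("People laugh"))
```

With two tags for each of the two words there are four candidate tag sequences, and it is the grammar that decides which of them yields a sentence.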

  14. Top-Down Parsing
      State                 Backup State     Action
      ------------------------------------------------------------
      1.  ((S) 1)           -                -
      2.  ((NP VP) 1)       -                -
      3a. ((DT N VP) 1)     ((N VP) 1)       -
      3b. ((N VP) 1)        -                -
      4.  ((VP) 2)          -                Consume “People”
      5a. ((V ADV) 2)       ((V) 2)          -
      6.  ((ADV) 3)         ((V) 2)          Consume “laugh”
      5b. ((V) 2)           -                -
      6.  ((.) 3)           -                Consume “laugh”
      Termination condition: all inputs over, no symbols remaining.
      Note: input symbols can be pushed back. The number in each state is the position of the input pointer.
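The trace above can be reproduced with a small depth-first recognizer that keeps a stack of backup states. This is a minimal sketch under stated assumptions: the function name `top_down_parse`, the dictionary encoding of the grammar, and the optional `trace` list are illustrative choices, not the lecture's own code.

```python
# Simplified grammar and lexicon from the slides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["DT", "N"], ["N"]],
    "VP": [["V", "ADV"], ["V"]],
}
LEXICON = {"people": {"N", "V"}, "laugh": {"N", "V"}}


def top_down_parse(words, start="S", trace=None):
    """Return True if `words` can be derived from `start`.

    A state is (remaining symbols, input position); positions start at 1.
    Untried alternatives are kept as backup states and used on failure.
    """
    words = [w.lower() for w in words]
    current = ([start], 1)
    backups = []
    while True:
        symbols, pos = current
        if trace is not None:
            trace.append((tuple(symbols), pos))
        # Termination: no symbols remaining and all input consumed.
        if not symbols and pos == len(words) + 1:
            return True
        alternatives = []
        if symbols:
            first, rest = symbols[0], symbols[1:]
            if first in GRAMMAR:
                # Expand the leftmost non-terminal, in rule order.
                alternatives = [(rhs + rest, pos) for rhs in GRAMMAR[first]]
            elif pos <= len(words) and first in LEXICON.get(words[pos - 1], ()):
                # Consume one word whose category matches.
                alternatives = [(rest, pos + 1)]
        if alternatives:
            current = alternatives[0]
            backups = alternatives[1:] + backups  # newest backups first
        elif backups:
            current = backups.pop(0)  # backtrack; the input pointer is restored
        else:
            return False


states = []
print(top_down_parse(["People", "laugh"], trace=states))
for symbols, pos in states:
    print(symbols, pos)
```

Restoring the position stored in a backup state is what the slide calls pushing input symbols back. The sketch has no guard against left-recursive rules, so it would loop forever on a grammar that contains one.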

  15. Discussion for Top-Down Parsing • This kind of searching is goal driven. • Gives importance to textual precedence (rule precedence). • No regard for data, a priori (useless expansions made).

  16. Bottom-Up Parsing • Some conventions: • N12: the subscripts represent positions (N spans positions 1 to 2) • S1? -> NP12 ° VP2?: work to the left of ° is done, while work to the right of ° is remaining • ?: end position unknown

  17. Bottom-Up Parsing (pictorial representation) • People laugh, positions 1 2 3 • Lexical items: N12, V12, N23, V23 • NP12 -> N12 ° • NP23 -> N23 ° • VP12 -> V12 ° • VP23 -> V23 ° • S1? -> NP12 ° VP2? • S -> NP12 VP23 °
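The bottom-up picture can be sketched as a set of completed constituents `(label, start, end)` that grows until nothing new can be added. This is a minimal sketch, assuming a simple fixed-point loop rather than any particular published algorithm; the names `spans` and `bottom_up` are illustrative.

```python
# Simplified grammar as (lhs, rhs) pairs, and the lexicon from the slides.
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("DT", "N")),
    ("NP", ("N",)),
    ("VP", ("V", "ADV")),
    ("VP", ("V",)),
]
LEXICON = {"people": {"N", "V"}, "laugh": {"N", "V"}}


def spans(rhs, start, chart):
    """Yield every end position at which `rhs` can be matched beginning at `start`."""
    if not rhs:
        yield start
        return
    for label, s, e in list(chart):  # snapshot, since the caller adds to chart
        if label == rhs[0] and s == start:
            yield from spans(rhs[1:], e, chart)


def bottom_up(words):
    """Return the set of completed constituents (label, start, end)."""
    words = [w.lower() for w in words]
    chart = set()
    # Start from the data: every lexical category of every word.
    for i, word in enumerate(words, start=1):
        for tag in LEXICON.get(word, ()):
            chart.add((tag, i, i + 1))
    # Keep applying rules until no new constituent appears.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for start in range(1, len(words) + 1):
                for end in spans(rhs, start, chart):
                    item = (lhs, start, end)
                    if item not in chart:
                        chart.add(item)
                        changed = True
    return chart


for item in sorted(bottom_up(["People", "laugh"])):
    print(item)
```

The sentence is accepted when the chart contains an S spanning the whole input, here `("S", 1, 3)`. Constituents such as `("NP", 2, 3)` are also built even though no complete parse uses them, which is the data-driven counterpart of the useless expansions made by the top-down search.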

  18. Problem with Top-Down Parsing • Left Recursion • Suppose the grammar has a rule A -> AB. Then the expansion never consumes any input and never terminates: • ((A)K) -> ((AB)K) -> ((ABB)K) ……..
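The runaway expansion is easy to reproduce if the number of steps is capped. This is an illustrative sketch: the function `expand_leftmost`, the step limit, and the terminating alternative `A -> a` are assumptions added for the example.

```python
def expand_leftmost(symbols, grammar, steps):
    """Apply the first rule for the leftmost symbol up to `steps` times.

    Returns the list of states visited, starting with the initial one.
    """
    history = [list(symbols)]
    for _ in range(steps):
        first, rest = symbols[0], symbols[1:]
        if first not in grammar:
            break  # leftmost symbol is a terminal; nothing to expand
        symbols = grammar[first][0] + rest
        history.append(list(symbols))
    return history


# A -> A B is tried first; A -> a is a hypothetical alternative that is never reached.
LEFT_RECURSIVE = {"A": [["A", "B"], ["a"]]}

for state in expand_leftmost(["A"], LEFT_RECURSIVE, 3):
    print(" ".join(state))
```

Each step adds another B while the leftmost symbol stays A, so the input pointer K never moves. Without the cap the loop would not end.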

  19. Combining top-down and bottom-up strategies

  20. Top-Down Bottom-Up Chart Parsing • Combines advantages of top-down & bottom-up parsing. • Does not work in case of left recursion. • e.g. “People laugh” • People: noun, verb • Laugh: noun, verb • Grammar: S -> NP VP • NP -> DT N | N • VP -> V ADV | V

  21. Transitive Closure • People laugh, positions 1 2 3 • Arcs: S -> NP VP • NP -> N • VP -> V • NP -> DT N • S -> NP VP • S -> NP VP • NP -> N • VP -> V ADV • VP -> V • success

  22. Arcs in Parsing • Each arc represents a chart which records • Completed work (left of °) • Expected work (right of °)

  23. Example • People laugh loudly, positions 1 2 3 4 • Arcs: S -> NP VP • NP -> N • VP -> V • VP -> V ADV • NP -> DT N • S -> NP VP • VP -> V ADV • S -> NP VP • NP -> N • VP -> V ADV • S -> NP VP • VP -> V
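Arcs that record completed and expected work can be sketched with an Earley-style recognizer, in which each arc is a rule plus a dot position and a start position. This is a swapped-in, well-known technique used for illustration; it may differ in detail from the procedure drawn on the slides. The names `chart_parse` and `add` are illustrative, the lexicon entry for “loudly” is an assumption, and positions are 0-based here, so add 1 to get the slide numbering.

```python
# Simplified grammar from the slides; "loudly" as ADV is an assumed lexicon entry.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("DT", "N"), ("N",)],
    "VP": [("V", "ADV"), ("V",)],
}
LEXICON = {"people": {"N", "V"}, "laugh": {"N", "V"}, "loudly": {"ADV"}}


def chart_parse(words, start="S"):
    """Earley-style recognizer. Returns (accepted, chart).

    chart[k] holds arcs (lhs, rhs, dot, origin) that end at position k.
    Symbols before `dot` are completed work; symbols after it are expected work.
    """
    words = [w.lower() for w in words]
    n = len(words)
    chart = [[] for _ in range(n + 1)]

    def add(k, arc):
        if arc not in chart[k]:
            chart[k].append(arc)

    for rhs in GRAMMAR[start]:
        add(0, (start, rhs, 0, 0))

    for k in range(n + 1):
        i = 0
        while i < len(chart[k]):  # chart[k] may grow while it is processed
            lhs, rhs, dot, origin = chart[k][i]
            i += 1
            if dot < len(rhs):
                expected = rhs[dot]
                if expected in GRAMMAR:
                    # Top-down step: predict rules for the expected symbol.
                    for alt in GRAMMAR[expected]:
                        add(k, (expected, alt, 0, k))
                elif k < n and expected in LEXICON.get(words[k], ()):
                    # Data-driven step: the next word has the expected category.
                    add(k + 1, (lhs, rhs, dot + 1, origin))
            else:
                # Arc complete: advance every arc that was waiting for lhs.
                for plhs, prhs, pdot, porigin in list(chart[origin]):
                    if pdot < len(prhs) and prhs[pdot] == lhs:
                        add(k, (plhs, prhs, pdot + 1, porigin))

    accepted = any(
        lhs == start and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in chart[n]
    )
    return accepted, chart


ok, chart = chart_parse(["People", "laugh", "loudly"])
print(ok)
for k, arcs in enumerate(chart):
    for lhs, rhs, dot, origin in arcs:
        print(k, lhs, "->", " ".join(rhs[:dot]), "°", " ".join(rhs[dot:]), "from", origin)
```

Because arcs are stored once per position and never re-derived, work shared between alternatives such as VP -> V and VP -> V ADV is done only once, unlike in the backtracking top-down search.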
