
Lambert Schomaker


Presentation Transcript


  1. KI2 – 5 Grammar inference Lambert Schomaker Kunstmatige Intelligentie / RuG

  2. Grammar inference (GI)
  • methods aimed at uncovering the grammar which underlies an observed sequence of tokens
  • two variants:
    • explicit, formal GI: deterministic token generators
    • implicit, statistical GI: stochastic token generators

  3. Grammar inference
  AABBCCAA..(?).. what’s next?
  ABA → 1A 1B 1A
  AABBAA → 2A 2B 2A
  or → AAB (+ mirror-symmetric) → (2A B) (+ mirrored)
  • operations: repetition, mirroring, insertion, substitution

  4. Strings of tokens
  • DNA: ACTGAGGACCTGAC…
  • output of speech recognizers
  • words from an unknown language
  • tokenized patterns in the real world

  5. Strings of tokens (same bullets as slide 4, with a figure: a real-world pattern tokenized into the sequence A B A)

  6. Strings of tokens (same bullets, with the figure now annotated: A B A → Symm(B,A))

  7. GI
  • induction of structural patterns from observed data
  • representation by a formal grammar
  versus:
  • emulating the underlying grammar without making the rules explicit (NN, HMM)

  8. GI, the engine (diagram): Data → [Grammar Induction] → grammatical rules
  • aaabbb → (seq (repeat 3 a) (repeat 3 b))
  • ab → (seq a b)
  • abccba → (symmetry (repeat 2 c) (seq a b))
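As a concrete illustration of the Data → rules arrow above (not part of the original slides), here is a minimal Python sketch that induces only the simplest rule type, repetition, and emits it in the (seq (repeat n x) …) notation shown on the slide; symmetry detection is omitted, and the function name induce_repeats is invented for this example.

```python
from itertools import groupby

def induce_repeats(tokens):
    """Describe a token string as (seq (repeat n x) ...), covering only
    run-length structure (no symmetry, insertion or substitution)."""
    parts = []
    for token, run in groupby(tokens):
        n = len(list(run))
        parts.append(token if n == 1 else f"(repeat {n} {token})")
    return f"(seq {' '.join(parts)})"

print(induce_repeats("aaabbb"))  # (seq (repeat 3 a) (repeat 3 b))
print(induce_repeats("ab"))      # (seq a b)
```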

  9. The hypothesis behind GI (diagram): a generator process G0 produces the Data (aaabbb, ab, abccba); Grammar Induction turns the Data into G'. Find G' ≈ G0.

  10. The hypothesis behind GI (same diagram): find G' ≈ G0. It is not claimed that G0 actually ‘exists’.

  11. Learning
  • Until now it was implicitly assumed that the data consists of positive examples
  • A very large amount of data is needed to induce an underlying grammar
  • It is difficult to find a good approximation to G0 if there are no negative examples: e.g. “aaxybb does NOT belong to the grammar”

  12. Learning… (diagram): convergence G0 = G* is assumed for infinite N
  sample1 → G1
  sample2 → G1+2
  sample3 → G1+2+3
  …
  sampleN → G*

  13. Learning… (Convergence G0 = G* is assumed for infinite N.) More realistic: a PAC, probably approximately correct, G* (same diagram as the previous slide).

  14. PAC GI
  L(G0): the language generated by G0
  L(G*): the language explained by G*
  P[ p(L(G0) ⊕ L(G*)) < ε ] > (1 − δ)

  15. PAC GI
  L(G0): the language generated by G0
  L(G*): the language explained by G*
  The probability that “the probability of finding elements in L(G0) xor L(G*) is smaller than ε” will be larger than 1 − δ:
  P[ p(L(G0) ⊕ L(G*)) < ε ] > (1 − δ)
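The p(·) in this criterion is the probability mass, under the distribution that produces the data, of strings on which the two languages disagree. As a toy Monte-Carlo illustration (not from the slides: the two example languages, the uniform string distribution and all names below are invented), that quantity can be estimated by sampling:

```python
import random
import re

# Stand-ins for the 'true' generator G0 and the induced grammar G*:
# membership tests for two (arbitrarily chosen) regular languages.
in_L0    = lambda s: re.fullmatch(r"ab*a", s) is not None        # L(G0)
in_Lstar = lambda s: re.fullmatch(r"ab{0,3}a", s) is not None    # L(G*)

rng = random.Random(0)

def sample_string(max_len=8):
    """Draw a string over {a, b} with a random length from 0 to max_len."""
    return "".join(rng.choice("ab") for _ in range(rng.randrange(max_len + 1)))

# Estimate p(L(G0) xor L(G*)): the fraction of sampled strings on which
# the two membership tests disagree.
N = 100_000
disagree = sum(in_L0(s) != in_Lstar(s) for s in (sample_string() for _ in range(N)))
print(disagree / N)   # small value: G* approximates G0 well under this distribution
```

In PAC terms, ε bounds this disagreement probability and 1 − δ is the confidence with which the induction procedure must achieve it.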

  16.–19. Example
  • S+ = { aa, aba, abba, abbba }
  (diagrams: a finite-state acceptor for S+ is built up incrementally, one sample string at a time)
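The acceptor drawings did not survive the transcript; as a rough stand-in (not the slides' own construction), the sketch below builds a prefix-tree acceptor for the same positive sample S+ and tests membership. The generalization step that grammar induction would add, e.g. merging states so that every string of the form a b…b a is accepted, is deliberately left out; the helper names are invented.

```python
def build_prefix_tree(samples):
    """Prefix-tree acceptor: each state is a dict of outgoing transitions;
    the states reached at the end of a sample string are accepting."""
    root, accepting = {}, set()
    for word in samples:
        state = root
        for token in word:
            state = state.setdefault(token, {})   # follow or create the transition
        accepting.add(id(state))
    return root, accepting

def accepts(acceptor, word):
    root, accepting = acceptor
    state = root
    for token in word:
        if token not in state:
            return False
        state = state[token]
    return id(state) in accepting

pta = build_prefix_tree(["aa", "aba", "abba", "abbba"])
print(accepts(pta, "abba"))    # True: in S+
print(accepts(pta, "abbbba"))  # False: accepting it would require generalization (state merging)
```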

  20. Many GI approaches are known (Dupont, 1997)

  21. Second group: Grammar Emulation
  • statistical methods, aiming at producing token sequences with the same statistical properties as the generator grammar G0
  • 1: recurrent neural networks
  • 2: Markov models
  • 3: hidden Markov models
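As a minimal sketch of the statistical route (variant 2 above in its simplest, first-order form; this is illustrative code, not material from the course), a Markov model estimated from a training sequence already reproduces the bigram statistics of its generator:

```python
import random
from collections import defaultdict, Counter

def train_bigram(sequence):
    """Count first-order transitions: how often each token follows each other token."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(sequence, sequence[1:]):
        counts[cur][nxt] += 1
    return counts

def emulate(counts, start, length, rng=random.Random(0)):
    """Generate a sequence by repeatedly sampling the next token from the learned transitions."""
    out = [start]
    for _ in range(length - 1):
        options = counts.get(out[-1])
        if not options:
            break
        tokens, weights = zip(*options.items())
        out.append(rng.choices(tokens, weights=weights)[0])
    return "".join(out)

counts = train_bigram("aaabbbaaabbbaaabbb")
print(emulate(counts, "a", 12))   # a string with run statistics similar to the training data
```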

  22. Grammar emulation, training (figure): a context window slides over the token sequence ABGBABGACTVYAB … <x>; the grammar emulator must predict the next token x from the window.

  23. Recurrent neural networks for grammar emulation
  • Major types:
    • Jordan (output-layer recurrence)
    • Elman (hidden-layer recurrence)

  24. Jordan MLPs
  • Assumption: the current state is represented by the output-unit activation at the previous time step(s) and by the current input
  (figure: input and state units feed the network; the output at time t is fed back as the state)

  25. Elman MLPs
  • Assumption: the current state is represented by the hidden-unit activation at the previous time step(s) and by the current input
  (figure: input and state units feed the network; the hidden activation at time t is fed back as the state)
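A minimal numpy sketch of the Elman recurrence described above (forward pass only, with untrained random weights; all layer sizes and names are chosen for illustration). A Jordan network would differ only in feeding back the output vector instead of the hidden vector h.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 8, 4                          # one-hot tokens in, token scores out
W_ih = rng.normal(scale=0.1, size=(n_hid, n_in))      # input  -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))     # state (previous hidden) -> hidden
W_ho = rng.normal(scale=0.1, size=(n_out, n_hid))     # hidden -> output

def elman_forward(inputs):
    """Elman MLP: the hidden activation of the previous time step acts as the
    'state' that is combined with the current input at every step."""
    h = np.zeros(n_hid)
    outputs = []
    for x in inputs:
        h = np.tanh(W_ih @ x + W_hh @ h)   # current input + previous hidden state
        outputs.append(W_ho @ h)           # scores for the next token
    return outputs

# toy sequence of one-hot coded tokens (A, A, B, B, A)
sequence = [np.eye(n_in)[i] for i in (0, 0, 1, 1, 0)]
print(elman_forward(sequence)[-1].round(3))
```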

  26. Markov variants
  • Shannon: a fixed 5-letter window over English text to predict the next letter
  • Variable-Length Markov Models (VLMM) (Guyon & Pereira): the width of the context window used to predict the next token is variable and depends on the statistics of the sequence
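A rough sketch of the variable-length idea (an illustration of the principle only, not Guyon & Pereira's algorithm; the pruning threshold and function names are invented): keep counts for contexts of every length up to a maximum, prune rarely seen contexts, and predict from the longest context that survives.

```python
from collections import defaultdict, Counter

def train_contexts(text, max_len=5, min_count=2):
    """Count next-letter frequencies for every context of 0..max_len letters,
    then keep only contexts seen at least min_count times."""
    table = defaultdict(Counter)
    for i in range(len(text)):
        for k in range(max_len + 1):
            if i - k < 0:
                break
            table[text[i - k:i]][text[i]] += 1
    return {ctx: c for ctx, c in table.items() if sum(c.values()) >= min_count}

def predict(table, history, max_len=5):
    """Back off to the longest suffix of the history that is a known context."""
    for k in range(min(max_len, len(history)), -1, -1):
        ctx = history[len(history) - k:]
        if ctx in table:
            return table[ctx].most_common(1)[0][0]
    return None

table = train_contexts("the cat sat on the mat ")
print(predict(table, "cat sat on th"))   # 'e': backs off to the longest reliable context, "th"
```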

  27. Results
  • Example output of a letter VLMM, trained on news item texts (250 MB training set):
  “liferator member of flight since N. a report the managical including from C all N months after dispute. C and declaracter leaders first to do a lot of though a ground out and C C pairs due to each planner of the lux said the C nailed by the defender begin about in N. the spokesman standards of the arms responded victory the side honored by the accustomers was arrest two mentalisting the romatory accustomers of ethnic C C the procedure.”
