Learning to Transform Natural to Formal Language Rohit J. Kate, Yuk Wah Wong, and Raymond J. Mooney Presented by Ping Zhang
Overview • Background • SILT • CLANG and GEOQUERY • Semantic Parsing using Transformation rules • String-based learning • Tree-based learning • Experiments • Future work • Conclusion
Natural Language Processing (NLP) • Natural Language—human language, e.g., English • Why process NL: • To provide a much more user-friendly interface • Problems: • NL is too complex. • NL has many ambiguities. • To date, NL cannot be used to program a computer.
Classification of Language • Traditional classification (Chomsky hierarchy): • Regular grammar • Context-free grammar—Formal Language • Context-sensitive grammar • Unrestricted grammar—Natural Language • All current programming languages are less expressive than full context-sensitive languages. • For example, C++ is a restricted context-sensitive language.
An Approach to Process NL • Map a natural language to a formal query or command language. • Therefore, NL interfaces to complex computing and AI systems can be more easily developed. English —(map)→ Formal Language —(compiler / interpreter)→ execution
Grammar Terms • Grammar: G = (N, T, S, P) • N: finite set of Non-terminal symbols • T: finite set of Terminal symbols • S: Starting non-terminal symbol, S∈N • P: finite set of productions • Production: x->y • For example, • Noun -> “computer” • AssignmentStatement -> i := 10; • Statements -> Statement; Statements
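The 4-tuple G = (N, T, S, P) above can be written down as plain data. A minimal sketch, with illustrative names (not from the paper, and not tied to any particular parser):

```python
# Sketch of a grammar G = (N, T, S, P) as plain Python data.
from dataclasses import dataclass

@dataclass(frozen=True)
class Production:
    lhs: str     # a single non-terminal, e.g. "Noun"
    rhs: tuple   # sequence of terminals / non-terminals

grammar = {
    "N": {"Noun", "Statements", "Statement"},          # non-terminals
    "T": {"computer", ";"},                            # terminals
    "S": "Statements",                                 # start symbol
    "P": {
        Production("Noun", ("computer",)),
        Production("Statements", ("Statement", ";", "Statements")),
    },
}

assert grammar["S"] in grammar["N"]   # S ∈ N, as the definition requires
```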
SILT • SILT—Semantic Interpretation by Learning Transformations • Transformation rules map substrings of NL sentences, or subtrees of their corresponding syntactic parse trees, to subtrees of the formal-language parse tree. • SILT learns transformation rules from training data—pairs of NL sentences and manually translated formal-language statements. • Two target formal languages: • CLANG • GEOQUERY
CLANG • A formal language for coaching robotic soccer agents in the RoboCup Coach Competition. • The CLANG grammar consists of 37 non-terminals and 133 productions. • All tactics and behaviors are expressed as if-then rules. • An example: • ( (bpos (penalty-area our) ) (do (player-except our {4} ) (pos (half our) ) ) ) • “If the ball is in our penalty area, all our players except player 4 should stay in our half.”
GEOQUERY • A database query language for a small database of U.S. geography. • The database contains about 800 facts. • Based on Prolog, augmented with meta-predicates. • An example: • answer(A, count(B, (city(B), loc(B, C), const(C, countryid(usa) ) ), A) ) • “How many cities are there in the US?”
Two methods • String-based transformation learning • Directly maps strings of NL sentences to parse trees of the formal language. • Tree-based transformation learning • Maps subtrees of NL syntactic parse trees to subtrees of formal-language parse trees. • Assumes a syntactic parser (and parse trees) for the NL sentences is provided.
Semantic Parsing • Pattern matching • Patterns found in NL <-> Templates based on productions • NL phrases <-> Formal expressions • Rule representation for the two methods: “TEAM UNUM has the ball” CONDITION → (bowner TEAM {UNUM}) [Figure: syntactic parse tree of “TEAM UNUM has the ball”: S → NP (TEAM, UNUM), VP → VBZ (“has”), NP → DT (“the”), NN (“ball”)]
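The rule representation can be illustrated with a toy matcher. A hedged sketch only: real SILT patterns bind already-parsed non-terminals and allow gaps, whereas this binds one raw token per slot, and `apply_rule` is an illustrative name:

```python
# Toy string-based transformation rule: an NL pattern with non-terminal
# slots (written in ALL CAPS) maps to a formal-language template.
def apply_rule(tokens, pattern, template):
    """Match pattern tokens against sentence tokens; slots bind one
    token each.  On success, fill the same slots in the template."""
    if len(tokens) != len(pattern):
        return None
    bindings = {}
    for tok, pat in zip(tokens, pattern):
        if pat.isupper():          # a non-terminal slot, e.g. TEAM
            bindings[pat] = tok
        elif tok != pat:           # literal word must match exactly
            return None
    return template.format(**bindings)

result = apply_rule(
    "our 4 has the ball".split(),
    "TEAM UNUM has the ball".split(),
    "(bowner {TEAM} {{{UNUM}}})",
)
# → "(bowner our {4})"
```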
Examples of Parsing • “If our player 4 has the ball, our player 4 should shoot.” • “If TEAM UNUM has the ball, TEAM UNUM should ACTION.” (TEAM = our, UNUM = 4, ACTION = (shoot)) • “If CONDITION , TEAM UNUM should ACTION.” (CONDITION = (bowner our {4}), TEAM = our, UNUM = 4, ACTION = (shoot)) • “If CONDITION , DIRECTIVE .” (CONDITION = (bowner our {4}), DIRECTIVE = (do our {4} (shoot) )) • RULE: ( (bowner our {4}) (do our {4} (shoot) ) )
Variations of Rule Representation • SILT allows patterns to skip some words or nodes: • “if CONDITION, <1> DIRECTIVE.” — here <1> may match, e.g., “then” • To deal with non-compositionality, SILT allows constraints on rule application: • “in REGION” matches “CONDITION → (bpos REGION)” only if “in REGION” follows “the ball <1>”. • SILT allows templates with multiple productions: • “TEAM player UNUM has the ball in REGION” CONDITION → (and (bowner TEAM {UNUM}) (bpos REGION))
Learning Transformation Rules • Input: a training set T of NL sentences paired with formal representations; a set Π of productions in the formal grammar • Output: a learned rule base L • Algorithm: • Parse all formal representations in T using Π. • Collect positive examples Pπ and negative examples Nπ for each π ∈ Π. • L = ∅ • Until all positive examples are covered, or no more good rules can be found for any π ∈ Π, do: • R′ = FindBestRules(Π, P, N) • L = L ∪ R′ • Apply the rules in L to the sentences in T. • Given an NL sentence S: • Pπ: if π is used in the formal expression of S, then S is a positive example for π • Nπ: if π is not used in the formal expression of S, then S is a negative example for π
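The covering loop on this slide can be sketched in a few lines. A hedged sketch: the rule-scoring step is injected as a function argument, and all names are illustrative rather than the paper's actual code:

```python
# Sketch of SILT's covering loop: repeatedly pick the best rule and
# remove the positive examples it covers, until none remain or no
# good rule can be found.
def learn_rules(productions, positives, negatives, find_best_rule):
    """positives/negatives map each production to its example sentences."""
    learned = []
    # keep going until every positive example is covered ...
    while any(positives[p] for p in productions):
        best = find_best_rule(productions, positives, negatives)
        if best is None:                  # ... or no good rule remains
            break
        learned.append(best)
        # remove the positives the newly learned rule covers
        positives[best["production"]] -= best["covered"]
    return learned
```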
Issues of SILT Learning • Non-compositionality • Rule cooperation • Rules are learned in order. • Therefore an over-general ancestor will lead to a group of over-general child rules, and no other rule can cooperate with such rules. • Two approaches can address this: • Find the single best rule across all competing productions in each iteration. • Over-generate rules, then find a subset that cooperates.
FindBestRules() for String-based Learning • Input: the set Π of productions in the formal grammar; sets of positive examples Pπ and negative examples Nπ for each π ∈ Π • Output: the best rule BR • Algorithm: • R = ∅ • For each production π ∈ Π: • Let Rπ be the maximally-specific rules derived from Pπ. • Repeat k = 1000 times: • Choose r1, r2 ∈ Rπ at random. • g = GENERALIZE(r1, r2, π) • Add g to Rπ. • R = R ∪ Rπ • BR = argmax r ∈ R goodness(r) • Remove the positive examples covered by BR from Pπ.
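One pass of this search can be sketched as follows. A hedged sketch: `generalize` and `goodness` are assumed and injected, and the paper's actual scoring formula is not reproduced here:

```python
import random

# One FindBestRules pass: seed the candidate pool with the maximally-
# specific rules for each production, add k random pairwise
# generalizations, then keep the candidate scoring highest.
def find_best_rule(rules_by_prod, generalize, goodness, k=1000):
    candidates = []
    for prod, specific in rules_by_prod.items():
        pool = list(specific)                      # maximally-specific rules
        for _ in range(k):
            r1, r2 = random.choice(pool), random.choice(pool)
            pool.append(generalize(r1, r2, prod))  # add g to the pool
        candidates.extend(pool)
    return max(candidates, key=goodness)           # argmax goodness(r)
```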
FindBestRules() Cont. • goodness(r) • GENERALIZE(r1, r2, π) • r1, r2: two transformation rules based on the same production π • For example: • π: REGION → (penalty-area TEAM) • pattern 1: “TEAM 's penalty box” • pattern 2: “TEAM penalty area” • generalization: “TEAM <1> penalty”
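The string-pattern generalization in this example can be approximated with a longest-common-subsequence pass. A hedged sketch, not the paper's algorithm: common tokens are kept in order and each skipped stretch becomes a gap marker `<n>` (skip up to n tokens):

```python
from difflib import SequenceMatcher

# Generalize two string patterns: keep shared tokens, insert <n> gaps
# where the patterns diverge, as in "TEAM <1> penalty".
def generalize(p1, p2):
    a, b = p1.split(), p2.split()
    out, ai, bi = [], 0, 0
    for i, j, n in SequenceMatcher(a=a, b=b).get_matching_blocks():
        if n == 0:
            break                      # terminator block; drop trailing mismatch
        skipped = max(i - ai, j - bi)  # widest stretch either pattern skips
        if skipped and out:            # interior gap between common tokens
            out.append(f"<{skipped}>")
        out.extend(a[i:i + n])         # the shared tokens themselves
        ai, bi = i + n, j + n
    return " ".join(out)

generalize("TEAM 's penalty box", "TEAM penalty area")  # → "TEAM <1> penalty"
```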
Tree-based Learning • Similar FindBestRules() algorithm • GENERALIZE: find the largest common subgraph of the two rules' patterns. • For example: • π: REGION → (penalty-area TEAM) [Figure: pattern 1 is the parse tree for “TEAM 's penalty box”, pattern 2 the parse tree for “TEAM penalty area”; the generalization is their largest common subtree, an NP with children TEAM and NN “penalty”.]
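The idea can be shown with a toy tree intersection. A strong simplification of the paper's largest-common-subgraph computation (children are aligned by position here, and names are illustrative):

```python
# Trees as (label, children) tuples; common_subtree keeps the top-down
# structure shared by both trees and drops mismatched branches.
def common_subtree(t1, t2):
    if t1[0] != t2[0]:
        return None                    # root labels differ: no common tree
    kids = []
    for c1, c2 in zip(t1[1], t2[1]):   # positional child alignment (toy!)
        c = common_subtree(c1, c2)
        if c is not None:
            kids.append(c)
    return (t1[0], kids)

p1 = ("NP", [("TEAM", []), ("NN", [("penalty", [])])])
p2 = ("NP", [("TEAM", []), ("NN", [("penalty", [])]), ("NN", [("area", [])])])
common_subtree(p1, p2)  # → ("NP", [("TEAM", []), ("NN", [("penalty", [])])])
```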
Experiment • For CLANG: • 300 formal instructions selected randomly from log files of the 2003 RoboCup Coach Competition. • Each formal instruction was translated into English by humans. • The average NL sentence length is 22.52 words. • For GEOQUERY: • 250 questions were collected from undergraduate students. • All English queries were translated manually. • The average NL sentence length is 6.87 words.
Time Consumption • Training time in minutes.
Future Work • Though improved, SILT still lacks the robustness of statistical parsing. • SILT's hard-matching symbolic rules are sometimes too brittle. • A more unified implementation of tree-based SILT that allows directly comparing and evaluating the benefit of using initial syntactic parses.
Conclusion • A novel approach, SILT, learns transformation rules that map NL sentences into a formal language. • It shows better overall performance than previous approaches. • NLP still has a long way to go.
Thank you! Questions or comments?