Textual Entailment as a Framework for Applied Semantics

Ido Dagan, Bar-Ilan University, Israel • Joint work with: Oren Glickman, Idan Szpektor, Roy Bar Haim, Maayan Geffet, Moshe Koppel, Efrat Marmorshtein (Bar-Ilan University); Shachar Mirkin (Hebrew University, Israel); Hristo Tanev, Bernardo Magnini, Alberto Lavelli, Lorenza Romano (ITC-irst, Italy); Bonaventura Coppola, Milen Kouylekov (University of Trento and ITC-irst, Italy); Danilo Giampiccolo (CELCT, Italy); Dan Roth (UIUC)

Applied Semantics for Text Understanding/Reading • Understanding text meaning refers to the semantic level of language • An applied computational framework for semantics is needed • Such a common framework is still missing

Desiderata for Modeling Framework • A framework for a target level of language processing should provide: • Generic module for applications • Unified paradigm for investigating language phenomena • Unified knowledge representation • Most semantics research is scattered • WSD, NER, SRL, lexical semantics relations… (e.g. vs. syntax) • Dominating approach - interpretation

Outline • The textual entailment task – what and why? • Evaluation – PASCAL RTE Challenges • Modeling approach: • Knowledge acquisition • Inference (briefly) • Application example • An alternative framework for investigating semantics

Natural Language and Meaning • [Figure: the mapping between Meaning and Language, labeled Variability and Ambiguity]

Variability of Semantic Expression • Example expressions: The Dow Jones Industrial Average closed up 255; Dow ends up; Dow gains 255 points; Dow climbs 255; Stock market hits a record high • Model variability as relations between text expressions: • Equivalence: expr1 ⇔ expr2 (paraphrasing) • Entailment: expr1 ⇒ expr2 – the general case • Incorporates inference as well

Typical Application Inference • Question: Who bought Overture? >> Expected answer form: X bought Overture • Text: Overture’s acquisition by Yahoo • entails • Hypothesized answer text: Yahoo bought Overture • Similar for IE: X buy Y • Similar for “semantic” IR: t: Overture was bought … • Summarization (multi-document) – identify redundant info • MT evaluation (and recent ideas for MT) • Educational applications

KRAQ'05 Workshop - KNOWLEDGE and REASONING for ANSWERING QUESTIONS (IJCAI-05) CFP: • Reasoning aspects: * information fusion, * search criteria expansion models * summarization and intensional answers, * reasoning under uncertainty or with incomplete knowledge, • Knowledge representation and integration: * levels of knowledge involved (e.g. ontologies, domain knowledge), * knowledge extraction models and techniques to optimize response accuracy… but similar needs for other applications – can entailment provide a common empirical task?

Classical Entailment Definition • Chierchia & McConnell-Ginet (2001): A text t entails a hypothesis h if h is true in every circumstance (possible world) in which t is true • Strict entailment - doesn't account for some uncertainty allowed in applications
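The classical definition can be read as a check over an enumerated set of circumstances. The following is a minimal sketch, assuming a hypothetical toy model in which each possible world is a set of facts and t and h are predicates over worlds; the names and facts are illustrative and do not come from the talk.

```python
from typing import Callable, FrozenSet, Iterable

# Assumption: a "world" is just the set of facts that hold in it.
World = FrozenSet[str]
Proposition = Callable[[World], bool]


def strictly_entails(t: Proposition, h: Proposition, worlds: Iterable[World]) -> bool:
    """True if h holds in every listed world in which t holds."""
    return all(h(w) for w in worlds if t(w))


# Hypothetical toy worlds.
worlds = [
    frozenset({"yahoo_bought_overture", "overture_was_acquired"}),
    frozenset({"overture_was_acquired"}),  # acquired, but by someone else
    frozenset(),                           # no acquisition at all
]

t = lambda w: "yahoo_bought_overture" in w
h = lambda w: "overture_was_acquired" in w

print(strictly_entails(t, h, worlds))  # t => h
print(strictly_entails(h, t, worlds))  # h => t (the reverse direction)
```

The second call illustrates that the relation is directional: a single world where h holds without t is enough to break strict entailment.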

“Almost certain” Entailments • t: The technological triumph known as GPS … was incubated in the mind of Ivan Getting. • h: Ivan Getting invented the GPS.

Applied Textual Entailment • Directional relation between two text fragments: Text (t) and Hypothesis (h): • Operational (applied) definition: • Human gold standard - as in NLP applications • Assuming common background knowledge – which is indeed expected from applications!

Probabilistic Interpretation • Definition: t probabilistically entails h if: • P(h is true | t) > P(h is true) • t increases the likelihood of h being true • ≡ Positive PMI – t provides information on h’s truth • P(h is true | t): entailment confidence • The relevant entailment score for applications • In practice: “most likely” entailment expected
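The inequality and its PMI reading can be illustrated numerically. This is a small sketch under the assumption that estimates of P(h is true) and P(h is true | t) are already available; the probability values below are made up for illustration.

```python
import math


def probabilistically_entails(p_h_given_t: float, p_h: float) -> bool:
    """t probabilistically entails h if P(h is true | t) > P(h is true)."""
    return p_h_given_t > p_h


def pmi(p_h_given_t: float, p_h: float) -> float:
    """Pointwise mutual information, log2(P(h | t) / P(h)); positive iff t raises h's likelihood."""
    return math.log2(p_h_given_t / p_h)


# Hypothetical estimates for a single (t, h) pair.
prior = 0.01       # P(h is true)
confidence = 0.90  # P(h is true | t), the entailment confidence

print(probabilistically_entails(confidence, prior))
print(pmi(confidence, prior))
```

An application would typically rank or threshold on the entailment confidence itself, since a positive PMI alone only says that t is informative about h.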

The Role of Knowledge • For textual entailment to hold we require: • text AND knowledge ⇒ h, but • knowledge should not entail h alone • Systems are not supposed to validate h’s truth without utilizing t

PASCAL Recognizing Textual Entailment (RTE) Challenges • EU FP-6 Funded PASCAL NOE 2004-7 • Bar-Ilan University • ITC-irst and CELCT, Trento • MITRE • Microsoft Research

Generic Dataset by Application Use • 7 application settings in RTE-1, 4 in RTE-2/3 • QA • IE • “Semantic” IR • Comparable documents / multi-doc summarization • MT evaluation • Reading comprehension • Paraphrase acquisition • Most data created from actual application output • RTE-2: 800 examples in development and test sets • 50-50% YES/NO split

Some Examples

Participation and Impact • Very successful challenges, worldwide: • RTE-1 – 17 groups • RTE-2 – 23 groups • 30 groups in total • ~150 downloads! • RTE-3 underway – 25 groups • Joint workshop at ACL-07 • High interest in the research community • Papers, conference sessions and areas, PhDs, influence on funded projects • Textual Entailment special issue at JNLE • ACL-07 tutorial

Methods and Approaches (RTE-2) • Measure similarity match between t and h (coverage of h by t): • Lexical overlap (unigram, N-gram, subsequence) • Lexical substitution (WordNet, statistical) • Syntactic matching/transformations • Lexical-syntactic variations (“paraphrases”) • Semantic role labeling and matching • Global similarity parameters (e.g. negation, modality) • Cross-pair similarity • Detect mismatch (for non-entailment) • Logical interpretation and inference (vs. matching)

Dominant approach: Supervised Learning • Features model similarity and mismatch • Classifier determines relative weights of information sources • Train on development set and auxiliary t-h corpora • Pipeline: (t, h) → similarity features (lexical, n-gram, syntactic, semantic, global) → feature vector → classifier → YES / NO
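The feature-vector pipeline can be sketched in a few lines. This is a minimal illustration, not any participant's system: it assumes three hypothetical features (unigram coverage of h by t, bigram coverage, and a negation-mismatch flag) and uses hand-set weights where a real system would learn them from the development set.

```python
import re

NEGATIONS = {"not", "no", "never"}  # assumption: a tiny illustrative negation list


def tokenize(text):
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def coverage(t_tokens, h_tokens, n):
    """Fraction of h's n-grams that also occur in t (coverage of h by t)."""
    h_grams = ngrams(h_tokens, n)
    if not h_grams:
        return 0.0
    return len(h_grams & ngrams(t_tokens, n)) / len(h_grams)


def features(t, h):
    t_tokens, h_tokens = tokenize(t), tokenize(h)
    # Global mismatch feature: exactly one side contains a negation word.
    t_neg = bool(NEGATIONS & set(t_tokens))
    h_neg = bool(NEGATIONS & set(h_tokens))
    return [coverage(t_tokens, h_tokens, 1),
            coverage(t_tokens, h_tokens, 2),
            float(t_neg != h_neg)]


def classify(t, h, weights=(1.0, 0.5, -1.0), bias=-0.7):
    """Linear classifier; the weights and bias are hand-set placeholders, not learned."""
    score = bias + sum(w * f for w, f in zip(weights, features(t, h)))
    return "YES" if score > 0 else "NO"


print(classify("Yahoo bought Overture", "Yahoo bought Overture"))
print(classify("Yahoo did not buy Overture", "Yahoo bought Overture"))
```

Replacing `classify` with a trained classifier over the same feature vectors gives the setup described on the slide; the lexical features here correspond to the shallow end of the method list.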

Results • Average: 60% • Median: 59%

Analysis • For the first time: deeper methods (semantic/syntactic/logical) clearly outperform shallow methods (lexical/n-gram) • Cf. Kevin Knight’s invited talk at EACL-06, titled: “Isn’t Linguistic Structure Important, Asked the Engineer” • Still, most systems based on deep analysis did not score significantly better than the lexical baseline

Why? • System reports point at: • Lack of knowledge (syntactic transformation rules, paraphrases, lexical relations, etc.) • Lack of training data • It seems that systems that coped better with these issues performed best: • Hickl et al. - acquisition of large entailment corpora for training • Tatu et al. – large knowledge bases (linguistic and world knowledge)

Some suggested research directions • Knowledge acquisition • Unsupervised acquisition of linguistic and world knowledge from general corpora and web • Acquiring larger entailment corpora • Manual resources and knowledge engineering • Inference • Principled framework for inference and fusing information levels • Are we happy with bags of features?

Complementary Evaluation Modes • Entailment subtasks evaluations • Lexical, lexical-syntactic, logical, alignment… • “Seek” mode: • Input: h and corpus • Output: All entailing t’s in corpus • Captures information seeking needs, but requires post-run annotation (TREC style) • Contribution to specific applications! • QA – Harabagiu & Hickl, ACL-06; RE – Romano et al., EACL-06

Our Own Research Directions • Acquisition • Inference • Applications

Learning Entailment Rules • Q: What reduces the risk of Heart Attacks? • Hypothesis: Aspirin reduces the risk of Heart Attacks • Text: Aspirin prevents Heart Attacks • Entailment Rule (template ⇨ template): X prevent Y ⇨ X reduce risk of Y • Need a large knowledge base of entailment rules
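Applying such a rule amounts to matching the left-hand template, binding X and Y, and instantiating the right-hand template. The sketch below is a simplification under stated assumptions: templates are inflected surface strings matched with regular expressions, whereas the talk works with lemmatized templates over syntactic structures; all function names are hypothetical.

```python
import re


def template_to_regex(template):
    """Compile a surface template such as 'X prevents Y' into a regex with X and Y slots."""
    parts = []
    for token in template.split():
        if token in ("X", "Y"):
            parts.append(rf"(?P<{token}>.+?)")  # variable slot
        else:
            parts.append(re.escape(token))      # literal word
    return re.compile(r"\s+".join(parts), re.IGNORECASE)


def apply_rule(text, lhs, rhs):
    """If text matches lhs, return rhs with X and Y filled in; otherwise None."""
    match = template_to_regex(lhs).fullmatch(text.strip())
    if match is None:
        return None
    return " ".join(match.group(tok) if tok in ("X", "Y") else tok
                    for tok in rhs.split())


def rule_entails(text, hypothesis, rules):
    """True if the hypothesis equals the text or is derived from it by one rule."""
    target = hypothesis.strip().lower()
    if text.strip().lower() == target:
        return True
    for lhs, rhs in rules:
        derived = apply_rule(text, lhs, rhs)
        if derived is not None and derived.lower() == target:
            return True
    return False


# Surface-form version of the slide's rule: X prevent Y => X reduce risk of Y.
RULES = [("X prevents Y", "X reduces the risk of Y")]

print(apply_rule("Aspirin prevents Heart Attacks", *RULES[0]))
print(rule_entails("Aspirin prevents Heart Attacks",
                   "Aspirin reduces the risk of Heart Attacks", RULES))
```

Because rules are directional, the same rule set does not license inferring "Aspirin prevents Heart Attacks" from the hypothesis.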

TEASE – Algorithm Flow • Input (from lexicon): template X subj-accuse-obj Y • Anchor Set Extraction (ASE), over the Web: • Sample corpus for input template: Paula Jones accused Clinton… Sanhedrin accused St. Paul… • Anchor sets: {Paula Jones (subj); Clinton (obj)}, {Sanhedrin (subj); St. Paul (obj)}, … • Template Extraction (TE): • Sample corpus for anchor sets: Paula Jones called Clinton indictable… St. Paul defended before the Sanhedrin… • Templates: X call Y indictable; Y defend before X; … • Iterate
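The two phases of the flow can be sketched over a toy in-memory corpus. This is only an illustration of the ASE/TE idea, under strong assumptions: the corpus is a hypothetical list of already-matched template instances rather than Web text, and ranking, filtering, and iteration are omitted.

```python
from collections import Counter

# Hypothetical pre-parsed instances: (template, X filler, Y filler).
INSTANCES = [
    ("X accuse Y", "Paula Jones", "Clinton"),
    ("X accuse Y", "Sanhedrin", "St. Paul"),
    ("X call Y indictable", "Paula Jones", "Clinton"),
    ("Y defend before X", "Sanhedrin", "St. Paul"),
    ("X visit Y", "Maria", "Rome"),  # unrelated instance, shares no anchor set
]


def extract_anchor_sets(instances, input_template):
    """ASE: collect the (X, Y) filler pairs seen with the input template."""
    return {(x, y) for template, x, y in instances if template == input_template}


def extract_templates(instances, anchor_sets, input_template):
    """TE: count other templates that occur with any of the anchor sets."""
    counts = Counter()
    for template, x, y in instances:
        if template != input_template and (x, y) in anchor_sets:
            counts[template] += 1
    return counts


anchors = extract_anchor_sets(INSTANCES, "X accuse Y")
templates = extract_templates(INSTANCES, anchors, "X accuse Y")
print(sorted(anchors))
print(templates.most_common())
```

In an iterative setting, the top-ranked output templates would be fed back in as new input templates.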

Sample of Extracted Anchor-Sets for X prevent Y

Sample of Extracted Templates for X prevent Y

Experiment and Evaluation • 48 randomly chosen input verbs • 1392 templates extracted; human judgments • Encouraging results • Future work: precision, estimate probabilities

Acquiring Lexical Entailment Relations • COLING-04, ACL-05: Lexical entailment via distributional similarity • Individual features characterize semantic properties • Obtain characteristic features via bootstrapping • Test characteristic feature inclusion (vs. overlap) • COLING-ACL-06: Integrate pattern-based extraction • NP such as NP1, NP2, … • Complementary information to distributional evidence • Integration using ML with minimal supervision (10 words)
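The difference between feature inclusion and feature overlap can be shown on toy data. The sketch below assumes that characteristic features are already available as sets, and that a word u is a candidate for entailing v when u's characteristic features are included in v's; the feature sets are invented for illustration.

```python
def inclusion(features_u, features_v):
    """Directional score: share of u's characteristic features that also occur with v."""
    if not features_u:
        return 0.0
    return len(features_u & features_v) / len(features_u)


def overlap(features_u, features_v):
    """Symmetric score (Jaccard): shared features over all features of either word."""
    union = features_u | features_v
    return len(features_u & features_v) / len(union) if union else 0.0


# Hypothetical characteristic features (syntactic contexts) for two nouns.
bank = {"obj_of:acquire", "mod:commercial", "subj_of:announce"}
company = {"obj_of:acquire", "mod:commercial", "subj_of:announce",
           "subj_of:manufacture", "mod:software", "subj_of:hire"}

print(inclusion(bank, company))  # evidence for bank => company
print(inclusion(company, bank))  # weaker evidence for company => bank
print(overlap(bank, company))    # same value in both directions
```

The overlap score cannot distinguish the two directions, while inclusion is asymmetric, which is what a directional relation such as entailment requires.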

Acquisition Example • Top-ranked entailments for “company”: • firm, bank, group, subsidiary, unit, business, • supplier, carrier, agency, airline, division, giant, • entity, financial institution, manufacturer, corporation, • commercial bank, joint venture, maker, producer, factory … • Does not overlap traditional ontological relations

Initial Probabilistic Lexical Co-occurrence Models • Alignment-based (RTE-1 & ACL-05 Workshop) • The probability that a term in h is entailed by a particular term in t • Bayesian classification (AAAI-05) • The probability that a term in h is entailed by (fits in) the entire text of t • An unsupervised text categorization setting – each term is a category • Demonstrate directions for probabilistic modeling and unsupervised estimation
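One way to picture the alignment-based model is to align each term of h to its best-supporting term in t and multiply the resulting probabilities. The sketch below assumes that particular combination (product over h terms of the maximum lexical entailment probability) and a hypothetical probability table; the slide itself does not spell out the formula or the estimates.

```python
# Hypothetical lexical entailment probabilities P(u is entailed | v),
# keyed as (term in h, term in t). Real estimates would come from corpus co-occurrence.
LEX_PROB = {
    ("bought", "acquisition"): 0.8,
    ("invented", "incubated"): 0.4,
}
DEFAULT_PROB = 0.01  # assumption: small back-off value for unseen pairs


def lexical_prob(h_term, t_term):
    if h_term == t_term:
        return 1.0  # identical terms trivially support each other
    return LEX_PROB.get((h_term, t_term), DEFAULT_PROB)


def entailment_prob(t_terms, h_terms):
    """Align each h term to its best t term and multiply the probabilities."""
    prob = 1.0
    for h_term in h_terms:
        prob *= max(lexical_prob(h_term, t_term) for t_term in t_terms)
    return prob


t_terms = ["yahoo", "acquisition", "overture"]
h_terms = ["yahoo", "bought", "overture"]
print(entailment_prob(t_terms, h_terms))
```

A hypothesis term with no supporting term in t falls back to the default probability and pulls the overall score down sharply.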

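Manual Syntactic Transformations • Example: ‘X prevent Y’ • Sunscreen, which prevents moles and sunburns, … • [Figure: dependency tree of the sentence (sunscreen, which, prevents, moles, and, sunburns; relations subj, obj, mod, conj, rel), with transformation rules for relative clauses and conjunctions that let the template X (subj) prevent (obj) Y match]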

Syntactic Variability Phenomena • Template: X activate Y

Takeout • Promising potential for creating huge entailment knowledge bases • Mostly by unsupervised approaches • Manually encoded • Derived from lexical resources • Potential for uniform representations, such as entailment rules, for different types of semantic and world knowledge

Inference • Goal: infer hypothesis from text • Match and apply available entailment knowledge • Heuristically bridge inference gaps • Our approach: mapping language constructs • Vs. semantic interpretation • Lexical-syntactic structures as meaning representation • Amenable to unsupervised learning • Entailment rule transformations over syntactic trees

Application: Unsupervised Relation Extraction • EACL 2006

Relation Extraction • Subfield of Information Extraction • Identify different ways of expressing a target relation • Examples: Management Succession, Birth - Death, Mergers and Acquisitions, Protein Interaction • Traditionally performed in a supervised manner • Requires dozens to hundreds of examples per relation • Examples should cover broad semantic variability • Costly - Feasible??? • Little work on unsupervised approaches

Our Goals • Entailment Approach for Relation Extraction • Unsupervised Relation Extraction System • Evaluation Framework for Entailment Rule Acquisition and Matching

Proposed Approach • Input Template: X prevent Y • Entailment Rule Acquisition (TEASE) • Templates: X prevention for Y, X treat Y, X reduce Y • Syntactic Matcher (with Transformation Rules) • Relation Instances: <sunscreen, sunburns>

Dataset • Bunescu 2005 • Recognizing interactions between annotated protein pairs • 200 Medline abstracts • Gold standard dataset of protein pairs • Input template: X interact with Y

Manual Analysis - Results • 93% of interacting protein pairs can be identified with lexical-syntactic templates • Number of templates vs. recall (within 93%) • Frequency of syntactic phenomena

TEASE Output for X interact with Y A sample of correct templates learned:

TEASE algorithm - Potential Recall on Training Set • Iterative - taking the top 5 ranked templates as input • Morph - recognizing morphological derivations (cf. semantic role labeling vs. matching)

Results for Full System • Error sources: • Dependency parser and syntactic matching errors • No morphological derivation recognition • TEASE limited precision (incorrect templates)

Vs Supervised Approaches • 180 training abstracts
