Web Based Probabilistic Textual Entailment

Web Based Probabilistic Textual Entailment. Oren Glickman, Ido Dagan and Moshe Koppel, Bar Ilan Univ. Classical Entailment Definition. A text t entails a hypothesis h if h is true in every circumstance (possible world) in which t is true, i.e., the truth of t implies the truth of h.


Presentation Transcript


  1. Web Based Probabilistic Textual Entailment Oren Glickman, Ido Dagan and Moshe Koppel, Bar Ilan Univ.

  2. Classical Entailment Definition • A text t entails a hypothesis h if h is true in every circumstance (possible world) in which t is true • i.e., the truth of t implies the truth of h

  3. Probabilistic Entailment • Example 312: • (t) Gandhi can be defeated in the next elections in India if between now and 2009, BJP can make Rural India Shine. • (h) Next elections in India will take place in 2009. • t does not entail h (in the classical sense) • Then why is it annotated as True?!

  4. Rationale • Example 312: • (t) Gandhi can be defeated in the next elections in India if between now and 2009, BJP can make Rural India Shine. • (h) Next elections in India will take place in 2009. • t does add substantial information about the correctness of h • Given that t was stated, we’d expect that h is most likely true

  5. A Probabilistic Space • T: the set of all texts • H: the set of all hypotheses • propositional statements which can be assigned a truth value • w: a possible world • a truth assignment (to {0=False, 1=True}) for all hypotheses • W: the set of all possible worlds (2^H)

  6. A Generative Model • We assume a probabilistic generative model: • At each generation event a text is produced along with a (hidden) possible world • based on a probability distribution over T × W.

  7. Probabilities • For a given text t and hypothesis h, we consider the following probabilities: • P(Tr_h = 1) = P(h is assigned a truth value of 1) • P(Tr_h = 1 | t) = P(h is assigned a truth value of 1 given that the generated text is t)

  8. Textual entailment relationship • Definition: • t probabilistically entails h if: • P(Tr_h = 1 | t) > P(Tr_h = 1) (≡ positive PMI) • i.e., t increases the likelihood of h being true
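
The definition on slides 5 to 8 can be checked by hand on a tiny example. The sketch below is only an illustration: the two texts, two hypotheses, and the joint distribution over T × W are made-up numbers, and the function names are hypothetical rather than taken from the presentation.

```python
from itertools import product

# Hypothetical toy setup: two texts and two hypotheses (not from the slides).
texts = ["t1", "t2"]
hypotheses = ["h1", "h2"]

# A possible world assigns 0/1 to every hypothesis, so there are 2^|H| worlds.
worlds = list(product([0, 1], repeat=len(hypotheses)))

# Made-up joint distribution over (text, world) pairs; the values sum to 1.
joint = {
    ("t1", (0, 0)): 0.0625, ("t1", (0, 1)): 0.0625,
    ("t1", (1, 0)): 0.1875, ("t1", (1, 1)): 0.1875,
    ("t2", (0, 0)): 0.1875, ("t2", (0, 1)): 0.1875,
    ("t2", (1, 0)): 0.0625, ("t2", (1, 1)): 0.0625,
}


def prior(h):
    """P(Tr_h = 1): total mass of worlds in which h is true."""
    i = hypotheses.index(h)
    return sum(p for (_, w), p in joint.items() if w[i] == 1)


def posterior(h, t):
    """P(Tr_h = 1 | t): mass of worlds where h is true, given text t."""
    i = hypotheses.index(h)
    p_t = sum(p for (text, _), p in joint.items() if text == t)
    p_both = sum(p for (text, w), p in joint.items() if text == t and w[i] == 1)
    return p_both / p_t


def probabilistically_entails(t, h):
    """t entails h iff seeing t raises the probability that h is true."""
    return posterior(h, t) > prior(h)


print(prior("h1"), posterior("h1", "t1"), probabilistically_entails("t1", "h1"))
```

In this toy distribution t1 raises the probability of h1 above its prior, t2 lowers it, and h2 is independent of the text, so only the pair (t1, h1) satisfies the definition.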

  9. Lexical Entailment • Are the individual terms in h entailed by t? • not necessarily holding the right relations • Example #2070: • (t) The Queen of Holland is now owned by Robert Mouawad. • (h) Robert Mouawad is the Queen of Holland.

  10. A Probabilistic Lexical Model • Goal: capture lexical co-occurrence statistics • Assumption 1: Independent lexical truth assignments • Assumption 2: Alignment • I_v -- the event that a generated text contains the term v

  11. Estimating Lexical Entailment Probabilities from the Web • web documents -- a sample generated by the source • Problem: • Truth assignments are not observed • Assumption 3: • A term is true iff it appears in the document • P(Tr_u = 1 | I_v) = P(I_u | I_v) • estimated from search-engine co-occurrence counts
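
Under Assumption 3 the lexical probability reduces to a ratio of document counts. The sketch below shows that ratio on a small in-memory corpus; the four documents and the helper names `doc_count` and `cooccurrence_prob` are hypothetical stand-ins for search-engine hit counts, not part of the original system.

```python
# Hypothetical mini-corpus: each document is reduced to its set of terms.
docs = [
    {"japan", "japanese", "tokyo"},
    {"japan", "election"},
    {"japanese", "voter", "vote"},
    {"voter", "vote", "turnout"},
]


def doc_count(*terms):
    """Number of documents that contain every one of the given terms."""
    return sum(1 for d in docs if all(term in d for term in terms))


def cooccurrence_prob(u, v):
    """Estimate P(Tr_u = 1 | I_v) as P(I_u | I_v) = n(u, v) / n(v)."""
    n_v = doc_count(v)
    if n_v == 0:
        return 0.0  # v never observed: no evidence, fall back to 0
    return doc_count(u, v) / n_v


print(cooccurrence_prob("vote", "voter"))      # share of "voter" docs that also contain "vote"
print(cooccurrence_prob("japanese", "japan"))  # share of "japan" docs that also contain "japanese"
```

Returning 0 for an unseen term is a choice made for this sketch; a smoothed estimate would be another reasonable option.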

  12. Challenge Submission • Tokenize text and remove stop words • Collect counts from AltaVista • Classification: • p = P(Tr_h = 1 | t) • t ⇒ h if p > λ; conf = p • conf = 1 - p for negative examples • λ tuned on dev set
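
The submission steps can be sketched end to end as follows. Several parts are assumptions made for illustration: the slides do not preserve the scoring formula, so the sketch combines Assumptions 1 and 2 as a product over hypothesis terms of the best-aligned text term; the stop-word list, the `TOY_PROBS` table, and the threshold value stand in for the real stop list, the AltaVista counts, and the tuned λ.

```python
import re

# Hypothetical stop-word list; the list used in the submission is not given.
STOP_WORDS = {"the", "a", "an", "of", "is", "in", "by", "to", "and"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]


def entailment_prob(t, h, lex_prob):
    """Assumed scoring: product over terms u in h of max over v in t of lex_prob(u, v)."""
    t_terms = tokenize(t)
    p = 1.0
    for u in tokenize(h):
        p *= max((lex_prob(u, v) for v in t_terms), default=0.0)
    return p


def classify(t, h, lex_prob, lam):
    """Return (entails, conf): conf = p for positives, 1 - p for negatives."""
    p = entailment_prob(t, h, lex_prob)
    entails = p > lam
    return entails, (p if entails else 1.0 - p)


# Made-up lexical probabilities standing in for web co-occurrence estimates.
TOY_PROBS = {("vote", "voter"): 0.8, ("japan", "japanese"): 0.6}


def toy_lex_prob(u, v):
    return 1.0 if u == v else TOY_PROBS.get((u, v), 0.01)


print(classify("Japanese voter turnout", "Japan vote", toy_lex_prob, lam=0.3))
print(classify("The Queen of Holland is now owned by Robert Mouawad.",
               "Robert Mouawad is the Queen of Holland.", toy_lex_prob, lam=0.3))
```

The second call reproduces the weakness noted for Example #2070: every hypothesis term also occurs in the text, so a purely lexical score is maximal even though the relation between the terms is wrong.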

  13. Results

  14. Resulting Alignments • Some good: Japan ↔ Japanese, voter ↔ vote • Some dubious: turnout ↔ half, percent ↔ less

  15. Precision-Recall • High confidence  low precision!!

  16. Did the probs help? • Baseline: P(w1|w2) = 1 if w1 = w2; 0 otherwise
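
The baseline replaces the web-estimated probabilities with an exact-match indicator. How it was plugged into the rest of the system is not shown on the slide; the sketch below assumes the same per-term alignment as the lexical model, in which case the baseline predicts entailment exactly when every hypothesis term also occurs in the text. The function names are hypothetical.

```python
def baseline_prob(w1, w2):
    """Baseline lexical probability: 1 for identical terms, 0 otherwise."""
    return 1.0 if w1 == w2 else 0.0


def baseline_entails(t_terms, h_terms):
    """Assumed decision rule: each hypothesis term must match some text term exactly."""
    return all(
        any(baseline_prob(u, v) == 1.0 for v in t_terms)
        for u in h_terms
    )


print(baseline_entails(["japanese", "voter", "turnout"], ["voter", "turnout"]))
print(baseline_entails(["japanese", "voter", "turnout"], ["japan", "vote"]))
```

The second call shows the case the probabilistic model is meant to help with: related but non-identical terms such as "japan" and "japanese" contribute nothing under exact matching.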

  17. Conclusions • Defined probabilistic setting – as needed for modeling probabilistic entailment • Proposing: t probabilistically entails h if it increases the likelihood that h is true • A concrete probabilistic model • incorporating word co-occurrence statistics • based on the proposed setting • The simple model performs as well as more complex systems!
