Summarization using Event Extraction Base System

Summarization using Event Extraction Base System. 01/12, KwangHee Park. Research goal: summarize an article by categorizing its content by subject. Rather than just extracting key sentences, rearrange the sentences by the subject of each event, so that it is easy to see what happened to each subject.

Presentation Transcript


  1. Summarization using Event Extraction Base System 01/12 KwangHee Park

  2. Research goal • Summarize an article by categorizing its content by subject • Do not just extract key sentences; rearrange the sentences by the subject of each event • Make it easy to understand what happened to each subject

  3. Research goal • Extract events and rearrange them by subject • The North • Launched 170 artillery shells • Used both direct-firing guns and howitzers • … • South Korean forces • Fired back only 80 shells • … • South Korean marines • First evacuated to safe places • … • Summarization from raw text

  4. Architecture • Example raw text: "On the other hand, it’s turning out to be another very bad financial week for Asia. The financial assistance from the World Bank and the International Monetary Fund are not helping. In the last twenty four hours, the value of the Indonesian stock market has fallen by twelve percent. The Indonesian currency has lost twenty six percent of its value. In Singapore, stocks hit a five year low. In the Philippines, a four year low. And in Hong Kong, a three percent drop. More problems in Hong Kong for a place, for an economy, that many experts thought was once invincible." • Pipeline: Raw text → Event recognizer → Subject assigner → Categorizer

  7. Architecture • Output, by subject: • Indonesian stock market: fallen by twelve percent • Indonesian currency: lost twenty six percent • Singapore stocks: five year low • The Philippines stocks: four year low • Hong Kong stocks: three percent drop • Pipeline: Raw text → Event recognizer → Subject assigner → Categorizer
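
The final categorizer stage can be pictured as a simple grouping step. The following is a minimal sketch, not the presenter's implementation: it assumes the earlier stages have already produced (subject, event) pairs, and the function names `group_events_by_subject` and `render_summary` are hypothetical. The sample pairs come from the research-goal slide.

```python
def group_events_by_subject(pairs):
    """Group (subject, event) pairs by subject, keeping first-seen order."""
    grouped = {}
    for subject, event in pairs:
        grouped.setdefault(subject, []).append(event)
    return grouped


def render_summary(grouped):
    """Render the grouped events as an indented, subject-first outline."""
    lines = []
    for subject, events in grouped.items():
        lines.append(subject)
        for event in events:
            lines.append("  - " + event)
    return "\n".join(lines)


# Hand-written pairs standing in for the output of the subject assigner.
PAIRS = [
    ("The North", "launched 170 artillery shells"),
    ("South Korean forces", "fired back only 80 shells"),
    ("The North", "used both direct-firing guns and howitzers"),
    ("South Korean marines", "first evacuated to safe places"),
]

GROUPED = group_events_by_subject(PAIRS)
print(render_summary(GROUPED))
```

Exact string matching on the subject is the simplest possible grouping key; it would not merge a pronoun such as "they" with the entity it refers to.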

  8. Event Extraction • Event • An instance of a topic, identified at the document level, that describes something that happens • Event extraction • Extract events, with their arguments, from the text • Example: • The Nasdaq Financial Index lost about 1%, or 3.95, to 448.80. • <s>The <ENAMEX TYPE="ORGANIZATION">Nasdaq Financial Index</ENAMEX> <EVENT eid="e229" class="OCCURRENCE" >lost</EVENT> about <NUMEX TYPE="PERCENT">1%</NUMEX>, or 3.95, <SIGNAL sid="s364" >to</SIGNAL> 448.80.</s>
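
Because the annotated example is well-formed XML, the gold events can be read back with the Python standard library. This is a small sketch for illustration only; the helper names `extract_events` and `plain_text` are hypothetical, and real corpus files may need extra handling beyond a single `<s>` element.

```python
import xml.etree.ElementTree as ET

# The annotated sentence from the slide, copied verbatim.
ANNOTATED = (
    '<s>The <ENAMEX TYPE="ORGANIZATION">Nasdaq Financial Index</ENAMEX> '
    '<EVENT eid="e229" class="OCCURRENCE" >lost</EVENT> about '
    '<NUMEX TYPE="PERCENT">1%</NUMEX>, or 3.95, '
    '<SIGNAL sid="s364" >to</SIGNAL> 448.80.</s>'
)


def extract_events(annotated_sentence):
    """Return one dict per EVENT element: its id, class, and surface text."""
    root = ET.fromstring(annotated_sentence)
    return [
        {"eid": ev.get("eid"), "class": ev.get("class"), "text": ev.text}
        for ev in root.iter("EVENT")
    ]


def plain_text(annotated_sentence):
    """Strip all markup and return the raw sentence."""
    root = ET.fromstring(annotated_sentence)
    return "".join(root.itertext())


print(extract_events(ANNOTATED))
print(plain_text(ANNOTATED))
```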

  9. Event recognizer • Recognize whether or not a word is used as an event • The Nasdaq Financial Index lost about 1%, or 3.95, to 448.80. • The Nasdaq Financial Index <EVENT>lost</EVENT> about 1%, or 3.95, to 448.80. • In this example, only the word ‘lost’ is used as an event word.

  10. Event recognizer • Rule-based recognition • Training features • POS tag only • Any word POS-tagged as a verb, except forms of "be" and "have" • Word dependency with POS tag (standard Stanford word dependencies) • 55 binary grammatical relations • Bigram POS-tagged context
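
The POS-tag-only rule can be sketched in a few lines. This is an illustration under stated assumptions, not the presenter's code: the input is assumed to be already tagged with Penn Treebank tags (here assigned by hand rather than by the Stanford tagger), the list of "be" and "have" forms is a hand-written approximation that ignores contractions, and `recognize_events` is a hypothetical name.

```python
# Surface forms of "be" and "have" that the rule excludes (contractions omitted).
BE_HAVE_FORMS = {
    "be", "am", "is", "are", "was", "were", "been", "being",
    "have", "has", "had", "having",
}


def recognize_events(tagged_tokens):
    """Return (index, word) for every verb-tagged token that is not be/have.

    tagged_tokens is a list of (word, Penn Treebank POS tag) pairs.
    """
    events = []
    for index, (word, tag) in enumerate(tagged_tokens):
        if tag.startswith("VB") and word.lower() not in BE_HAVE_FORMS:
            events.append((index, word))
    return events


# Hand-tagged version of the Nasdaq example sentence (tags are assumptions).
NASDAQ = [
    ("The", "DT"), ("Nasdaq", "NNP"), ("Financial", "NNP"), ("Index", "NNP"),
    ("lost", "VBD"), ("about", "IN"), ("1", "CD"), ("%", "NN"), (",", ","),
    ("or", "CC"), ("3.95", "CD"), (",", ","), ("to", "TO"),
    ("448.80", "CD"), (".", "."),
]

# "has" is tagged as a verb but is filtered out by the be/have exception.
AGENCY = [
    ("the", "DT"), ("space", "NN"), ("agency", "NN"), ("has", "VBZ"),
    ("finally", "RB"), ("taken", "VBN"),
]

print(recognize_events(NASDAQ))
print(recognize_events(AGENCY))
```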

  11. Experiment • Corpus • TimeBank 1.1 annotated corpus • 176 documents • 2603 sentences • 7168 events • Tools used • Stanford parser • Stanford POS tagger • 3-fold cross-validation
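
For readers unfamiliar with 3-fold cross-validation, the sketch below shows one way to produce the three train/test splits. The slides do not say how the folds were built, so splitting at the document level with a round-robin assignment is an assumption, and `k_fold_splits` is a hypothetical helper.

```python
def k_fold_splits(items, k=3):
    """Yield (train, test) lists so that each item is tested exactly once."""
    # Round-robin assignment keeps the fold sizes within one item of each other.
    folds = [items[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [
            item
            for i, fold in enumerate(folds)
            if i != held_out
            for item in fold
        ]
        yield train, test


# Placeholder document ids; the corpus size (176) is taken from the slide.
DOCUMENTS = list(range(176))
SPLITS = list(k_fold_splits(DOCUMENTS, k=3))

for train, test in SPLITS:
    print(len(train), len(test))
```

Each system would be trained on the `train` portion and scored on the `test` portion, with the three scores averaged.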

  12. Result

  13. Subject assigner • Select the subject of a given event word or phrase • The subject is the main agent of the given event • Step 1 • Build the set of candidate subjects • Step 2 • Form the relevant subject-event pairs

  14. Subject assigner – Baseline feature • Step 1 • Extract the deepest-level NP chunks from the parse tree • Step 2 • Assign the nearest preceding NP chunk to each event word • Example: Finally today, we learned that the space agency has finally taken a giant leap forward. • Result • We → learned • The space agency → taken • (Parse-tree figure with nodes NP1, NP2, NP3 and two Event nodes omitted)
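
The baseline can be sketched as a single left-to-right pass. This is an illustration, not the presenter's code: it reads the slide's second step as "pair each event with the closest NP chunk to its left", which is an interpretation of the original wording, and it takes hand-chunked input in place of real parser output. The name `assign_subjects` and the chunk labels `NP`, `EVENT`, and `O` are hypothetical.

```python
def assign_subjects(chunks):
    """Pair each EVENT chunk with the most recent NP chunk to its left.

    chunks is a list of (label, text) pairs, with label in {"NP", "EVENT", "O"}.
    Events with no preceding NP are paired with None.
    """
    pairs = []
    last_np = None
    for label, text in chunks:
        if label == "NP":
            last_np = text
        elif label == "EVENT":
            pairs.append((last_np, text))
    return pairs


# Hand-chunked version of the example sentence (chunk boundaries are assumptions).
CHUNKS = [
    ("O", "Finally today ,"),
    ("NP", "we"),
    ("EVENT", "learned"),
    ("O", "that"),
    ("NP", "the space agency"),
    ("O", "has finally"),
    ("EVENT", "taken"),
    ("NP", "a giant leap forward"),
]

print(assign_subjects(CHUNKS))
```

A purely positional rule like this picks the wrong NP whenever an object or a modifier sits between the true subject and the verb, which is consistent with a baseline that leaves clear room for improvement.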

  15. Experiment result • Corpus • Manually annotated corpus based on the TimeBank 1.1 corpus • 100 sentences containing 158 events • Result • 82 / 158 = 52% accuracy

  16. Conclusion • So far, I have implemented the baseline system • The accuracy of each component needs to improve • Each component has a different problem to solve • Event recognizer, Subject assigner: need more suitable features • Categorizer: how to treat pronoun subjects • Pipeline: Event recognizer → Subject assigner → Categorizer

  17. Thanks
