
HowNet and Computation of Meaning

Zhendong Dong, dzd@keenage.com, WWW.keenage.com. GWC-06, Jeju, Korea, 2006-01-22.


Presentation Transcript


  1. HowNet and Computation of Meaning • Zhendong Dong • dzd@keenage.com • WWW.keenage.com • GWC-06, Jeju, Korea, 2006-01-22

  2. Outline • Bird’s-eye view of HowNet • Prominent features

  3. Bird’s-eye view of HowNet • What is HowNet? • History of HowNet • Statistics on latest version • Composition of HowNet

  4. What is HowNet? • HowNet is an on-line extralinguistic knowledge system for the computation of meaning in HLT. • HowNet unveils inter-concept relations and inter-attribute relations of the concepts as connoted in its Chinese-English lexicon.

  5. History of HowNet • 1988: Basic research started • 1999: First version released • 2000: Revision of KDML started • 2002: New version released

  6. Statistics - general • Chinese words & expressions: 84102 • English words & expressions: 80250 • Chinese meanings: 98530 • English meanings: 100071 • Definitions: 25295 • Records: 161743

  7. A record in HowNet dictionary
     NO.=076856
     W_C=买主
     G_C=N [mai3 zhu3]
     E_C=
     W_E=buyer
     G_E=N
     E_E=
     DEF={human|人:domain={commerce|商业},{buy|买:agent={~}}}
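
As a rough illustration of how a program might read such a record, the sketch below splits it into fields. It assumes one `KEY=value` field per line, which is an assumption about the file layout that the slide does not state; `parse_record` is a hypothetical helper, not part of any HowNet tool.

```python
# The record from the slide, one field per line (assumed layout).
RECORD = """\
NO.=076856
W_C=买主
G_C=N [mai3 zhu3]
E_C=
W_E=buyer
G_E=N
E_E=
DEF={human|人:domain={commerce|商业},{buy|买:agent={~}}}
"""


def parse_record(text):
    """Split a 'KEY=value' per-line record into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first '=' only, because DEF values contain '=' too.
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip()
    return fields


record = parse_record(RECORD)
print(record["W_C"], record["W_E"], record["DEF"])
```

Splitting on the first `=` only is the one design choice that matters here: the DEF value itself is full of `role=...` pairs and must stay intact.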

  8. Statistics - semantic
                        Chinese   English
     Thing                58153     58096
     Component             7025      7023
     Time                  2238      2244
     Space                 1071      1071
     Attribute             3776      4045
     Attribute-value       9089      8478
     Event                12634     10076

  9. Statistics – main syntactic categories
              Chinese   English
     ADJ        11705      9576
     ADV         1516      2084
     VERB       25929     21017
     NOUN       46867     48342
     PRON         112        71
     NUM          225       242
     PREP         128       113
     AUX           77        49
     CLA          424         0

  10. Statistics – part of relations
                              Chinese   English
      Synset: sets              13463     18575
      Synset: word forms        54312     58488
      Antonym: sets             12777     12032
      Converse: sets             6753      6442

  11. Composition • Database • Tools for computation of meaning

  12. Database • Dictionary • Taxonomies • Axiomatic relations & role shifting

  13. Dictionary

  14. Taxonomies - 10 • Entity • Event • Attribute • AttributeValue • Secondary features • Event roles • Typical actors of event roles • Event relations and role shifting • Antonymous sememe pairs • Converse sememe pairs

  15. Tools for computation of meaning • Browser • Secondary resources

  16. Prominent features • All syntactic classes of words included • Sememes and semantic roles • Defining concepts in KDML on the basis of sememes and semantic roles • Relations – the soul of HowNet • Relations obtained by computation rather than by manual coding • Identical representation in various linguistic structures

  17. Sememes
      Sememes 2099
        Entity 151: thing (physical, mental, fact); component (part, fitting); time; space (direction, location)
        Event (relation, state; action) 812
        Attribute 247
        AttributeValue 889
      Secondary feature 121

  18. Semantic roles 91
      (1) Main semantic roles
          (a) principal semantic roles: 6
          (b) affected semantic roles: 11
      (2) Peripheral semantic roles
          (a) time: 12
          (b) space: 11
          (c) resultant: 8
          (d) manner: 11
          (e) modifier: 16
          (f) basis: 6
          (g) comparison: 2
          (h) coordination: 6
          (i) commentary: 2

  19. Defining concepts (1)
      W_E=doctor G_E=V DEF={doctor|医治}
      W_E=doctor G_E=N DEF={human|人:HostOf={Occupation|职位},domain={medical|医},{doctor|医治:agent={~}}}
      W_E=doctor G_E=N E_E= DEF={human|人:{own|有:possession={Status|身分:domain={education|教育},modifier={HighRank|高等:degree={most|最}}},possessor={~}}}
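
To make the nesting in these definitions easier to see, here is a minimal sketch of a recursive parser for DEF strings. The grammar is inferred only from the examples on these slides (a head sememe, then optional `role={...}` or bare `{...}` items separated by commas); it is not the official KDML specification, and `parse_def` and `sememes` are hypothetical names.

```python
def parse_def(text):
    """Parse a DEF string into nested dicts: {'head': str, 'roles': [(role, node)]}."""
    s = "".join(text.split())  # whitespace carries no meaning in these examples
    node, pos = _parse_node(s, 0)
    if pos != len(s):
        raise ValueError(f"unexpected trailing text at position {pos}")
    return node


def _parse_node(s, i):
    if s[i] != "{":
        raise ValueError(f"expected '{{' at position {i}")
    i += 1
    start = i
    while s[i] not in ":}":  # head sememe runs up to ':' or '}'
        i += 1
    node = {"head": s[start:i], "roles": []}
    if s[i] == ":":
        i += 1
        while True:
            start = i
            while s[i] not in "={":  # optional role name before '='
                i += 1
            role = None
            if s[i] == "=":
                role = s[start:i]
                i += 1
            child, i = _parse_node(s, i)
            node["roles"].append((role, child))
            if s[i] != ",":
                break
            i += 1
    if s[i] != "}":
        raise ValueError(f"expected '}}' at position {i}")
    return node, i + 1


def sememes(node):
    """Yield every head in the tree, depth first."""
    yield node["head"]
    for _, child in node["roles"]:
        yield from sememes(child)


tree = parse_def(
    "{human|人:HostOf={Occupation|职位},domain={medical|医},{doctor|医治:agent={~}}}"
)
print(tree["head"], [role for role, _ in tree["roles"]])
print(list(sememes(tree)))
```

A bare `{...}` item with no role name is stored with role `None`; in the noun sense of "doctor" that is the embedded event `{doctor|医治:agent={~}}`, whose agent placeholder `{~}` appears to refer back to the concept being defined.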

  20. Defining concepts (2)
      W_E=buy G_E=V DEF={buy|买}
        cf. (WordNet) obtain by purchase; acquire by means of financial transaction
      W_E=buy G_E=V DEF={GiveAsGift|赠:manner={guilty|有罪},purpose={entice|勾引}}
        cf. (WordNet) make illegal payments to in exchange for favors or influence

  21. Relations – the soul of HowNet • Meaning is represented by relations • Computation of meaning is based on relations

  22. 1. Event Frame ~ Verb frame
      - {event|事件}
        ├ {static|静态} {event|事件}
        │ ├ {relation|关系} {static|静态}
        │ │ ├ {possession|领属关系} {relation|关系}
        │ │ │ ├ {own|有} {possession|领属关系:possessor={*},possession={*}}
        │ │ │ │ ├ {obtain|得到} {own|有:possessor={*},possession={*},source={*}}
        └ {act|行动} {event|事件:agent={*}}
          ├ {ActGeneral|泛动} {act|行动:agent={*}}
          └ {ActSpecific|实动} {act|行动:agent={*}}
            └ {AlterSpecific|实变} {ActSpecific|实动:agent={*}}
              ├ {AlterRelation|变关系} {AlterSpecific|实变:agent={*}}
              │ ├ {AlterPossession|变领属} {AlterRelation|变关系:agent={*},possession={*}}
              │ │ ├ {take|取} {AlterPossession|变领属:agent={*},possession={*},source={*}}
              │ │ │ ├ {buy|买} {take|取:agent={*}, possession={*}, source={*}, cost={*}, beneficiary={*}
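
The parent links in this excerpt can be walked mechanically. The sketch below stores only the nodes shown on the slide in a child-to-parent table and computes a hypernym chain and a nearest shared ancestor; `PARENT`, `hypernym_chain` and `common_ancestor` are hypothetical names, and the real event taxonomy is far larger than this fragment.

```python
# Child -> parent links, copied from the excerpt on the slide.
PARENT = {
    "static|静态": "event|事件",
    "relation|关系": "static|静态",
    "possession|领属关系": "relation|关系",
    "own|有": "possession|领属关系",
    "obtain|得到": "own|有",
    "act|行动": "event|事件",
    "ActGeneral|泛动": "act|行动",
    "ActSpecific|实动": "act|行动",
    "AlterSpecific|实变": "ActSpecific|实动",
    "AlterRelation|变关系": "AlterSpecific|实变",
    "AlterPossession|变领属": "AlterRelation|变关系",
    "take|取": "AlterPossession|变领属",
    "buy|买": "take|取",
}


def hypernym_chain(sememe):
    """Return the sememe followed by its ancestors up to the root."""
    chain = [sememe]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def common_ancestor(a, b):
    """Return the nearest sememe found on both hypernym chains, or None."""
    seen = set(hypernym_chain(a))
    for candidate in hypernym_chain(b):
        if candidate in seen:
            return candidate
    return None


print(" < ".join(hypernym_chain("buy|买")))
print(common_ancestor("buy|买", "obtain|得到"))
```

In this fragment `{buy|买}` and `{obtain|得到}` meet only at the root `{event|事件}`, since one is an action and the other a static relation.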

  23. 2. Typical actors of event roles ~ VerbNet │ ├ {buy|买} {take|取:agent={human|人}{group|群体->}, possession={artifact|人工物->}, source={human|人}{InstitutePlace|场所}, cost={money|货币}, beneficiary={human|人}{group|群体->}, domain={economy|经济}}
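
One way such typical-actor data could be used is to flag role fillers that look unusual. The following is a simplified sketch under stated assumptions: the `->` suffix from the slide is dropped because the slides do not define it, fillers are compared by exact match instead of through the entity taxonomy, and both sample frames are invented for illustration.

```python
# Typical fillers for the roles of {buy|买}, copied from the slide.
# The slide's "->" suffix is omitted; its exact meaning is not given there.
TYPICAL_ACTORS = {
    "agent": {"human|人", "group|群体"},
    "possession": {"artifact|人工物"},
    "source": {"human|人", "InstitutePlace|场所"},
    "cost": {"money|货币"},
    "beneficiary": {"human|人", "group|群体"},
    "domain": {"economy|经济"},
}


def atypical_roles(frame, typical=TYPICAL_ACTORS):
    """Return, sorted, the roles whose filler is not a listed typical actor."""
    return sorted(
        role
        for role, sememe in frame.items()
        if role in typical and sememe not in typical[role]
    )


# Hypothetical frames, not taken from HowNet data.
plausible = {"agent": "human|人", "possession": "artifact|人工物", "cost": "money|货币"}
odd = {"agent": "artifact|人工物", "possession": "artifact|人工物", "cost": "time|时间"}

print(atypical_roles(plausible))
print(atypical_roles(odd))
```

A fuller version would accept any hyponym of a listed sememe, which requires the entity taxonomy that this sketch leaves out.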

  24. Axiomatic Relations & Role Shifting - 1
      {buy|买} <----> {obtain|得到} [consequence]; agent OF {buy|买}=possessor OF {obtain|得到}; possession OF {buy|买}=possession OF {obtain|得到}.
      {buy|买} <----> {obtain|得到} [consequence]; beneficiary OF {buy|买}=possessor OF {obtain|得到}; possession OF {buy|买}=possession OF {obtain|得到}.
      {buy|买} <----> {obtain|得到} [consequence]; source OF {buy|买}=source OF {obtain|得到}; possession OF {buy|买}=possession OF {obtain|得到}.

  25. Axiomatic Relations & Role Shifting - 2
      {buy|买} [entailment] <----> {choose|选择}; agent OF {buy|买}=agent OF {choose|选择}; possession OF {buy|买}=content OF {choose|选择}; source OF {buy|买}=location OF {choose|选择}.
      {buy|买} [entailment] <----> {pay|付}; agent OF {buy|买}=agent OF {pay|付}; cost OF {buy|买}=possession OF {pay|付}; source OF {buy|买}=target OF {pay|付}.

  26. Axiomatic Relations & Role Shifting - 3 {buy|买} (X) <----> {sell|卖} (Y) [mutual implication]; agent OF {buy|买}=target OF {sell|卖}; source OF {buy|买}=agent OF {sell|卖}; possession OF {buy|买}=possession OF {sell|卖}; cost OF {buy|买}=cost OF {sell|卖}.
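
The role correspondences on these three slides amount to a renaming of roles from one event frame to another. A minimal sketch, assuming a frame is just a dict from role name to filler: `ROLE_SHIFT` and `shift_roles` are hypothetical names, the mappings are copied from the slides, and the fillers in the example are invented.

```python
# (event, related event) -> {role of event: role of related event},
# copied from the axiomatic relations on the slides.
ROLE_SHIFT = {
    ("buy|买", "sell|卖"): {
        "agent": "target",
        "source": "agent",
        "possession": "possession",
        "cost": "cost",
    },
    ("buy|买", "pay|付"): {
        "agent": "agent",
        "cost": "possession",
        "source": "target",
    },
    ("buy|买", "choose|选择"): {
        "agent": "agent",
        "possession": "content",
        "source": "location",
    },
}


def shift_roles(event, frame, related):
    """Re-express the fillers of `event` as roles of the `related` event."""
    mapping = ROLE_SHIFT[(event, related)]
    return {mapping[role]: filler for role, filler in frame.items() if role in mapping}


# Hypothetical fillers for "Mary bought a book from John for ten dollars".
buy = {"agent": "Mary", "source": "John", "possession": "book", "cost": "ten dollars"}

print(shift_roles("buy|买", buy, "sell|卖"))
print(shift_roles("buy|买", buy, "pay|付"))
```

Roles with no counterpart in the related event are dropped, which is why the book disappears from the `{pay|付}` frame while the cost becomes its possession.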

  27. Identical representation - 1
      W_E=smuggle G_E=V DEF={transport|运送:manner={guilty|有罪}}
      W_E=drug G_E=N DEF={addictive|嗜好物:modifier={guilty|有罪}}

  28. Identical representation - 2
      W_E=smuggling of drugs G_E=N DEF={fact|事情:CoEvent={transport|运送:manner={guilty|有罪},patient={addictive|嗜好物:modifier={guilty|有罪}}}}
      W_E=drug smuggler G_E=N DEF={community|团体:{transport|运送:agent={~},manner={unlawful|非法},patient={addictive|嗜好物},purpose={sell|卖}}}

  29. Types of relations

  30. Motivation to develop secondary resources • To check HowNet knowledge data from different angles for precision and consistency • To provide users with tools for application • Practicable for any sense of any word

  31. Secondary resources • Concept Relevance Calculator (CRC) • Concept Similarity Measure (CSM) • Query Expansion Tool (QET) • Chinese Morphological Processor (CMP) • Chinese Message Analyzer (CMA)

  32. Concept similarity • doctor 2 <> dentist 0.300000 • doctor 1 <> dentist 0.883333 • doctor 1 <> nurse 1 0.620000 • doctor 1 <> nurse 2 0.454545 • doctor 1 <> patient 0.203636 • walk <> run 0.144444 • walk <> jump 0.144444 • walk <> swim 0.130159 • walk <> fly 0.124444 • walk <> buy 0.018605

  33. Conclusion • Extralinguistic knowledge is indispensable for HLT • The knowledge should form a computer-oriented system • It should be large enough; a toy example is useless • It can carry out computation of meaning

  34. Thank you • Welcome to www.keenage.com! • Download and try Mini-HowNet
