
Extracting Semantic Relationships between Wikipedia Categories

By Sergey Chernov, Tereza Iofciu, Wolfgang Nejdl, Xuan Zhou, Michal Kopycki, Przemyslaw Rys


Presentation Transcript


  1. Extracting Semantic Relationships between Wikipedia Categories
     By Sergey Chernov, Tereza Iofciu, Wolfgang Nejdl, Xuan Zhou, Michal Kopycki, Przemyslaw Rys

  2. MOTIVATION: Preliminaries
     • WIKIPEDIA: largest knowledge sharing system
     • Many pages assigned to CATEGORIES
     • All links are NAVIGATIONAL
     • Can we extract SEMANTIC links?

  3. MOTIVATION: Wikipedia Categories Example

  4. MOTIVATION: Possible benefits
     • Semi-structured queries
       • “find Countries which had Democratic Non-Violent Revolutions” rephrased as
       • “find a page from category Countries which is connected to some page in Non-Violent Revolutions”
     • Hints for authors
       • “you are editing a page from category Countries, do you want to add a link to a page in category Capitals?”
     • Raw data for manual semantic markup
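The rephrased query on this slide can be read as a filter over page-to-page links. The following is a minimal sketch under that reading, not the authors' implementation: the function name, the link representation (a set of source/target pairs) and the placeholder page names are all hypothetical.

```python
def pages_connected_to_category(links, cat_a, cat_b):
    """Return pages of cat_a that link to at least one page of cat_b.

    links: iterable of (source_page, target_page) pairs (assumed format).
    cat_a, cat_b: sets of page titles assigned to each category.
    """
    return sorted({src for src, dst in links if src in cat_a and dst in cat_b})


# Placeholder data, invented purely for illustration.
countries = {"Country X", "Country Y"}
revolutions = {"Revolution R"}
links = [
    ("Country X", "Revolution R"),   # a Countries page linking into the target category
    ("Country Y", "Some Other Page"),  # a link that leaves both categories
]

result = pages_connected_to_category(links, countries, revolutions)
print(result)
```

Only the link structure is used here; whether a given link is semantic rather than purely navigational is exactly the question the presentation raises.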

  5. Experiments: Heuristics
     • Number of links
       • NL = 3
     • Connectivity Ratio
       • CR = 3/4 = 0.75

     Countries   Capitals
     Germany     Berlin
     Austria     Vienna
     Denmark     Stockholm
     France      Paris
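The two heuristics on this slide can be sketched in a few lines. This is a minimal sketch under stated assumptions: NL is taken to count links from pages in one category to pages in the other, and CR is taken to be NL divided by the number of pages in the source category, which is consistent with CR = 3/4 = 0.75 for four countries. The slide gives only the count of links, so the three specific links below are an assumption, and the function names are hypothetical.

```python
def number_of_links(links, cat_a, cat_b):
    """NL: count links going from a page in cat_a to a page in cat_b."""
    return sum(1 for src, dst in links if src in cat_a and dst in cat_b)


def connectivity_ratio(links, cat_a, cat_b):
    """CR: NL normalized by the size of cat_a (assumed definition)."""
    if not cat_a:
        return 0.0
    return number_of_links(links, cat_a, cat_b) / len(cat_a)


countries = {"Germany", "Austria", "Denmark", "France"}
capitals = {"Berlin", "Vienna", "Stockholm", "Paris"}

# Assumed links: the slide states NL = 3 but does not list the links.
links = {
    ("Germany", "Berlin"),
    ("Austria", "Vienna"),
    ("France", "Paris"),
}

nl = number_of_links(links, countries, capitals)
cr = connectivity_ratio(links, countries, capitals)
print(nl, cr)  # 3 0.75
```

Normalizing by category size keeps large categories from dominating a ranking simply because they contain more pages, which is one plausible reason to prefer CR over the raw count.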

  6. Experiments: Dataset
     • INEX 2006 collection
     • Sample category rankings

  7. Manual assessment methodology
     • Semantic Connection Strength (SCS) Measure:
       • 2 = strong semantic relationship,
       • 1 = average semantic relationship,
       • 0 = weak or no semantic relationship.
     • Instructions for Assessors
       • “category A is strongly related to category B (value 2) if you believe that every page in A should conceptually have at least one semantic link to B;”
       • “A and B are averagely related (value 1), if you believe 50% of pages in A should have semantic links to B;”
       • “otherwise, A and B are weakly related (value 0).”
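Averaging the assessor labels over a set of extracted category pairs can be sketched as below. This is an illustrative sketch only: the function name and the dictionary layout are hypothetical, and the three labels are invented for the example, not results from the study.

```python
VALID_SCS = {0, 1, 2}  # the three values defined by the SCS measure


def average_scs(assessments):
    """Mean SCS over assessed category pairs.

    assessments: dict mapping (category_a, category_b) -> SCS value (assumed format).
    """
    values = list(assessments.values())
    for value in values:
        if value not in VALID_SCS:
            raise ValueError(f"SCS must be 0, 1 or 2, got {value!r}")
    if not values:
        return 0.0
    return sum(values) / len(values)


# Invented labels, for illustration only.
assessments = {
    ("Countries", "Capitals"): 2,
    ("Countries", "Category P"): 1,
    ("Countries", "Category Q"): 0,
}

avg = average_scs(assessments)
print(avg)  # 1.0
```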

  8. Experiments: Experiments with Number of Links
     Average semantic connection strength for 100 sample categories, extracted using Number of Links.

  9. Experiments: Experiments with Connectivity Ratio
     Average semantic connection strength for 100 sample categories, extracted using Connectivity Ratio.

  10. Summary: General Results and Conclusions
     • Results are skewed toward the Countries category
     • Connectivity Ratio is a better measure than Number of Links
     • Inlinks were observed to perform better than outlinks

  11. Summary: Future Steps
     • More manual exploration, look for additional heuristics
     • Consider more categories
     • SCS composed of
       • Is this a “part of” relation? W1
       • Is this an “is a” relation? W2
       • Is this a “synonym” relation? W3
       • Is this an “antonym” relation? W4
       • Is it related in a different way? Which one? W5

  12. Thank You!
