1 / 20

Computing semantic relatedness using Wikipedia features

Computing semantic relatedness using Wikipedia features. Presenter : YAN-SHOU SIE Authors Mohamed Ali Hadj Taieb * , Mohamed Ben Aouicha , Abdelmajid Ben Hamadou 2013. KBS. Outlines. Motivation Objectives Methodology Experiments Conclusions Comments. Motivation.

azuka
Download Presentation

Computing semantic relatedness using Wikipedia features

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Computing semantic relatedness using Wikipedia features Presenter : YAN-SHOU SIE Authors Mohamed Ali HadjTaieb*, Mohamed Ben Aouicha, AbdelmajidBen Hamadou2013. KBS

  2. Outlines • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

  3. Motivation • Measuring semantic relatedness is a critical task in many domains such as psychology, biology, linguis- tics, cognitive science and artificialintelligence.

  4. Objectives • We propose a novel system for computing semantic relatedness between words. Recent approaches have exploited Wikipedia as a huge semantic resource that showed good performances.

  5. Methodology • Our semantic relatedness computing system • Filtering Wikipedia category graph • pre-processing • Filtering article content • Porter stemming • Weighting article stems • Providing a Category Semantic Depiction (CSD)

  6. Methodology • Different steps performed to generate the Category Semantic DepictionFilteringWikipedia category graph

  7. Methodology • Filtering Wikipedia category graph • First : clean meta-categories • We remove all those nodes whose labels contain any of the following strings : Wikipedia, wikiproject, lists, mediawiki,template, user, portal, categories, articles, pages, stub and album • Second : remove orphan nodes and we keep only the category Contents as root • maximum depth 291 to 221

  8. Methodology • pre-processing • Filtering article content • Remove html tags,infobox, language translation, hyperlinks. . . • Porter stemming • filtereda stop list to eliminate words which do not have any contribution. • Weighting article stems • Providing a Category Semantic Depiction (CSD)

  9. Methodology- • Semantic relatedness computing system architecture • Extraction categories algorithm • WordNet: • resolve the disambiguation pages problem: • Setp1 : extracting all outLinks • Setp2 : find links containing disambiguation tag in parenthesis • Setp3 : extract categories to the two first links • Final : take the categories of the article assigned to the first link existing in the ordered set

  10. Methodology • Semantic relatedness computing system architecture • Semantic relatedness computing

  11. Methodology • Evaluating semantic relatedness measures • Comparison with human judgments • Pearson product-moment correlation coefficient • Spearman rank order correlation coefficient • Datasets

  12. Experiments • Our semantic relatedness computing system modules using Wikipedia features • Basic system • First module • Second module • Third module • Forth module

  13. Experiments • Basic system

  14. Experiments • First module: simple patterns

  15. Experiments • Second module: Wikipedia pages

  16. Experiments • Third module: enrichment using categories neighbors in WCG

  17. Experiments • Forth module: Categories enrichment using WCG and redirects

  18. Experiments • Application of the SR measure on other datasets • Datasets RG-65 and MC-30 • The verbal dataset YP-130 • Solving word choice problems

  19. Conclusions • Our result system shows a good performance and outperforms sometimes ESA (Explicit Semantic Analysis) and TSA (Temporal Semantic Analysis) approaches

  20. Comments • Advantages Able to use wiki to get a lot of semantic relationship information, semantic relations for many measurements related work of great help. • Applications • cognitive science • artificial intelligence

More Related