
Semantic Smoothing for Text Clustering




Presentation Transcript


  1. Semantic Smoothing for Text Clustering Presenter: Bei-Yi Jiang Authors: Jamal A. Nasir, Iraklis Varlamis, Asim Karim, George Tsatsaronis 2013, Knowledge-Based Systems

  2. Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

  3. Motivation • The Vector Space Model (VSM) assumes independence between the vocabulary terms and ignores all the conceptual relations that potentially exist between them.

  4. Objectives • To increase the importance of core words by considering the terms’ relations and, in parallel, to downsize the contribution of general terms, leading to better text clustering results.

  5. Methodology • 1. Document representations: 1.1 The Vector Space Model (VSM), 1.2 The Generalized Vector Space Model (GVSM) • 2. Relatedness measures: 2.1 Omiotis, 2.2 Wikipedia-based relatedness, 2.3 Average of Omiotis and Wikipedia-based relatedness, 2.4 Pointwise mutual information • 3. Document clustering: 3.1 Clustering algorithms, 3.2 Algorithms complexity, 3.3 Clustering criterion functions • 4. S-VSM: a GVSM-based semantic kernel • 5. Top-k S-VSM

  6. Methodology • The Vector Space Model(VSM)
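The VSM slide can be illustrated with a minimal Python sketch: documents become bags of tf-idf-weighted terms and are compared with cosine similarity. The function names and the toy weighting scheme below are illustrative, not the paper's exact formulation.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse tf-idf vectors (dicts) for tokenized documents."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))       # document frequency
    idf = {t: math.log(n / df[t]) for t in df}
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({t: tf[t] * idf[t] for t in tf})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

Note that documents sharing no surface terms get similarity exactly 0 here, which is precisely the independence assumption the paper criticizes.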

  7. Methodology • The Generalized Vector Space Model(GVSM)
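In the GVSM, terms are represented in document space, so term-term correlations derived from the term-document matrix enter the document similarity. A minimal sketch under that standard formulation (the matrix layout is an assumption, not taken from the slides):

```python
import numpy as np

def gvsm_similarity(A, i, j):
    """GVSM similarity between documents i and j.
    A: term-by-document matrix; each term is represented by its row,
    so G = A @ A.T holds term-term correlations and
    sim(di, dj) = di^T G dj, cosine-normalized."""
    G = A @ A.T                              # term-term correlation matrix
    di, dj = A[:, i], A[:, j]
    num = di @ G @ dj
    den = np.sqrt(di @ G @ di) * np.sqrt(dj @ G @ dj)
    return num / den if den else 0.0
```

Unlike plain VSM, two documents with no shared terms can still score above zero when their terms co-occur elsewhere in the collection.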

  8. Methodology • Omiotis • Wikipedia-based relatedness • Average of Omiotis and Wikipedia-based relatedness

  9. Methodology • Pointwise mutual information
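Pointwise mutual information, the corpus-based relatedness measure on this slide, can be estimated from document co-occurrence counts. A minimal sketch (the estimation from document-level co-occurrence is an assumption; the paper may use a different corpus granularity):

```python
import math

def pmi(docs, t1, t2):
    """PMI(t1, t2) = log( p(t1, t2) / (p(t1) * p(t2)) ),
    with probabilities estimated from document occurrence counts.
    Returns 0.0 when the terms never co-occur."""
    n = len(docs)
    c1 = sum(1 for d in docs if t1 in d)
    c2 = sum(1 for d in docs if t2 in d)
    c12 = sum(1 for d in docs if t1 in d and t2 in d)
    if not (c1 and c2 and c12):
        return 0.0
    return math.log((c12 / n) / ((c1 / n) * (c2 / n)))
```

Positive PMI means the terms co-occur more often than chance predicts, which is the signal used to relate terms.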

  10.–13. Methodology (these slides contain equations and figures not captured in the transcript)
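The S-VSM kernel outlined in the methodology can be sketched as a relatedness-weighted generalization of cosine similarity: every term pair contributes, weighted by a semantic relatedness function such as Omiotis or the Wikipedia-based measure. This is a minimal sketch under that reading; `rel` is a stand-in for whichever relatedness measure is plugged in.

```python
import math

def svsm_similarity(w1, w2, rel):
    """Semantic smoothing (S-VSM) similarity sketch.
    w1, w2: dicts mapping term -> tf-idf weight.
    rel(ti, tj): semantic relatedness in [0, 1], with rel(t, t) = 1.
    Related-but-different terms are no longer orthogonal."""
    def dot(u, v):
        return sum(wu * wv * rel(tu, tv)
                   for tu, wu in u.items()
                   for tv, wv in v.items())
    den = math.sqrt(dot(w1, w1)) * math.sqrt(dot(w2, w2))
    return dot(w1, w2) / den if den else 0.0
```

The top-k variant mentioned on slide 5 would restrict `rel` to each term's k most related terms, cutting the quadratic term-pair cost.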

  14. Experiments

  15. Experiments • Vector similarity • Evaluation measures

  16. Experiments • Evaluation measures • Purity • Entropy • Error rate
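Two of the evaluation measures listed on this slide, purity and entropy, can be sketched directly from cluster assignments and true class labels. Error rate additionally requires an optimal cluster-to-class mapping and is omitted here; the function names are illustrative.

```python
import math
from collections import Counter

def purity(clusters, labels):
    """Fraction of documents covered by each cluster's majority class.
    clusters, labels: parallel lists of cluster id and true class."""
    n = len(labels)
    correct = 0
    for c in set(clusters):
        members = [labels[i] for i in range(n) if clusters[i] == c]
        correct += Counter(members).most_common(1)[0][1]
    return correct / n

def entropy(clusters, labels):
    """Size-weighted average class entropy of the clusters (lower is better)."""
    n = len(labels)
    total = 0.0
    for c in set(clusters):
        members = [labels[i] for i in range(n) if clusters[i] == c]
        m = len(members)
        h = -sum((k / m) * math.log2(k / m)
                 for k in Counter(members).values())
        total += (m / n) * h
    return total
```

A perfect clustering has purity 1 and entropy 0; lumping everything into one cluster drives entropy toward the class entropy of the whole collection.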

  17.–23. Experiments (these slides contain result tables and figures not captured in the transcript)

  24. Conclusions • The evaluation results demonstrate that S-VSM outperforms VSM in most of the combinations and compares favorably to GVSM. • To further reduce the complexity of S-VSM, we introduced an extension of it, the top-k S-VSM.

  25. Comments • Advantages • It offers a very flexible kernel that can be applied within any domain or with any language. • S-VSM performs much better than VSM in the task of text clustering. • It is very efficient in terms of time and space complexity. • Applications • Text clustering • Semantic smoothing kernels
