
Multilayer SOM With Tree-Structured Data for Efficient Document Retrieval and Plagiarism Detection

This paper presents a novel approach using Multilayer SOM for efficient document retrieval and plagiarism detection by incorporating tree-structured data. The method enhances retrieval accuracy by combining global and local characteristics, showing promising results. MLSOM serves as a practical computational solution, offering simplicity and effectiveness. However, the rate of failed plagiarism detection remains a drawback. Overall, this innovative application demonstrates the potential of MLSOM in text analysis.

bschnell

Presentation Transcript


  1. Multilayer SOM With Tree-Structured Data for Efficient Document Retrieval and Plagiarism Detection • Presenter: Cheng-Feng Weng • Authors: Tommy W. S. Chow, M. K. M. Rahman • 2009/10/12 • TNN.18 (2009)

  2. Outline • Motivation • Objective • Method • Experiments • Conclusion • Comments

  3. Motivation • Document Retrieval: • Term-Frequency Problem • Two documents containing similar term frequencies may differ contextually when the spatial distribution of their terms is very different (e.g. "Science ……. Computer ……. School …….." vs. "School of Computer Science ……..") • Plagiarism Detection: • Paraphrasing Problem (e.g. "SOM …project…….." vs. "SOM …be mapped into……..")
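The term-frequency problem on this slide can be illustrated with a small sketch. The two documents below are hypothetical (they are not from the paper): they use exactly the same words, so their document-level term counts are identical, yet the words are distributed differently across paragraphs.

```python
from collections import Counter

def global_tf(paragraphs):
    """Term counts over the whole document (word order and position ignored)."""
    return Counter(word for p in paragraphs for word in p.lower().split())

def local_tf(paragraphs):
    """Term counts kept separately for each paragraph."""
    return [Counter(p.lower().split()) for p in paragraphs]

# Hypothetical documents: same vocabulary, different spatial distribution.
doc_a = ["science news", "computer sales", "school holidays"]
doc_b = ["school computer science", "news sales holidays"]

same_globally = global_tf(doc_a) == global_tf(doc_b)
same_locally = local_tf(doc_a) == local_tf(doc_b)
print(same_globally, same_locally)
```

A purely global representation cannot tell these two documents apart, while a paragraph-level view can, which is the gap the tree-structured model is meant to address.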

  4. Objective • The paper proposes a tree-structured document model with MLSOM for document retrieval (DR) and plagiarism detection (PD). [Diagram labels: Document, Tree-Structured Model, MLSOM, Global View, Local View, DR, PD]

  5. Structured Representation of DF • A document is partitioned into pages that are further partitioned into paragraphs. • Example page: <HTML> <HEAD> </HEAD> <BODY> I am a web page<br> <p>First line</p> <p>Second line</p> The speechless third line </BODY> </HTML> [Figure: the example page and its paragraphs, labeled "Page" and "Paragraph"]
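As a rough illustration of the page-to-paragraph partitioning, the sketch below splits an English rendering of the slide's example page into paragraphs using Python's standard `html.parser`. The splitting rule (break at `<p>` and `<br>`) and the names `ParagraphSplitter` and `page_to_paragraphs` are assumptions made for this example, not the segmentation procedure used by the authors.

```python
from html.parser import HTMLParser

class ParagraphSplitter(HTMLParser):
    """Collect text chunks, closing the current chunk at each <p>, </p>, or <br>."""
    BREAK_TAGS = {"p", "br"}

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = []

    def flush(self):
        # Normalize whitespace and keep only non-empty chunks.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BREAK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in self.BREAK_TAGS:
            self.flush()

    def handle_data(self, data):
        self._buffer.append(data)

def page_to_paragraphs(html):
    """Return one page node whose children are its paragraph texts."""
    parser = ParagraphSplitter()
    parser.feed(html)
    parser.close()
    parser.flush()  # emit any trailing text after the last break tag
    return {"page": parser.paragraphs}

html = ("<HTML> <HEAD> </HEAD> <BODY> I am a web page<br> <p>First line</p> "
        "<p>Second line</p> The speechless third line </BODY> </HTML>")
tree = page_to_paragraphs(html)
print(tree)
```

A full document would be a list of such page nodes under a single document root, giving the three-level tree (document, pages, paragraphs) that the slide describes.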

  6. Structured Representation of DF (cont.)

  7. Multilayer SOM • MLSOM was developed for handling tree-structured data.

  8. Multilayer SOM (cont.) • Similarity:

  9. MLSOM Retrieval [Flow diagram: Document → extract to tree structure and project with PCA matrix → Trained MLSOM → Related Docs.]
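The retrieval flow can be sketched on a single SOM layer. This is a simplified illustration, not the full MLSOM: it assumes each document has already been reduced to a feature vector (for instance after the PCA projection), maps every stored document to its best-matching node, and returns the documents that share a node with the query. The weights, vectors, and function names are hypothetical.

```python
import math
from collections import defaultdict

def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (Euclidean distance)."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

def build_index(weights, doc_vectors):
    """Map each node index to the IDs of the documents that land on it."""
    index = defaultdict(list)
    for doc_id, vec in doc_vectors.items():
        index[best_matching_unit(weights, vec)].append(doc_id)
    return index

def retrieve(weights, index, query_vec):
    """Return the documents stored on the query's best-matching node."""
    return index.get(best_matching_unit(weights, query_vec), [])

# Hypothetical 2-D node weights and already-projected document features.
weights = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
docs = {"D1": (0.1, 0.0), "D2": (0.9, 1.1), "D3": (0.8, 0.9)}

index = build_index(weights, docs)
related = retrieve(weights, index, (1.0, 0.9))
print(related)
```

In practice a retrieval system would likely also look at neighboring nodes and rank the candidates by a similarity score rather than returning a single node's contents.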

  10. Plagiarism Detection • Plagiarism Detection using Local Association (PDLA) [Figure: Layer 3 SOM nodes, each associated with a list of related docs: D1, D2, …; D3, D4, ….; D2, D6, …; …]
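One possible reading of the local-association figure is that each paragraph-level (Layer 3) node keeps a list of the documents whose paragraphs were mapped to it, and that a suspicious document's paragraphs are looked up node by node. The sketch below follows that reading; the voting rule, the node-to-document table, and the name `rank_candidates` are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter

def rank_candidates(node_to_docs, paragraph_nodes):
    """Count how often each stored document shares a node with the query's paragraphs."""
    votes = Counter()
    for node in paragraph_nodes:
        votes.update(node_to_docs.get(node, []))
    return votes.most_common()

# Hypothetical association table, loosely modeled on the lists in the figure.
node_to_docs = {0: ["D1", "D2"], 1: ["D3", "D4"], 2: ["D2", "D6"]}

# Assume the query's two paragraphs were mapped to nodes 0 and 2.
ranking = rank_candidates(node_to_docs, [0, 2])
print(ranking)
```

Documents that collect many paragraph-level matches would then be candidates for a closer comparison against the query.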

  11. Experiments • Document Retrieval:

  12. Experiments (cont.) • Plagiarism Detection:

  13. Conclusions • A new approach to DR and PD using tree-structured document representation and MLSOM is proposed. • The tree-structured representation is shown to enhance retrieval accuracy by incorporating local characteristics alongside traditional global characteristics. • Computational Issue: • The MLSOM serves as an efficient computational solution for practical implementation.

  14. Comments • Advantage • Practical; simple but efficient and effective • Drawback • The rate of failed plagiarism detection is still high • Application • …
