
Content Indexing: An Example



Presentation Transcript


  1. Content Indexing: An Example
     Title: A Bibliographic Search by Computer
     Abstract: Updating plasma-physics data was a chance to experiment with information and programs of the Technical Information Project at MIT. The computer searched for indicative words in titles of papers that shared bibliographic references and those that referred to papers that have become classics in plasma-physics.
     The title suggests: Bibliographic, Search, Computer
     The 3 most frequent words of the abstract: Plasma-Physics, Information, Papers

  2. Inverse Document Frequency (idf)
     $N$: total number of documents in the collection
     $n_k$: number of documents that contain term $k$
     $\mathrm{idf}_k = \log\frac{N}{n_k} + 1$
     Weighting $w_{ik}$ with idf: $w_{ik} = \mathrm{tf}_{ik} \cdot \mathrm{idf}_k$
     $\mathrm{tf}_{ik}$: frequency of term $k$ in document $i$
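The weighting on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the idf term reads log(N / n_k) + 1 with the natural logarithm (the slide does not state the base); the function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def idf(n_docs: int, doc_freq: int) -> float:
    """idf_k = log(N / n_k) + 1; natural log is an assumption."""
    return math.log(n_docs / doc_freq) + 1.0


def tfidf_weights(docs):
    """Return one dict per document mapping term k to w_ik = tf_ik * idf_k."""
    n_docs = len(docs)
    # n_k: number of documents that contain term k (count each term once per document)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # tf_ik: frequency of term k in document i
        weights.append({t: tf[t] * idf(n_docs, doc_freq[t]) for t in tf})
    return weights


# Hypothetical toy collection of three already-tokenized documents.
docs = [
    ["plasma", "physics", "data"],
    ["plasma", "search", "search"],
    ["computer", "search"],
]
weights = tfidf_weights(docs)
```

A term that occurs in every document gets idf = 1, so its weight falls back to the plain term frequency; rarer terms are boosted.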

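  3. Term Frequency: Comparison with a Norm Text
     Relative frequency $p_k$ of term $k$ with respect to a norm text: $p_k = \frac{\text{frequency of term } k \text{ in the norm text}}{N_{\mathrm{Norm}}}$, where e.g. $N_{\mathrm{Norm}} = 10^6$
     Relative frequency of term $k$ in object $i$: $p_{ik} = \frac{\mathrm{tf}_{ik}}{N_i}$, where $N_i$ is the number of tokens in object $i$
     Compare the relative frequency of term $k$ in object $i$ with the relative frequency of term $k$ in a norm text.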
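A small sketch of this comparison follows. The exact comparison formula is not legible in the transcript, so the ratio p_ik / p_k used here is an assumption, as are the function names and the term counts; only N_i = 1515 and N_Norm = 10^6 come from the slides.

```python
def relative_frequency(term_count: int, total_tokens: int) -> float:
    """Relative frequency: occurrences of a term divided by the number of tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return term_count / total_tokens


def ratio_to_norm(p_ik: float, p_k: float) -> float:
    """Assumed comparison measure: how many times more frequent the term is
    in the object than in the norm text."""
    return p_ik / p_k


N_I = 1515          # tokens in object i (value from the Needham example)
N_NORM = 10**6      # tokens in the norm text (value from the slide)

# Hypothetical counts for one term: 12 occurrences in the object, 40 in the norm text.
p_ik = relative_frequency(12, N_I)
p_k = relative_frequency(40, N_NORM)
ratio = ratio_to_norm(p_ik, p_k)
```

A ratio well above 1 would mark the term as unusually frequent in the object, which is what makes it a candidate index term.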

  4. Comparison with a Norm Text: An Example
     Text: Needham, G.A.: "Advanced Integrated Circuits Packaging", SCP and Solid State Technology, June 1965.
     $N_i = 1515$

  5. Comparison with a Norm Text: An Example
     Text: Stiles, H.E.: "The Association Factor in Information Retrieval", JACM 8, 1961.
     $N_i = 3188$

  6. Stop List
     Contains about 250 common words. A typical stop list starts as follows:
     A ABOUT ACROSS AFTER AFTERWARDS AGAIN AGAINST ALL ALMOST ALONE ALONG ALREADY ALSO ALTHOUGH ALWAYS AMONG AMONGST AN AND ANOTHER ANY ANYHOW ANYONE ANYTHING ANYWHERE ARE AROUND AS AT BE BECAME BECAUSE BECOME BECOMES BECOMING ...
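To show what a stop list does in practice, the sketch below filters the abstract from slide 1 and counts the remaining words. The first set is the excerpt shown on this slide; the second set is a hypothetical extension, because the excerpt stops at BECOMING and does not cover words such as "the" or "that". The tokenizer regex is likewise an assumption.

```python
import re
from collections import Counter

# Excerpt from the slide (A ... BECOMING), lowercased.
SLIDE_STOP_WORDS = {
    "a", "about", "across", "after", "afterwards", "again", "against", "all",
    "almost", "alone", "along", "already", "also", "although", "always",
    "among", "amongst", "an", "and", "another", "any", "anyhow", "anyone",
    "anything", "anywhere", "are", "around", "as", "at", "be", "became",
    "because", "become", "becomes", "becoming",
}
# Hypothetical additions: words a full 250-word list would likely contain.
EXTRA_STOP_WORDS = {"the", "of", "to", "in", "that", "those", "have",
                    "with", "for", "was"}
STOP_WORDS = SLIDE_STOP_WORDS | EXTRA_STOP_WORDS

ABSTRACT = (
    "Updating plasma-physics data was a chance to experiment with information "
    "and programs of the Technical Information Project at MIT. The computer "
    "searched for indicative words in titles of papers that shared "
    "bibliographic references and those that referred to papers that have "
    "become classics in plasma-physics."
)


def content_words(text: str):
    """Lowercase, keep hyphenated words together, drop stop words."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


top3 = Counter(content_words(ABSTRACT)).most_common(3)
top3_words = {word for word, _ in top3}
```

With this stop list, the three most frequent remaining words are the ones given on slide 1 (plasma-physics, information, papers), each occurring twice. Without the filter, "that" would lead the ranking.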

  7. Word Reduction Algorithms
     • Dictionary-based: the result is, in general, a linguistically correct word stem. Example: the Lovins algorithm
     • Dictionary-independent: the result is a reduced word, i.e. often a pseudo-stem that is not linguistically correct. Example: the Porter algorithm

  8. Suffix List
     Excerpt from a typical suffix list:
     ABILITIES ABILITY ABLE ABLED ABLEDLY ABLENESS ABLER ABLES ABLING ABLINGFUL ABLINGLY ABLY ACEOUS ACEOUSLY ACEOUSNESS ACEOUSNESSES ACIDOUS ACIDOUSLY ACIES ACIOUSNESS ACIOUSNESSES ACITIES ACITY ACY AE AGE AGED AGER AGES AGING AGINGFUL AGINGLY AIC AICAL AICALLY AICALS AICISM AICISMS ...
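A suffix list like this one is typically used for longest-match suffix removal. The sketch below illustrates only that basic idea with part of the excerpt; it is not the Lovins or Porter algorithm, both of which add conditions and rewriting rules. The function name and the minimum stem length of 2 are assumptions.

```python
# Part of the suffix-list excerpt from the slide.
SUFFIXES = [
    "ABILITIES", "ABILITY", "ABLE", "ABLED", "ABLEDLY", "ABLENESS", "ABLER",
    "ABLES", "ABLING", "ABLY", "ACEOUS", "ACEOUSLY", "ACITIES", "ACITY",
    "ACY", "AE", "AGE", "AGED", "AGER", "AGES", "AGING", "AIC", "AICAL",
    "AICALLY",
]


def strip_suffix(word: str, suffixes=SUFFIXES, min_stem: int = 2) -> str:
    """Remove the longest matching suffix, keeping at least min_stem letters.

    Returns the (possibly unchanged) word in upper case.
    """
    upper = word.upper()
    # Try longer suffixes first so that ABILITY wins over shorter endings.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if upper.endswith(suffix) and len(upper) - len(suffix) >= min_stem:
            return upper[: -len(suffix)]
    return upper


stems = {w: strip_suffix(w) for w in ["readability", "packaging", "capacity", "age"]}
```

The result is often a pseudo-stem rather than a linguistically correct one: "capacity" is reduced to "CAP", for instance, while "age" is left alone because removing the suffix would leave no stem.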
