
Not just numbers on shelves: using the DDC for information retrieval

Gordon Dunsire. Presented at the Symposium “Bridging the class(ification) divide: the new DDC languages and retrieval possibilities”, 27 April 2010, Bibliotheca Alexandrina, Alexandria, Egypt.



Presentation Transcript


  1. Not just numbers on shelves: using the DDC for information retrieval • Gordon Dunsire • Presented at the Symposium “Bridging the class(ification) divide: the new DDC languages and retrieval possibilities”, 27 April 2010, Bibliotheca Alexandrina, Alexandria, Egypt

  2. Overview • “Traditional” uses of the DDC • Machine-readability opens up possibilities for subject-based information retrieval • Hierarchical and linear browse • Keyword search • Terminology services (hub-spoke) • Multilingual retrieval • Semantic web • EDUG IT survey

  3. Traditional use of the DDC • Shelfmarking • Shelf location in a linear sequence • Notation can be fitted to a (book) spine • Subject grouping • Notation brings similar topics together and keeps separate topics apart • Collection analysis by subject or discipline • Management information by subject • Loans, acquisitions, etc.

  4. Digital environment • Notation <> Captions • Notation in catalogue record can be (automatically) matched to human-friendly caption(s) • Opposite of classification process, where caption is matched to notation • Sometimes via Relative Index • Length of caption not a limiting factor • Length of notation also not limiting • No need to truncate notation • Notation/caption changes (legacy) more easily managed
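The two-way matching between notation and caption on slide 4 can be sketched as a pair of lookup tables. This is a minimal illustration, not a description of any catalogue system: the function names are hypothetical, and the three notation/caption pairs are hand-typed sample data that has not been checked against the DDC schedules.

```python
# Illustrative sample data only: a hand-typed handful of notation/caption
# pairs, not copied from or checked against the DDC schedules.
CAPTIONS = {
    "020": "Library and information sciences",
    "025": "Operations of libraries, archives, information centers",
    "025.26": "Acquisition through exchange, gift, deposit",
}

# Reverse table for the classification direction (caption -> notation).
NOTATIONS = {caption.casefold(): notation for notation, caption in CAPTIONS.items()}


def caption_for(notation):
    """Return the caption for a notation found in a catalogue record, or None."""
    return CAPTIONS.get(notation.strip())


def notation_for(caption):
    """Return the notation for a caption (case-insensitive), or None."""
    return NOTATIONS.get(caption.strip().casefold())
```

Because the full notation is the lookup key, nothing here requires the notation or the caption to be truncated, which is the point the slide makes about length no longer being a limiting factor.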

  5. Information retrieval • Notation hierarchy can be used to display caption hierarchy • Built notation (i.e. added subdivisions) can be parsed to identify facet captions • E.g. Place, time • Keywords can be found inside captions • Notation can be linked to caption variants • Translations of the DDC • “Captions” or subject headings outside of the schedules
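As a rough sketch of how a notation hierarchy can drive a caption hierarchy, the code below derives the chain of broader notations by dropping one digit at a time and then prints the matching captions with indentation. It assumes the hierarchy follows the notation digit by digit, which the real schedules do not always do. The function names and the sample captions are illustrative assumptions.

```python
# Illustrative sample captions, hand-typed and not checked against the schedules.
CAPTIONS = {
    "300": "Social sciences",
    "320": "Political science",
    "321": "Systems of governments and states",
}


def broader(notation):
    """Return the next broader notation, or None for a top-level class."""
    digits = notation.replace(".", "")
    if len(digits) > 3:
        # Past the decimal point: drop the last digit.
        digits = digits[:-1]
    else:
        # Within the first three digits: drop the last significant digit
        # and pad back to three digits with zeros.
        significant = digits.rstrip("0")
        if len(significant) <= 1:
            return None
        digits = significant[:-1].ljust(3, "0")
    return digits[:3] + ("." + digits[3:] if len(digits) > 3 else "")


def hierarchy(notation):
    """Return the chain of notations from broadest to the given notation."""
    chain = [notation]
    while (parent := broader(chain[0])) is not None:
        chain.insert(0, parent)
    return chain


def display(notation):
    """Render the caption hierarchy as indented lines of text."""
    lines = []
    for depth, step in enumerate(hierarchy(notation)):
        caption = CAPTIONS.get(step, "(caption not in sample)")
        lines.append("  " * depth + f"{step} {caption}")
    return "\n".join(lines)
```

Calling `display("321")` yields three indented lines running from the main class down to the specific topic. Parsing built notation into facets such as place or time is not attempted here.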

  6. Linear browse • Captions listed in alphabetical order • With or without Relative Index • Already in alphabetical order • Possibility of keyword-in-context (KWIC) or keyword-out-of-context (KWOC) indexes • Each significant word in caption rotated to the front (or extracted) and interfiled in alphabetical order • Possibility of integration with subject headings • Or substitute for headings
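The KWIC and KWOC indexes mentioned on slide 6 can be sketched in a few lines. This is one simple reading of the technique: the stopword list, the `/` marker for the rotation point, and the function names are assumptions made for the example, and the captions are illustrative sample data.

```python
# Words treated as non-significant; an assumed, deliberately tiny list.
STOPWORDS = {"and", "of", "the", "through"}


def _words(caption):
    """Split a caption into words, dropping commas."""
    return caption.replace(",", "").split()


def kwic(captions):
    """Keyword-in-context: rotate each significant word to the front.

    Returns (rotated caption, notation) pairs in alphabetical order.
    """
    entries = []
    for notation, caption in captions.items():
        words = _words(caption)
        for i, word in enumerate(words):
            if word.casefold() in STOPWORDS:
                continue
            if i == 0:
                rotated = " ".join(words)
            else:
                # "/" marks where the original caption started.
                rotated = " ".join(words[i:] + ["/"] + words[:i])
            entries.append((rotated, notation))
    return sorted(entries, key=lambda entry: entry[0].casefold())


def kwoc(captions):
    """Keyword-out-of-context: extract each significant word as a heading.

    Returns (keyword, full caption, notation) triples in alphabetical order.
    """
    entries = []
    for notation, caption in captions.items():
        for word in _words(caption):
            if word.casefold() not in STOPWORDS:
                entries.append((word.casefold(), caption, notation))
    return sorted(entries)
```

Either list could be interfiled with subject headings by adding the headings to the input before sorting.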

  7. Hierarchical browse • Captions and/or notations exposed at one “level” only • Controlled by numeric notation • First digit = level 1; First 2 digits = level 2, etc. • Decimal notation so maximum of 10 topics at each level • User drills-down in hierarchical order from the top (broadest topic) • Or drills-up from specific to general • Levels can be expressed as tag clouds • Topics weighted by notation (3xx, 32x, 321 ...)
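A drill-down browse controlled by the numeric notation might look like the sketch below. It follows the slide's rule that the first digit is level 1, the first two digits level 2, and so on, and it counts catalogue records under each topic so the counts could serve as tag-cloud weights. The function name and the list of record notations are hypothetical.

```python
from collections import Counter


def drill_down(record_notations, prefix=""):
    """Return {topic prefix one level below `prefix`: number of records}.

    Each step adds exactly one digit, so a level never exposes more than
    ten topics.
    """
    depth = len(prefix) + 1
    counts = Counter()
    for notation in record_notations:
        digits = notation.replace(".", "")
        if digits.startswith(prefix) and len(digits) >= depth:
            counts[digits[:depth]] += 1
    return dict(sorted(counts.items()))


# Hypothetical notations taken from a handful of catalogue records.
RECORDS = ["321", "320.5", "324", "330", "025.26"]
```

Starting with `drill_down(RECORDS)` shows the top level; passing one of the returned keys back in as `prefix` drills down a level, and shortening the prefix by one digit drills back up.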

  8. Keyword retrieval • Captions included in: • DDC keyword index • Subject keyword index • E.g. With subject headings • General keyword index • E.g. With titles, notes, etc. • DDC caption terminology distinct from other major subject heading schemes • Alternative terms (and spellings) • DDC caption: “Acquisition through exchange, gift, deposit” • LCSH: “Book donations” [neither term in Relative Index]
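To illustrate why adding captions to a keyword index helps, the sketch below builds a small inverted index over chosen record fields. The record is made up for the example (the title is hypothetical); the DDC caption and LCSH heading are the pair quoted on the slide. Field names and the function name are assumptions.

```python
import re
from collections import defaultdict

# One made-up catalogue record; the title is hypothetical.
RECORDS = [
    {
        "id": "rec1",
        "title": "Managing library collections",
        "lcsh": "Book donations",
        "ddc_caption": "Acquisition through exchange, gift, deposit",
    }
]


def build_index(records, fields):
    """Map each lower-cased word in the chosen fields to a set of record ids."""
    index = defaultdict(set)
    for record in records:
        for field in fields:
            for word in re.findall(r"\w+", record.get(field, "").casefold()):
                index[word].add(record["id"])
    return dict(index)


# A general keyword index without and with the DDC caption included.
without_ddc = build_index(RECORDS, ("title", "lcsh"))
with_ddc = build_index(RECORDS, ("title", "lcsh", "ddc_caption"))
```

A search for "gift" finds nothing in the first index and finds the record in the second, because the caption contributes terms that the heading does not.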

  9. Terminology services (1) • Captions, headings, terms from any scheme can be “classified” by DDC • i.e. Assigned a DDC notation • Notation becomes a bridge or link between headings from different schemes • Hub-and-spoke, with DDC as the hub and each different scheme as a spoke • More efficient than one-one mappings between headings • Combinatorial explosion • 3 schemes > 3 mappings • 4 schemes > 6 mappings ...
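The mapping counts on slide 9 follow from pairing every scheme with every other scheme, which gives n(n-1)/2 mappings for n schemes. The sketch below enumerates both approaches so the counts can be compared. It assumes the DDC hub sits outside the list of schemes being linked; apart from LCSH, the scheme names are placeholders.

```python
from itertools import combinations


def pairwise_mappings(schemes):
    """One mapping for every unordered pair of schemes."""
    return list(combinations(schemes, 2))


def hub_mappings(schemes, hub="DDC"):
    """One mapping from each scheme to the hub."""
    return [(scheme, hub) for scheme in schemes if scheme != hub]


# Placeholder scheme names, apart from LCSH.
SCHEMES = ["LCSH", "Scheme B", "Scheme C", "Scheme D", "Scheme E"]

for n in range(3, len(SCHEMES) + 1):
    subset = SCHEMES[:n]
    print(n, "schemes:", len(pairwise_mappings(subset)), "pairwise,",
          len(hub_mappings(subset)), "via hub")
```

Under this assumption the two approaches need the same number of mappings at three schemes, and the hub pulls ahead from four schemes onward because its cost grows linearly while the pairwise cost grows quadratically.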

  10. Terminology services (2) • Hub (i.e. DDC notation) is transparent to user • Term A > DDC notation < Term B • Term A <> Term B • Approach used by High-Level Thesaurus (HILT) project • Successful, but scalability an issue • Even though more efficient than Term-Term approach • Scalability might be more achievable in a distributed environment • i.e. Semantic Web
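The "Term A > DDC notation < Term B" switch can be sketched as two lookups with the notation hidden in the middle. This is only an illustration of the hub-and-spoke idea, not the HILT implementation. The spoke tables are hand-typed, and the notation assigned to both terms is an assumption made for the example.

```python
# Each spoke maps a scheme's terms to a DDC notation (the hub).
# Illustrative data: the notation shown for both terms is assumed.
SPOKES = {
    "LCSH": {"Book donations": "025.26"},
    "DDC captions": {"Acquisition through exchange, gift, deposit": "025.26"},
}


def switch(term, source, target):
    """Return the target-scheme terms sharing the source term's notation.

    The caller never sees the notation itself, so the hub stays
    transparent to the user.
    """
    notation = SPOKES[source].get(term)
    if notation is None:
        return []
    return sorted(t for t, n in SPOKES[target].items() if n == notation)
```

Adding a further scheme means adding one more entry to `SPOKES`; no existing spoke has to change.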

  11. Translations • Caption to caption translation • English caption <> Arabic caption • But notation is common, and language-free • Non-English translation is similar to non-DDC topic/subject heading scheme • Intrinsic hub-spoke architecture • Arabic caption <> English caption (= notation) <> German caption • Arabic caption <> German caption • Translations can be automatically switched • “Instance” notation remains the same
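Because the notation is language-free, switching caption language reduces to a lookup keyed on notation and language code. The sketch below shows that idea under loud assumptions: the German and Arabic strings are unofficial renderings typed for the example, not text taken from the published DDC translations, and the function names are hypothetical.

```python
# Notation -> {language code: caption}.
# The non-English captions are unofficial, illustrative renderings only.
CAPTIONS = {
    "300": {
        "en": "Social sciences",
        "de": "Sozialwissenschaften",
        "ar": "العلوم الاجتماعية",
    },
    "320": {
        "en": "Political science",
        "de": "Politikwissenschaft",
    },
}


def caption(notation, lang, fallback="en"):
    """Return the caption in `lang`, falling back to `fallback` if missing."""
    by_lang = CAPTIONS.get(notation, {})
    return by_lang.get(lang, by_lang.get(fallback))


def translate(text, source_lang, target_lang):
    """Switch a caption between languages by way of its notation."""
    for notation, by_lang in CAPTIONS.items():
        if by_lang.get(source_lang) == text:
            return caption(notation, target_lang)
    return None
```

The notation stored in a catalogue record never changes; only the caption chosen for display does.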

  12. DDC and the Semantic Web • OCLC is developing a representation of the DDC in resource description framework (RDF) • The basis of the semantic web • http://dewey.info • Includes notations, captions, notes, and legacy (audited changes) • Only DDC Summaries available so far • 11 languages including English • Can be added to the linked-data “soup” • Distributed processing, development and services
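As a loose illustration of what a class might look like as RDF, the sketch below writes N-Triples lines by hand using three SKOS properties (`skos:notation`, `skos:prefLabel`, `skos:broader`). It assumes a SKOS-style modelling and uses a placeholder base URI; it is not the dewey.info data model or URI pattern, and the function name is hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/ddc/"  # placeholder base URI, not dewey.info's pattern


def class_triples(notation, labels, broader=None):
    """Return N-Triples lines describing one class.

    `labels` maps language codes to captions. Captions are assumed to
    contain no double quotes or backslashes, so no escaping is done.
    """
    subject = f"<{BASE}{notation}>"
    lines = [f'{subject} <{SKOS}notation> "{notation}" .']
    for lang, text in sorted(labels.items()):
        lines.append(f'{subject} <{SKOS}prefLabel> "{text}"@{lang} .')
    if broader is not None:
        lines.append(f"{subject} <{SKOS}broader> <{BASE}{broader}> .")
    return lines


# Illustrative captions; the German caption is an unofficial rendering.
for line in class_triples(
    "320", {"en": "Political science", "de": "Politikwissenschaft"}, broader="300"
):
    print(line)
```

Statements in this form can be merged with triples published elsewhere, which is what makes the distributed processing mentioned on the slide possible.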

  13. Survey results to 20 Apr 2010

  14. Thank you • G.dunsire@strath.ac.uk • EDUG IT (links to applications) • http://www.slainte.org.uk/edugit/ • Dewey.info (DDC in RDF) • http://dewey.info/
