Controlled Vocabularies in Searching

Tamas Doszkocs, Ph.D., Computer Scientist, doszkocs@nlm.nih


Controlled Vocabularies in Searching. Tamas Doszkocs, Ph.D., Computer Scientist. Topics: Definition; Purpose and Role; A Brief History; Who is in Control?; Spell Checkers; Folksonomies; Tagging; Search Focus; Search refinement; Web X.Y; Controlled Vocabularies.


Controlled Vocabularies in Searching

Tamas Doszkocs, Ph.D., Computer Scientist

Controlled Vocabularies

Definition

Purpose and Role

A Brief History

Who is in Control?

Spell Checkers

Folksonomies

Tagging

Search Focus

Search refinement

Web X.Y

Related Topics (that we won’t talk about)

Definition and Purpose

  • A controlled vocabulary is a list of terms that have been enumerated explicitly.

  • In library and information science, a controlled vocabulary is a carefully selected list of words and phrases that are used to tag units of information so that they may be more easily retrieved by a search. The terms are chosen and organized by trained professionals (including librarians and information scientists) who possess expertise in the subject area. Controlled vocabulary terms can accurately describe what a given document is actually about, even if the terms themselves do not occur within the document's text. Fully developed controlled vocabulary systems, such as the Library of Congress Subject Headings, are often published in a reference work called a thesaurus. Controlled vocabularies form part of a larger universe of nomenclatural approaches to data classification called metadata. (Wikipedia)
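As a rough illustration of the definition above, the following Python sketch tags documents with preferred terms and lets a searcher reach them through variant wording. The entry vocabulary, the document ids, and the function names (`to_preferred`, `build_index`, `search`) are all hypothetical and chosen for illustration; they are not taken from any real thesaurus file or retrieval system.

```python
from collections import defaultdict

# Hypothetical entry vocabulary: each variant (lowercased) maps to one
# preferred term. The mapping is illustrative, not real thesaurus data.
ENTRY_VOCABULARY = {
    "heart attack": "Myocardial Infarction",
    "myocardial infarction": "Myocardial Infarction",
    "high blood pressure": "Hypertension",
    "hypertension": "Hypertension",
}


def to_preferred(term):
    """Return the preferred term for a variant, or None if it is unknown."""
    return ENTRY_VOCABULARY.get(term.strip().lower())


def build_index(tagged_docs):
    """Build preferred term -> set of document ids from indexer-assigned tags."""
    index = defaultdict(set)
    for doc_id, tags in tagged_docs.items():
        for tag in tags:
            preferred = to_preferred(tag)
            if preferred is None:
                # Only enumerated terms are allowed as tags.
                raise ValueError(f"{tag!r} is not in the controlled vocabulary")
            index[preferred].add(doc_id)
    return index


def search(index, query):
    """Map the user's wording to a preferred term and return matching ids."""
    preferred = to_preferred(query)
    if preferred is None:
        return []
    return sorted(index.get(preferred, set()))


# Hypothetical documents; the tags need not occur in the document text.
docs = {
    "doc1": ["Myocardial Infarction"],
    "doc2": ["Hypertension", "heart attack"],
    "doc3": ["Hypertension"],
}
index = build_index(docs)
print(search(index, "heart attack"))         # ['doc1', 'doc2']
print(search(index, "High Blood Pressure"))  # ['doc2', 'doc3']
```

The design choice here is that both the indexer's tags and the searcher's query pass through the same mapping, which is one simple way to bridge differences in wording between authors and users.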

More Information

  • Bridging the gap between languages used by authors, search systems and users:






A Brief History

  • The 1970s and 1980s: bloody battles and casualties

    • Controlled vocabularies vs. natural language

    • Command languages vs. free-form queries

    • CVs vs. abstracts vs. full text

    • Librarians vs. end users

  • The 1990s and the Web: natural language for the masses

  • The 21st Century: the best of both worlds

Vocabulary Control for Information Retrieval, 1972

  • by F. Wilfrid Lancaster

  • About this title: Contents: Why Vocabulary Control? * Pre-coordinate & Post-coordinate Systems * Vocabulary Structure & Display * Gathering the Raw Material * Standards & Guidelines * Organization of Terms: The Hierarchical Relationship * Organization of Terms: The Associative Relationship * Terms: Form & Compounding * The Entry Vocabulary * Homography & Scope Notes * Thesaurus Display * Vocabulary Growth Updating * The Role of the Computer * Identifiers & Checklists * The Influences of Vocabulary on the Performance of a Retrieval System * Evaluation of Thesauri * Natural-language Searching & the Post-controlled Vocabulary * Hybrid Systems * Compatibility & Convertibility * Multilingual Aspects * Automatic Approaches to Thesaurus Construction * Some Cost-effectiveness Aspects of Vocabulary Control * Bibliography * Index.

  • "The publisher's announcement claims that the original edition is an information science classic that has emerged as the 'bible' of indexing & retrieval vocabularies, & (is the) first definitive monograph devoted exclusively to controlled vocabularies in information retrieval. ...

An Associative Interactive Dictionary for Online Searching, 1978

  • Title: AID, an Associative Interactive Dictionary for Online Searching.

  • Authors: Doszkocs, Tamas E.

  • Descriptors: Dictionaries; Information Retrieval; Online Systems; Search Strategies; Tables (Data); Word Frequency

  • Source: On-Line Review, v2 n2 p163-73, Jun 1978

  • AID meta-searched MEDLINE, TOXLINE and the Hepatitis Databank and displayed result clusters of keywords and MeSH headings
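To give a feel for what displaying keywords and headings associated with a result set can involve, here is a minimal Python sketch that counts how many retrieved records carry each descriptor and ranks the descriptors by that count. It is not AID's actual method, which the slides do not describe; the record layout, the sample descriptors, and the function name `rank_associated_terms` are assumptions made for illustration.

```python
from collections import Counter


def rank_associated_terms(result_records, top_n=5):
    """Rank descriptors by the number of retrieved records that carry them.

    Simple document-frequency ranking, used only as an illustration;
    it is an assumption, not a description of AID.
    """
    counts = Counter()
    for record in result_records:
        # set() so a descriptor repeated within one record counts once.
        counts.update(set(record["descriptors"]))
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical result set for a single query.
results = [
    {"descriptors": ["Hepatitis B", "Liver Diseases"]},
    {"descriptors": ["Hepatitis B", "Vaccination"]},
    {"descriptors": ["Hepatitis B", "Liver Diseases", "Hepatitis B"]},
]
print(rank_associated_terms(results, top_n=2))
# [('Hepatitis B', 3), ('Liver Diseases', 2)]
```

The top-ranked descriptors could then be offered to the searcher as candidate terms for refining the query.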

CITE, 1979

  • Doszkocs T. E., Rapp B. A. Searching Medline in English: A prototype user interface with natural language query, ranked output and relevance feedback. Proc. ASIS Annu. Meet., vol. 16, pp. 131-137, 1979.

  • Automatic suggestion of Medical Subject Headings

  • Used as NLM’s OPAC 1979-1984

WebLine, 1994

  • The first Web interface to an online retrieval system

    • Associative Concept Navigation in MEDLINE and other NLM Databases via a Mosaic - Forms - WWW Interface Combining Natural Language Processing, Expert Systems and (un)Conventional Information Retrieval Techniques; Tamas E. Doszkocs, Seth B. Widoff, Bruno M. Vasta, National Library of Medicine, in Proceedings of the Second World Wide Web Conference, Chicago, 1994


  • see also WebCrawler (Brian Pinkerton)

  • The Open Web and the Hidden Web

Jerry’s Guide to the Web, 1994

  • Jerry Yang and David Filo’s Yahoo! 1995

    • a directory of web sites, organized in a hierarchy of subject descriptors

    • Librarians at Yahoo

      • Surfing is to Yahoo! what the Dewey Decimal System is to libraries. In other words, Surfing is the categorization of websites. It also happens to be how Yahoo! began. Today our Surfing team continues its passion for finding, evaluating, and organizing information on the Internet. They have a voracious appetite for learning about new topics. They are curious individuals who are skilled at intuitively and efficiently analyzing and classifying diverse, unstructured pieces of information across the Yahoo! network. Surfers are critical to the relevance and intuitive nature of information presented on Yahoo!.

Folksonomies and Social Tagging

Clustering with Multiple Criteria

Analyzing Search Results

Visualizing Search Results

Multi-faceted Clustering in an OPAC

AllPlus Web 2.0 Content Mashup

AllPlus Dynamic Cluster Visualization

Controlled Vocabularies in Searching

Tamas Doszkocs, Ph.D., Computer Scientist