
Betsy L. Humphreys Associate Director for Library Operations NLM, NIH, HHS

CENDI Staff Workshop, Knowledge Organization Systems: Current and Future Uses. September 16, 2004, National Library of Medicine. Betsy L. Humphreys, Associate Director for Library Operations, NLM, NIH, HHS, blh@nlm.nih.gov.


Presentation Transcript


  1. CENDI Staff Workshop
     Knowledge Organization Systems: Current and Future Uses
     September 16, 2004, National Library of Medicine
     Betsy L. Humphreys, Associate Director for Library Operations, NLM, NIH, HHS
     blh@nlm.nih.gov

  2. NLM “Knowledge Organization Systems”
     • Name and Series/Journal Authority Files
     • Library Materials Classification
     • Individual Controlled Vocabularies
       • MeSH, MedlinePlus Health Topics, NCBI Taxonomy, RxNorm clinical drug vocabulary
     • Unified Medical Language System (UMLS) Knowledge Sources
       • Metathesaurus: many vocabularies in a common, integrated format
       • Semantic Network
       • Lexicon
       • Associated tools

  3. NLM “Knowledge Organization Systems”
     • Common Characteristics
       • Searchable on the Web, often interlinked with other NLM resources
       • Distributed in one or more electronic formats
       • Used within NLM for:
         • Information retrieval and display
         • Data creation
         • Natural language interpretation
       • Heavily used outside NLM for a wide range of applications
       • Most built and maintained with custom systems

  4. http://wwwcf.nlm.nih.gov/class/

  5. Medical Subject Headings (MeSH)
     • Structure of MeSH upgraded in 2000
       • Descriptor Class: closely related concepts grouped to enhance retrieval
       • Concept: distinct meaning
       • Term: concept name
     http://www.nlm.nih.gov/mesh/meshrels.html
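
The three-level structure on this slide (Descriptor Class, Concept, Term) can be sketched as nested records. This is only an illustrative model, not NLM's schema: the class names, the `entry_terms` helper, and the sample values are assumptions made for the example, and the sample data is not copied from MeSH.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A name for a concept."""
    name: str


@dataclass
class Concept:
    """One distinct meaning, carrying one or more names (Terms)."""
    terms: list[Term] = field(default_factory=list)


@dataclass
class Descriptor:
    """A descriptor class: closely related concepts grouped for retrieval."""
    heading: str
    concepts: list[Concept] = field(default_factory=list)

    def entry_terms(self) -> list[str]:
        # Flatten every term of every grouped concept, in insertion order.
        return [t.name for c in self.concepts for t in c.terms]


# Illustrative data only (hypothetical, not an actual MeSH record).
descriptor = Descriptor(
    heading="Headache",
    concepts=[
        Concept([Term("Headache"), Term("Cephalgia")]),  # one meaning, two names
        Concept([Term("Bilateral headache")]),           # hypothetical related concept
    ],
)

all_names = descriptor.entry_terms()
```

Searching on any name in `all_names` would then retrieve the same descriptor, which is the retrieval benefit the slide attributes to grouping related concepts.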

  6. Known Translations of MeSH
     • In UMLS
       • Dutch, Finnish, French, German, Italian, Japanese, Portuguese, Russian, Spanish, Swedish
     • Other Complete Translations
       • Arabic, Chinese, Czech, Greek, Thai, Turkish
     • In Progress or Planned or Hoped For
       • Korean, Slovenian, Vietnamese, Lithuanian, Polish, Slovakian, Norwegian, Kiswahili

  7. Coordinating Translations: How?
     • Single database with a Web interface
     • Add Language as a Term property
     • Translated Terms added to Concept
     • Non-English Concepts added to Descriptor
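
A minimal sketch of the approach on this slide, assuming a simple in-memory model rather than the actual translation database: language is stored as a property of each term, translated terms are appended to an existing concept, and a concept with no English name is attached directly to the descriptor. The class names, the three-letter language codes, and the sample words (including the placeholder "Beispielbegriff") are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    name: str
    language: str = "eng"  # language is a property of the term (hypothetical codes)


@dataclass
class Concept:
    terms: list = field(default_factory=list)

    def names_in(self, language: str) -> list:
        # Names of this concept in one language.
        return [t.name for t in self.terms if t.language == language]


@dataclass
class Descriptor:
    concepts: list = field(default_factory=list)

    def names_in(self, language: str) -> list:
        # Names across all grouped concepts in one language.
        return [n for c in self.concepts for n in c.names_in(language)]


headache = Concept([Term("Headache")])
descriptor = Descriptor([headache])

# Translated terms are added to the existing concept.
headache.terms.append(Term("Kopfschmerz", "ger"))
headache.terms.append(Term("Céphalée", "fre"))

# A concept that exists only in another language is added to the descriptor.
descriptor.concepts.append(Concept([Term("Beispielbegriff", "ger")]))

english = descriptor.names_in("eng")
german = descriptor.names_in("ger")
```

Because every language lives in the same structure, a single database can serve all translation groups, and each group only filters on its own language value.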

  8. Status of Use
     • Current Active Groups
       • German, French, Italian, Vietnamese
     • Groups Beginning Work with MTMS
       • Dutch, Finnish, Japanese, Polish, Slovakian
     • Groups Starting Soon
       • Czech, Portuguese, Korean, Norwegian, Russian, Spanish

  9. http://www.ncbi.nlm.nih.gov/Taxonomy/

  10. http://umlsinfo.nlm.nih.gov

  11. The UMLS in practice
      • Database
        • Series of relational files
      • Interfaces
        • Web interface: Knowledge Source Server (UMLSKS)
        • Application programming interfaces (Java and XML-based)
      • Applications
        • lvg (lexical programs)
        • MetamorphoSys (installation and customization)
        • SOON: Metathesaurus browser
      The UMLS is not an end-user application

  12. UMLS 3 components
      • Metathesaurus
        • Concepts
        • Inter-concept relationships
      • Semantic Network
        • Semantic types
        • Semantic network relationships
      • Lexical resources
        • SPECIALIST Lexicon
        • Lexical tools

  13. Metathesaurus Source Vocabularies (2004AB)
      • 134 source vocabularies
        • 126 contributing concept names
      • 73 families of vocabularies
        • multiple translations (e.g., MeSH, ICPC, ICD-10)
        • variants (American-English equivalents, Australian extension/adaptation)
        • subsequent editions usually considered distinct families (ICD: 9-10; DSM: IIIR-IV)
      • Broad coverage of biomedicine
      • Common presentation

  14. Metathesaurus Concepts (2004AB)
      • Concept (> 1M), CUI
        • Set of synonymous concept names
      • Term (> 3.8M), LUI
        • Set of normalized names
      • String (> 4.3M), SUI
        • Distinct concept name
      • Atom (> 5.1M), AUI
        • Concept name in a given source
      [Diagram: example concept]
      C0000001
        L0000001
          S0000001
            A0000001 headache (source 1)
            A0000002 headache (source 2)
          S0000002
            A0000003 Headache (source 1)
            A0000004 Headache (source 2)
        L0000002
          S0000003
            A0000005 Cephalgia (source 1)
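
The four identifier levels on this slide can be illustrated by grouping the five example atoms into a concept → term → string → atom tree. This is a sketch under stated assumptions: the identifiers and names are the ones shown on the slide, but the assignment of each atom to its string, term, and concept is inferred from the diagram, the `Atom` tuple and helper names are hypothetical, and `crude_normalize` (plain lowercasing) is only a stand-in for the normalization the UMLS lexical tools actually perform.

```python
from collections import defaultdict
from typing import NamedTuple


class Atom(NamedTuple):
    aui: str     # atom: a concept name in a given source
    sui: str     # string: a distinct concept name
    lui: str     # term: a set of normalized names
    cui: str     # concept: a set of synonymous concept names
    name: str
    source: str


# The five atoms from the slide's example diagram.
ATOMS = [
    Atom("A0000001", "S0000001", "L0000001", "C0000001", "headache", "source 1"),
    Atom("A0000002", "S0000001", "L0000001", "C0000001", "headache", "source 2"),
    Atom("A0000003", "S0000002", "L0000001", "C0000001", "Headache", "source 1"),
    Atom("A0000004", "S0000002", "L0000001", "C0000001", "Headache", "source 2"),
    Atom("A0000005", "S0000003", "L0000002", "C0000001", "Cephalgia", "source 1"),
]


def build_tree(atoms):
    """Nest atom identifiers as tree[cui][lui][sui] -> list of AUIs."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for a in atoms:
        tree[a.cui][a.lui][a.sui].append(a.aui)
    return tree


def crude_normalize(name: str) -> str:
    # Assumption: lowercasing only; the real lexical tools do much more.
    return name.lower()


tree = build_tree(ATOMS)
counts = {
    "concepts": len(tree),
    "terms": sum(len(luis) for luis in tree.values()),
    "strings": sum(len(suis) for luis in tree.values() for suis in luis.values()),
    "atoms": len(ATOMS),
}

# Names that share a LUI should collapse to one normalized form.
normalized_by_lui = defaultdict(set)
for a in ATOMS:
    normalized_by_lui[a.lui].add(crude_normalize(a.name))
```

In this example the counts narrow from five atoms to three strings, two terms, and one concept, mirroring the slide's totals, where atoms outnumber strings, strings outnumber terms, and terms outnumber concepts.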

  15. Metathesaurus Relationships
      • Symbolic relations: ~9M pairs of concepts
      • Statistical relations: ~7M pairs of concepts (co-occurring concepts)
      • Mapping relations: 100,000 pairs of concepts
      • Categorization: relationships between concepts and semantic types from the Semantic Network

  16. Why you might care about the UMLS
      • Content with applicability outside of biomedicine
      • Tools generally useful in NLP, data mining
      • New Metathesaurus Rich Release Format
        • Potentially useful as a format for distribution of any set of vocabularies/ontologies and for robust purpose-specific mappings between such systems
        • May well lead to development of a variety of tools that can output or ingest the format
