
Central Registry for Digitized Objects: Linking Production and Bibliographic Control

Ralf Stockmann, Göttinger Digitization Center



Presentation Transcript


  1. Central Registry for Digitized Objects: Linking Production and Bibliographic Control • Ralf Stockmann, Göttinger Digitization Center

  2. As things are now • Huge ventures in • Digitization: Google, Microsoft, national programs, local centers • Accessibility: World Digital Library, European Digital Library, national portals, Google Book Search

  3. As things are now • We are just at the dawn of mass digitization • Leaving behind the stage of manual production • Entering industrialization • Scanning robots • Accessible full text (OCR)

  4. Lack of … • Coordination in digitization activities • Who scans what, where, when, in which quality, and how will it be accessible? • How is “quality” defined? • Do we agree on “what”?

  5. Facing the Consequences • [Chart: costs / value plotted against the number of digitized items per volume; labels: technical improvements, costs, waste of resources, additional benefit]

  6. The Solution • Central registry for digitized objects • Focused on the production context (no user frontend) • API driven (Application Programming Interface): query / ingest • Simple integration into existing workflow tools • Batch mode (lists) • Open source / free service • Matching on volume level • Score / probability
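
The query / ingest idea on this slide can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: the slides do not define an API, so the class name `Registry`, the methods `ingest`, `query`, and `query_batch`, the record keys, and the use of title similarity as the score are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


class Registry:
    """Hypothetical in-memory stand-in for the central registry."""

    def __init__(self) -> None:
        self._records: List[Dict[str, str]] = []

    def ingest(self, record: Dict[str, str]) -> None:
        # Store a copy so later changes by the caller do not alter the registry.
        self._records.append(dict(record))

    def query(self, candidate: Dict[str, str],
              min_score: float = 0.0) -> List[Tuple[float, Dict[str, str]]]:
        # Score every stored record against the candidate by title similarity
        # (an assumed scoring rule) and return the hits, best first.
        hits = []
        for rec in self._records:
            score = SequenceMatcher(
                None, candidate["title"].lower(), rec["title"].lower()
            ).ratio()
            if score >= min_score:
                hits.append((score, rec))
        hits.sort(key=lambda hit: hit[0], reverse=True)
        return hits

    def query_batch(self, candidates: List[Dict[str, str]],
                    min_score: float = 0.0):
        # Batch mode: one result list per candidate, in input order.
        return [self.query(c, min_score) for c in candidates]


registry = Registry()
# Made-up example record, not real data.
registry.ingest({"title": "Example Title", "status": "in_progress"})
results = registry.query_batch(
    [{"title": "example title"}, {"title": "zzz"}], min_score=0.8
)
```

A workflow tool could submit its scan list in one batch before production starts and skip, or flag for manual review, every volume whose best score exceeds a chosen threshold.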

  7. Implementation • [Diagram: backend services (EROMM / EDL / OCLC / …) • registry / metadata store • aggregator / normalizer / mapping • API (query, ingest) • collections / projects: notice of intent, running project, present collections]

  8. Metadata Store • Bibliographic: title, author, date, place of publication, number of pages (?), language, print / format, edition • Technical: resolution, color depth, file type / compression • Accessibility: institution, persistent identifier, rights, URL • Status: digitized, in progress, intended (timeline?), requested? • Matching / score: “what” • Additional judging: “who, where, which quality, how accessible” • Decisive factor: “when”
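
One way to picture the metadata store is as a typed record with the four groups from this slide. This is only a sketch under stated assumptions: the slides name the fields but no schema, so the class names, attribute names, types, and the example values below are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    DIGITIZED = "digitized"
    IN_PROGRESS = "in_progress"
    INTENDED = "intended"
    REQUESTED = "requested"


@dataclass
class Bibliographic:
    title: str
    author: str
    date: str
    place_of_publication: str
    language: str
    print_format: str
    edition: str
    number_of_pages: Optional[int] = None  # marked "(?)" on the slide


@dataclass
class Technical:
    resolution_dpi: int
    color_depth_bits: int
    file_type: str  # file type / compression


@dataclass
class Accessibility:
    institution: str
    persistent_identifier: str
    rights: str
    url: str


@dataclass
class RegistryEntry:
    bibliographic: Bibliographic
    status: Status
    technical: Optional[Technical] = None          # unknown before scanning
    accessibility: Optional[Accessibility] = None  # unknown before publishing
    timeline: Optional[str] = None                 # only for intended projects


# Made-up example: a notice of intent carries bibliographic data only.
entry = RegistryEntry(
    bibliographic=Bibliographic(
        title="Example Title", author="Example Author", date="1790",
        place_of_publication="Example Place", language="de",
        print_format="print", edition="1",
    ),
    status=Status.INTENDED,
    timeline="planned",
)
```

Making the technical and accessibility groups optional reflects that an intended or running project can be registered before any scan exists.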

  9. Obstacles • (Open source) tools for automated matching / scoring? • Interface for manual comparison / decision making • Multivolume works: low rate of uniformity (nearly 50% of the physical SUB stock before 1900) • Unicode • Transliteration tables • Randomly bound books • Reliable identifier • ISBN for old books? • Anticipated rate of accuracy: 50–70%
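
The Unicode, transliteration, and scoring obstacles can be made concrete with a small sketch using only the Python standard library. It is not the matching method of the project, which the slides leave open: the three-entry transliteration table, the field weights, and the function names `normalize` and `match_score` are assumptions chosen for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny transliteration table (German umlauts only).
TRANSLIT = str.maketrans({
    "\u00e4": "ae",  # a with diaeresis
    "\u00f6": "oe",  # o with diaeresis
    "\u00fc": "ue",  # u with diaeresis
})


def normalize(text: str) -> str:
    # Compose first so the table sees single code points, then casefold.
    text = unicodedata.normalize("NFC", text).casefold()
    text = text.translate(TRANSLIT)
    # Decompose and drop any remaining combining marks (accents).
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Collapse punctuation and whitespace runs into single spaces.
    return re.sub(r"[\W_]+", " ", text).strip()


def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_score(a: dict, b: dict) -> float:
    # Assumed weights: title counts most, author and date equally.
    date_match = 1.0 if a["date"] == b["date"] else 0.0
    return (0.5 * similarity(a["title"], b["title"])
            + 0.25 * similarity(a["author"], b["author"])
            + 0.25 * date_match)


# Made-up records that differ only in spelling conventions.
rec_a = {"title": "Historia G\u00f6ttingen", "author": "M\u00fcller", "date": "1790"}
rec_b = {"title": "historia goettingen", "author": "Mueller", "date": "1790"}
rec_c = dict(rec_b, date="1791")
score_same = match_score(rec_a, rec_b)
score_other_date = match_score(rec_a, rec_c)
```

With this table, "Göttingen" and "Goettingen" normalize to the same string, so spelling variants alone do not lower the score. Pairs that land in a middle score band would still go to the manual comparison interface the slide asks for.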

  10. Appreciation of Values • The goal is NOT to build a reliable database in terms of library standards, but to prevent further waste of resources. • If we manage to achieve just 50% precision, we save at least 50% of the funding!

  11. Work Packages • Define metadata model • Set up database • Implement mapping tools • Define API calls • Implement API • Build some connectors to popular mass digitization workflow tools (e.g. “Goobi”) • Establish ISBN workflow • Harvest existing sources • Start with a community of actual projects • Get some (!) funding • Estimated schedule: 6 months

  12. Thank You (stockmann@uni-goettingen.de)
