
Library of Congress Metadata Landscape

Library of Congress Metadata Landscape. Sally H. McCallum, smcc@loc.gov. Content: the Library of Congress perspective on descriptive metadata now, descriptive metadata evolution, and broader metadata concerns. LC metadata needs: the same problem everyone has, starting with many types of resources.


Presentation Transcript


  1. Library of Congress Metadata Landscape Sally H. McCallum smcc@loc.gov

  2. Content • Library of Congress perspective on • Descriptive metadata now • Descriptive metadata evolution • Broader metadata concerns

  3. LC metadata needs • Same problem everyone has: • Many types of resources • Books, journals, maps, audio, moving image, still image, artifacts, electronic • Many possible levels of access • Collection, item, analytic, cut, etc. • Many items • 125+ million non-electronic • Cataloging for electronic resources • 3+ million digital resources • Linking to electronic resources

  4. LC service perspective • Coherence and consistency (as much as possible) • Explainable to the end user

  5. Primary access tools at LC • Online catalog (content rules / tagging) • Full-level cataloging: AACR / MARC 21 • Minimal level: AACR / MARC 21 • Initial bibliographic control: AACR-like / MARC 21 • Some collections are represented in the catalog by collection-level records that connect to finding aid tools

  6. Finding aid tools at LC (content rules / tagging) • Finding aids: local / EAD • Various collections of manuscripts, music, photographs • SONIC catalog: AACR-like / MARC-like • Sound recording collections • PPOC catalog: AACR / MARC 21 • Photograph collections • InQuery: mixed / internal • Digital conversion collections • Indexing and abstracting services • Serials

  7. Current LC cataloging “feeds” • Vendor records (MARC 21) • Copy from OCLC, etc. (MARC 21) • Publisher records (ONIX) • Other for special materials • Metadata in digital objects • Metadata with digital objects (future)

  8. LC Links to electronic resources • URIs or equivalent in catalog records and finding aids • handle server • OpenURL (experimentation)

  9. Questions • Is this coherent and consistent? • Is it scalable to electronic resources? • Do all resources need the same kind of treatment? • How about proliferating metadata schemas? • How do we maintain evolutionary pathway and standardization?

  10. We see content diversity • Content = the data (title, subject term, etc.) • AACR data, EAD data, DC data, ONIX data, … • Different use of content rules: Does MARC main entry = DC creator = ONIX contributor? • More types – administrative, structural, product data, rights management • Global library community convergence on AACR for descriptive metadata?

  11. We see markup diversity • Markup = data tagging • MARC 21 tags, DC tags, ONIX tags, MAB tags, UNIMARC tags • HTML tags • EAD DTD tags • (XML tag sets easy to establish) • Global library community convergence on MARC 21 and EAD?

  12. We see different structures • Structure = record “arrangement” • ISO 2709 • Microsoft Access • DTDs, Schemas • SGML, XML, HTML, ?ML family • Convergence on XML, Schemas

  13. And at LC we have • 13,000,000 MARC 21 bibliographic records • in primary catalog • 5,000,000 MARC 21 name authority records • 300,000 MARC 21 subject records • 350 trained catalogers • Integrated Library System

  14. Descriptive metadata evolution • Need to take advantage of XML • Establish standard MARC 21 in an XML structure • Need simpler (but compatible) alternatives • Development of MODS • Need interoperability with different schemas • Assemble coordinated set of tools • Need continuity with current data • Provide flexible transition options

  15. MARC 21 evolution to XML

  16. MARC 21 (2709) • MARC 21 (2709) records • Highly developed semantic content • Installed base of 1000s of MARC 21 systems • Over 1,000,000,000 MARC 21 records in local and network systems • Accessible to 100s of Z39.50 clients • Thousands of librarians who “speak” MARC 21

  17. MARC 21 (2709) record (machine view) 00967cam 2200277 a 4500 001000800000005001700008008004100025020005300229040001800282050002400312082002100336100003000357245007400387260004400461300003500505440001200540500002000552650004200572651002500614 347139419990429094819.1931129s1994 wauab 001 0 eng a 93047676 a0898863872 (acid-free, recycled paper) :c$14.95 aDLCcDLCcDLC 00aGV1046.G3bG47 199400a796.6/4/09432201 aSlavinski, Nadine,d1968-10aGermany by bike :b20 tours geared for discovery /cNadine Slavinski. aSeattle, Wash. :bMountaineers,cc1994. a238 p. :bill., maps ;c22 cm. 0aBy bike aIncludes index. 0aBicycle touringzGermanyxGuidebooks.
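
The long digit run after the leader in the machine view is the ISO 2709 directory: one 12-character entry per field, made up of a 3-character tag, a 4-digit field length, and a 5-digit starting position. A minimal Python sketch of splitting it follows; the function and type names are illustrative, not part of any LC tool, and only the first three entries of the slide's directory are used.

```python
from typing import List, NamedTuple


class DirectoryEntry(NamedTuple):
    tag: str     # three-character field tag, e.g. "245"
    length: int  # field length, including the field terminator
    start: int   # offset of the field from the base address of data


def parse_directory(directory: str) -> List[DirectoryEntry]:
    """Split an ISO 2709 directory string into its 12-character entries."""
    if len(directory) % 12 != 0:
        raise ValueError("directory length must be a multiple of 12")
    entries = []
    for i in range(0, len(directory), 12):
        chunk = directory[i:i + 12]
        entries.append(
            DirectoryEntry(chunk[0:3], int(chunk[3:7]), int(chunk[7:12]))
        )
    return entries


# First three directory entries copied from the record on the slide.
sample = "001000800000005001700008008004100025"
for entry in parse_directory(sample):
    print(entry)
```

Reading the field data itself also needs the base address from the leader and the original byte-exact record, which the transcript does not preserve.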

  18. MARCXML - MARC 21 in XML • MARCXML record • XML exact equivalent of MARC (2709) record • Lossless/roundtrip conversion to/from MARC 21 record • Simple flexible XML schema, no need to change when MARC 21 changes • Presentations using XML stylesheets • Converters available from LC, open source • LC using with OAI, METS, ZING • Adopted by OAI to replace oai_marc
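
The claim that the schema need not change when MARC 21 changes follows from the design: tags, indicators, and subfield codes are attribute values, not element names. A small Python sketch of that idea, using only the standard library, is shown here; `build_datafield` and `read_datafield` are hypothetical helper names, and this is not the LC converter.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"


def build_datafield(tag, ind1, ind2, subfields):
    """Return a MARCXML <datafield> element for one variable field.

    subfields is a list of (code, value) pairs in record order.
    """
    field = ET.Element(
        f"{{{MARCXML_NS}}}datafield",
        {"tag": tag, "ind1": ind1, "ind2": ind2},
    )
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARCXML_NS}}}subfield", {"code": code})
        sub.text = value
    return field


def read_datafield(field):
    """Inverse of build_datafield: recover (tag, ind1, ind2, subfields)."""
    subfields = [(s.get("code"), s.text or "") for s in field]
    return field.get("tag"), field.get("ind1"), field.get("ind2"), subfields


# Title field taken from the sample record in this presentation.
original = (
    "245", "1", "0",
    [("a", "Germany by bike :"),
     ("b", "20 tours geared for discovery /"),
     ("c", "Nadine Slavinski.")],
)
element = build_datafield(*original)
ET.register_namespace("", MARCXML_NS)  # serialize without a prefix
print(ET.tostring(element, encoding="unicode"))

# Nothing is lost going to XML and back.
assert read_datafield(element) == original
```

A newly defined MARC tag or subfield code would pass through the same two functions unchanged, which is the property the slide describes.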

  19. MARC21 (2709) to MARCXML
      <record xmlns="http://www.loc.gov/MARC21/slim">
        <leader>00967cam 2200277 a 4500</leader>
        <controlfield tag="001">3471394</controlfield>
        <controlfield tag="005">19990429094819.1</controlfield>
        <controlfield tag="008">931129s1994 wauab 001 0 eng </controlfield>
        <datafield tag="020" ind1="" ind2="">
          <subfield code="a">0898863872 (acid-free, recycled paper) :</subfield>
          <subfield code="c">$14.95</subfield>
        </datafield>
        <datafield tag="040" ind1="" ind2="">
          <subfield code="a">DLC</subfield>
          <subfield code="c">DLC</subfield>
          <subfield code="d">DLC</subfield>
        </datafield>
        <datafield tag="050" ind1="0" ind2="0">
          <subfield code="a">GV1046.G3</subfield>
          <subfield code="b">G47 1994</subfield>
        </datafield>
        <datafield tag="082" ind1="0" ind2="0">
          <subfield code="a">796.6/4/0943</subfield>
          <subfield code="2">20</subfield>
        </datafield>
        <datafield tag="100" ind1="1" ind2="">
          <subfield code="a">Slavinski, Nadine,</subfield>
          <subfield code="d">1968-</subfield>
        </datafield>

  20. MARCXML record (continued)
        <datafield tag="245" ind1="1" ind2="0">
          <subfield code="a">Germany by bike :</subfield>
          <subfield code="b">20 tours geared for discovery /</subfield>
          <subfield code="c">Nadine Slavinski.</subfield>
        </datafield>
        <datafield tag="260" ind1="" ind2="">
          <subfield code="a">Seattle, Wash. :</subfield>
          <subfield code="b">Mountaineers,</subfield>
          <subfield code="c">c1994.</subfield>
        </datafield>
        <datafield tag="300" ind1="" ind2="">
          <subfield code="a">238 p. :</subfield>
          <subfield code="b">ill., maps ;</subfield>
          <subfield code="c">22 cm.</subfield>
        </datafield>
        <datafield tag="440" ind1="" ind2="0">
          <subfield code="a">By bike</subfield>
        </datafield>
        <datafield tag="500" ind1="" ind2="">
          <subfield code="a">Includes index.</subfield>
        </datafield>
        <datafield tag="650" ind1="" ind2="0">
          <subfield code="a">Bicycle touring</subfield>
          <subfield code="z">Germany</subfield>
          <subfield code="x">Guidebooks.</subfield>
        </datafield>
      </record>
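
Once a record is in MARCXML, general-purpose XML tooling can read it. The sketch below uses Python's standard `xml.etree.ElementTree` on a cut-down copy of the sample record; the `subfields` helper is a hypothetical name introduced for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Cut-down copy of the sample record from the slides.
RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">3471394</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Germany by bike :</subfield>
    <subfield code="b">20 tours geared for discovery /</subfield>
    <subfield code="c">Nadine Slavinski.</subfield>
  </datafield>
  <datafield tag="650" ind1="" ind2="0">
    <subfield code="a">Bicycle touring</subfield>
    <subfield code="z">Germany</subfield>
    <subfield code="x">Guidebooks.</subfield>
  </datafield>
</record>"""


def subfields(record, tag, codes=None):
    """Return subfield values of every datafield with this tag, in order.

    codes optionally restricts the result to the given subfield codes.
    """
    values = []
    for field in record.findall(f"marc:datafield[@tag='{tag}']", NS):
        for sub in field.findall("marc:subfield", NS):
            if codes is None or sub.get("code") in codes:
                values.append(sub.text or "")
    return values


record = ET.fromstring(RECORD)
control_number = record.findtext("marc:controlfield[@tag='001']", namespaces=NS)
title = " ".join(subfields(record, "245", codes=("a", "b")))
subject = " -- ".join(subfields(record, "650"))
print(control_number, title, subject, sep="\n")
```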

  21. MODS • MODS • Metadata Object Description Schema – a MARC 21 companion • Simpler element set than full MARC, but MARC semantics - simplified coded data • Richer element set than DC • More compatible with MARC than others • “Friendly” schema and tagging, no coded values • Special accommodation of electronic resources

  22. MODS for electronic resources • Development • electronic resources an important target • input from several digital library projects • Xlink attribute throughout • Related item structure supports hierarchy needed for complex digital objects • Digital origin attribute • Several date types specifically for digital projects (e.g., capture) • E-resource identifiers, e.g., DOI

  23. MODS • LC uses of MODS • Describing electronic resources • AV project, web archiving • Technician input • web archiving • Incorporation with XML resources • METS projects • OAI collections • LC offers MODS, MARCXML, DC simple

  24. MARCXML to MODS
      <mods xmlns="http://www.loc.gov/mods/">
        <titleInfo><title>Germany by bike : 20 tours geared for discovery /</title></titleInfo>
        <name type="personal">
          <namePart>Slavinski, Nadine,</namePart>
          <namePart type="date">1968-</namePart>
          <role>creator</role>
        </name>
        <typeOfResource>text</typeOfResource>
        <publicationInfo>
          <placeCode authority="marc">wau</placeCode>
          <place>Seattle, Wash. :</place>
          <publisher>Mountaineers,</publisher>
          <dateIssued>c1994.</dateIssued>
          <dateIssued encoding="marc">1994</dateIssued>
          <issuance>monographic</issuance>
        </publicationInfo>
        <language authority="iso639-2b">eng</language>
        <physicalDescription><extent>238 p. : ill., maps ; 22 cm.</extent></physicalDescription>
        <note type="statement of responsibility">Nadine Slavinski.</note>
        <note>Includes index.</note>

  25. MODS (continued)
        <subject authority="lcsh">
          <topic>Bicycle touring</topic>
          <geographic>Germany</geographic>
          <topic>Guidebooks.</topic>
        </subject>
        <classification authority="lcc">GV1046.G3 G47 1994</classification>
        <classification authority="ddc" edition="20">796.6/4/0943</classification>
        <relatedItem type="series">
          <titleInfo><title>By bike</title></titleInfo>
        </relatedItem>
        <identifier type="isbn">0898863872 (acid-free, recycled paper) :</identifier>
        <identifier type="lccn">93047676</identifier>
        <recordInfo>
          <recordContentSource>DLC</recordContentSource>
          <recordCreationDate encoding="marc">931129</recordCreationDate>
          <recordChangeDate encoding="iso8601">19990429094819.1 </recordChangeDate>
          <recordIdentifier>3471394</recordIdentifier>
        </recordInfo>
      </mods>
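
Because MODS uses word-based element names, pulling out common fields needs no knowledge of MARC tag numbers. A short Python sketch against a cut-down copy of the MODS record above follows; it assumes the namespace URI exactly as it appears on the slide (other MODS versions use a different URI), and `summarize` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace URI as shown on the slide; an assumption for any other MODS data.
NS = {"mods": "http://www.loc.gov/mods/"}

MODS = """<mods xmlns="http://www.loc.gov/mods/">
  <titleInfo><title>Germany by bike : 20 tours geared for discovery /</title></titleInfo>
  <name type="personal">
    <namePart>Slavinski, Nadine,</namePart>
    <namePart type="date">1968-</namePart>
    <role>creator</role>
  </name>
  <subject authority="lcsh">
    <topic>Bicycle touring</topic>
    <geographic>Germany</geographic>
    <topic>Guidebooks.</topic>
  </subject>
  <identifier type="lccn">93047676</identifier>
</mods>"""


def summarize(mods):
    """Collect a few descriptive fields from a MODS element into a dict."""
    name_parts = mods.findall("mods:name/mods:namePart", NS)
    return {
        "title": mods.findtext("mods:titleInfo/mods:title", namespaces=NS),
        "creator": " ".join(part.text or "" for part in name_parts),
        "topics": [t.text for t in mods.findall("mods:subject/mods:topic", NS)],
        "places": [g.text for g in mods.findall("mods:subject/mods:geographic", NS)],
        "lccn": mods.findtext("mods:identifier[@type='lccn']", namespaces=NS),
    }


summary = summarize(ET.fromstring(MODS))
for key, value in summary.items():
    print(f"{key}: {value}")
```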

  26. MARCXML and DC • DC application target – cross domain, metadata in document headers • Transformation software important to help standardize crosswalks - LC already maintains DC/MARC 21 mapping • Transformation available from LC, open source • Offer items for OAI harvesting in DC (MARCXML and MODS)

  27. MARCXML to DC
      <rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Germany by bike : 20 tours geared for discovery </dc:title>
        <dc:creator>Slavinski, Nadine, 1968-</dc:creator>
        <dc:type>text</dc:type>
        <dc:publisher>Seattle, Wash. : Mountaineers,</dc:publisher>
        <dc:date>c1994.</dc:date>
        <dc:language>eng</dc:language>
        <dc:subject>Bicycle touring</dc:subject>
      </rdf:Description>
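
A crosswalk of this kind is essentially a table from MARC tags and subfields to DC elements. The Python sketch below applies a small, hypothetical subset of such a table to MARCXML; it is not LC's published mapping or its open-source transformation, and the `CROSSWALK` entries were chosen only to reproduce a few of the DC values shown above.

```python
import xml.etree.ElementTree as ET

MARC = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical subset of a crosswalk: MARC tag -> (DC element, subfield codes).
CROSSWALK = {
    "100": ("creator", ("a", "d")),
    "245": ("title", ("a", "b")),
    "260": ("publisher", ("a", "b")),
    "650": ("subject", ("a",)),
}

RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2="">
    <subfield code="a">Slavinski, Nadine,</subfield>
    <subfield code="d">1968-</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Germany by bike :</subfield>
    <subfield code="b">20 tours geared for discovery /</subfield>
    <subfield code="c">Nadine Slavinski.</subfield>
  </datafield>
  <datafield tag="260" ind1="" ind2="">
    <subfield code="a">Seattle, Wash. :</subfield>
    <subfield code="b">Mountaineers,</subfield>
    <subfield code="c">c1994.</subfield>
  </datafield>
  <datafield tag="650" ind1="" ind2="0">
    <subfield code="a">Bicycle touring</subfield>
    <subfield code="z">Germany</subfield>
  </datafield>
</record>"""


def marcxml_to_dc(record):
    """Map a MARCXML record to a dict of DC element name -> list of values."""
    dc = {}
    for field in record.findall("marc:datafield", MARC):
        rule = CROSSWALK.get(field.get("tag"))
        if rule is None:
            continue  # tag not covered by this partial crosswalk
        element, codes = rule
        parts = [
            sub.text or ""
            for sub in field.findall("marc:subfield", MARC)
            if sub.get("code") in codes
        ]
        if parts:
            dc.setdefault(element, []).append(" ".join(parts))
    return dc


for element, values in marcxml_to_dc(ET.fromstring(RECORD)).items():
    print(element, values)
```

This sketch keeps the trailing ISBD punctuation ("/") that the slide's DC title drops; a fuller transformation would normalize it.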

  28. MARCXML from ONIX • Publisher/bookseller record to MARC (2709) via MARCXML • Complex XML format with • traditional descriptive data possibilities • potentially useful descriptive data LC does not currently have or supply • publisher/bookseller data not of current interest

  29. Snip from ONIX record
      <imprint><b241>02</b241><b242>Clarion Books</b242><b243>HMCo008</b243></imprint>
      <b081>Houghton Mifflin Company</b081>
      <b209>New York</b209>
      <b083>US</b083>
      <b003>20021021</b003>
      <b087>2002</b087>
      <measure><c093>08</c093><c094>0.0</c094><c095>lb</c095></measure>
      <d101>A little buckaroo is turning two in this birthday book for the very young, the fifth story about the delightful holiday mice. Mischief and near disaster abound when the littlest mouse's sister and brothers throw him a cowboy-themed party. Through simple rhymes and charming illustrations, readers witness the party preparations, the arrival of the guests, the opening of presents, and the blowing out of the candles, as well as the ensuing fulfillment of the little mouse's fondest birthday wish: to be a cowboy.</d101>
      <mediafile>
        <f114>04</f114>
        <f115>05</f115>
        <f116>01</f116>
        <f117>ftp://imagesro:imagesro@ftp.hmco.com/low_res/juvenile_jacket_low_res/fall_2002/0618077723.tif</f117>
        <f122>These images may be used only to promote the Houghton Mifflin publications with which they are associated. They may be used only in their entirety without any alteration, other than to change the size of the images. The images must be accompanied by any proprietary notice included therewith.</f122>

  30. Snip from MARCXML from ONIX
      <datafield ind1="" ind2="" tag="260">
        <subfield code="a">New York</subfield>
        <subfield code="b">Houghton Mifflin Company</subfield>
        <subfield code="c">2002</subfield>
      </datafield>
      <datafield ind1="" ind2="" tag="300">
        <subfield code="a">32 p.</subfield>
      </datafield>
      <datafield ind1="" ind2="" tag="521">
        <subfield code="a">Children/juvenile.</subfield>
      </datafield>
      <datafield ind1="1" ind2="" tag="700">
        <subfield code="a">Cushman, Doug</subfield>
        <subfield code="e">illustrator</subfield>
      </datafield>
      <datafield ind1="4" ind2="2" tag="856">
        <subfield code="3">Front cover image</subfield>
        <subfield code="u">ftp://imagesro:imagesro@ftp.hmco.com/low_res/juvenile_jacket_low_res/fall_2002/0618077723.tif</subfield>
        <subfield code="z">These images may be used only to promote the Houghton Mifflin publications with which they are associated. They may be used only in their entirety without any alteration, other than to change the size of the images. The images must be accompanied by any proprietary notice included therewith.</subfield>
      </datafield>
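
Comparing the two snips, the ONIX elements `b209`, `b081`, and `b087` appear to feed subfields a, b, and c of the MARC 260 field. The Python sketch below reproduces only that one step; the mapping is inferred from these two slides rather than taken from LC's converter, and the `<product>` wrapper is added here just to make the fragment well-formed.

```python
import xml.etree.ElementTree as ET

# Three elements copied from the ONIX snip, wrapped so the fragment parses.
ONIX = """<product>
  <b081>Houghton Mifflin Company</b081>
  <b209>New York</b209>
  <b087>2002</b087>
</product>"""

# ONIX short tag -> MARC 260 subfield code, inferred from the slides.
ONIX_TO_260 = [("b209", "a"), ("b081", "b"), ("b087", "c")]


def onix_to_260(product):
    """Build a MARCXML-style 260 datafield from an ONIX product element."""
    # Indicator values are copied as they appear on the slide.
    field = ET.Element("datafield", {"ind1": "", "ind2": "", "tag": "260"})
    for onix_tag, code in ONIX_TO_260:
        value = product.findtext(onix_tag)
        if value:  # skip ONIX elements that are absent or empty
            ET.SubElement(field, "subfield", {"code": code}).text = value
    return field


field = onix_to_260(ET.fromstring(ONIX))
print(ET.tostring(field, encoding="unicode"))
```

Note that the subfield order follows the MARC field (place, publisher, date), not the order in which the elements occur in the ONIX record.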

  31. MARCXML – other tools • Tagging transformations • Name instead of number tags? • Different language tags for MODS? • MARC 21 XML “full” tagging • oai_marc to MARCXML • Character set transformations • MARCXML to FRBR tool (for experimentation) • MARC record validation tool

  32. Uses of MARCXML and related tools • Standardize MARC 21 across community for XML communication and manipulation • Open MARC 21 to XML programming tools and presentation style sheets • Standardize MARC 21 for OAI harvesting • Standardize transformations to and from other standard formats (DC, ONIX, …) • Basis for evolution while maintaining standardization

  33. Broader metadata needs • LC descriptions of digitized items include technical and rights data not appropriate for MARC • Focusing on METS - Metadata Encoding and Transmission Standard • Descriptive, administrative, and structural in one XML document
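
To make "descriptive, administrative, and structural in one XML document" concrete, the Python sketch below assembles a bare METS skeleton that wraps a MODS description. It is a minimal illustration rather than a complete or schema-validated METS document: the IDs, the empty administrative section, and the `build_mets` helper are all assumptions introduced here.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
NS = {"mets": METS_NS}


def q(name):
    """Qualify a local element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def build_mets(descriptive, object_label):
    """Wrap a descriptive-metadata element in a skeletal METS document."""
    mets = ET.Element(q("mets"))

    # Descriptive metadata: the MODS record travels inside mdWrap/xmlData.
    dmd = ET.SubElement(mets, q("dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, q("mdWrap"), {"MDTYPE": "MODS"})
    ET.SubElement(wrap, q("xmlData")).append(descriptive)

    # Administrative metadata: technical and rights data would go here.
    ET.SubElement(mets, q("amdSec"), {"ID": "AMD1"})

    # Structural metadata: one division that points back at the description.
    struct = ET.SubElement(mets, q("structMap"))
    ET.SubElement(struct, q("div"), {"LABEL": object_label, "DMDID": "DMD1"})
    return mets


mods = ET.fromstring(
    '<mods xmlns="http://www.loc.gov/mods/">'
    "<titleInfo><title>Germany by bike</title></titleInfo>"
    "</mods>"
)
document = build_mets(mods, "Germany by bike")
print(ET.tostring(document, encoding="unicode"))
```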

  34. Characteristics of METS • METS enables resource retrieval, object validation, preservation, rights mgt., ... • Non-proprietary; being developed by library community • (relatively) Simple; extensible; modular

  35. METS Schema

  36. METS use • LC • Moving image project • Selected digital collections • Developing a record creation utility • Others • BnF web archiving and digital preservation • OCLC web archiving • NL Wales digital collections • Harvard audio collection • Michigan State, Berkeley, etc.

  37. In summary • LC focuses on AACR, MARC 21, and EAD for primary access • New development is evolutionary • Employing XML through MARCXML • Focus for electronic documents in MODS, a MARC derivative • For broader metadata, METS and appropriate extension schema

  38. More information • www.loc.gov/marcxml • www.loc.gov/mods • www.loc.gov/marc
