
Edward A. Fox fox@vt.edu CS DLRL Virginia Tech, Blacksburg, VA, USA

The OAI PMH (Open Archives Initiative Protocol for Metadata Harvesting) MetaScholar Initiative All-Project Meeting Atlanta, GA 6/18/2002. Edward A. Fox fox@vt.edu CS DLRL Virginia Tech, Blacksburg, VA, USA. Acknowledgements.


Presentation Transcript


  1. The OAI PMH (Open Archives Initiative Protocol for Metadata Harvesting) MetaScholar Initiative All-Project Meeting Atlanta, GA 6/18/2002 Edward A. Fox fox@vt.edu CS DLRL Virginia Tech, Blacksburg, VA, USA

  2. Acknowledgements • Sponsors: Mellon Foundation, SOLINET, NSF, DLF, CNI, UK’s JISC, Virginia’s CIT, … • OAI Team: Steering Committee, Technical Committee, Developers, Data Providers, Service Providers • Emory Team, Partners around Southeast • VT Colleagues: Hussein Suleman, Rohit Kelapure, Ming Luo, Ryan Richardson, Marcos Goncalves, Priya Shivakumar, Baoping Zhang, students working on term projects, …

  3. Contents • Early history • Key concepts • Examples • ODL, XOAI • OAI Tools • Technical Plan • Conclusion

  4. Open Archives Initiative OAI www.openarchives.org openarchives@openarchives.org

  5. Open Archives Initiative (OAI) • xxx@LANL, high-energy physics (Ginsparg, 1991) • CSTR + WATERS = NCSTRL (Lagoze, 1994) • xxx + NCSTRL = CoRR collaboration (1998) • Universal Preprint Service protoproto, Oct. 21-22, 1999, Santa Fe – led by LANL, CNI, DLF, Mellon --> OAI • Santa Fe Convention (see Feb 2000 D-Lib Magazine article) • Archives -> Open Archives • Support unique archive identifiers • Implement metadata set(s) (DC, using XML) • Implement OA harvesting protocol • Register the archive • Build tools, layer other services: linking, searching, …

  6. OAI Philosophy • Self-archiving = submission mechanism • Long-term storage system = archive • Open interface = harvesting mechanism • Data provider + service provider • Start with “gray literature” • e-prints/pre-prints, reports, dissertations, …

  7. Began as “archives of the world unite!” OAI

  8. Open Archives (protoproto) • ArXiv & Los Alamos National Lab • CogPrints & U. Southampton • NACA & NASA (reports) • NCSTRL & Cornell U. • NDLTD & Virginia Tech • RePEc & U. Surrey • Total of around 200K records

  9. Original Open Archives Members • American Physical Society • California Digital Library • Caltech • Coalition for Networked Info. • Cornell University • Harvard University • Library of Congress • Los Alamos Nat’l Lab • Mellon Foundation • NASA Langley Research Cntr • Old Dominion University • Stanford University • U. of Ghent • U. of Surrey • U. of Southampton • Vanderbilt University • Virginia Tech • Washington University

  10. Contents • Early history • Key concepts • Examples • ODL, XOAI • OAI Tools • Technical Plan • Conclusion

  11. Now is a Technical Umbrella for Practical Interoperability… Metadata Harvesting Reference Libraries Museums Publishers E-Print Archives …that can be exploited by different communities

  12. Metadata harvesting The World According to OAI Service Providers Discovery Current Awareness Preservation Data Providers

  13. OA 1 OA 2 OA 4 OA 3 OA 5 OA 6 OA 7 Aggregation through OAI Harvesting – Black Box Perspective

  14. Theology Emory GA UGA U FL UTK AmSo Library Aggregation through OAI Harvesting – By Organization

  15. Confederate Constitution Civil War History Oral Sports Culture AmSo Diaries Aggregation through OAI Harvesting – By Topic

  16. Approaches to Aggregation Build By Institution Build By Discipline

  17. Types of Access Possible Build By Institution Build By Discipline Access by Year Category Personage Author Genre Query …

  18. OAI Repository Required: Protocol Set Structure URI Scheme MDO MDO MDO MDO Required: DC MDO MDO MDO MDO DO DO DO DO

  19. Metadata vs. Data • Data refers to digital objects or digital representations of objects • Metadata is information about the objects (e.g. title, author, etc.) • OAI focuses on metadata, with the implicit understanding that metadata usually contains useful links to the source digital objects

  20. Metadata: Complex to Simple MARC (>$50) Dublin Core (DC)

  21. harvester repository support data repository OAI protocol items harvesting data

  22. Registered URI Scheme • identifiers: locally unique key for extracting a record from a repository • oai-identifier = oai:archive-identifier:record-identifier • Archive Identifier: Registered within OAI • Unique ID within archive: (syntax is archive-specific) • example = oai:ncstrl:ncstrl.cornellcs/TR94-1418
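
As a minimal sketch of the identifier syntax on this slide, the snippet below splits an oai-identifier into its archive identifier and record identifier. The function name `parse_oai_identifier` is hypothetical and not part of any OAI tool; the only rule assumed is the `oai:archive-identifier:record-identifier` layout shown above, with everything after the second colon treated as the archive-specific record key.

```python
from typing import Tuple


def parse_oai_identifier(identifier: str) -> Tuple[str, str]:
    """Split 'oai:archive-identifier:record-identifier' into its two parts.

    Hypothetical helper for illustration only.
    """
    scheme, sep, rest = identifier.partition(":")
    if scheme != "oai" or not sep:
        raise ValueError("identifier must start with 'oai:'")
    # Only the first colon after the archive identifier is a delimiter;
    # the record identifier syntax is archive-specific and kept verbatim.
    archive_id, sep, record_id = rest.partition(":")
    if not sep or not archive_id or not record_id:
        raise ValueError("expected oai:archive-identifier:record-identifier")
    return archive_id, record_id


archive_id, record_id = parse_oai_identifier("oai:ncstrl:ncstrl.cornellcs/TR94-1418")
print(archive_id)  # ncstrl
print(record_id)   # ncstrl.cornellcs/TR94-1418
```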

  23. harvest within date range repository record record selective harvesting - datestamps

  24. S1 harvest within set repository record record record selective harvesting - sets S2
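
The two selective-harvesting slides can be sketched as a request URL that restricts a harvest by datestamp range and by set. This is an illustrative sketch, not a reference client: the base URL `http://archive.example.org/oai`, the set name `physics`, and the helper name `list_records_url` are assumptions; `verb`, `metadataPrefix`, `from`, `until`, and `set` are the protocol's request argument names.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    from_date: Optional[str] = None,
    until_date: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build a ListRecords request limited by datestamps and/or a set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date      # lower bound of the datestamp range
    if until_date:
        params["until"] = until_date    # upper bound of the datestamp range
    if set_spec:
        params["set"] = set_spec        # harvest only records in this set
    return base_url + "?" + urlencode(params)


# Hypothetical repository and set, for illustration only.
url = list_records_url(
    "http://archive.example.org/oai",
    from_date="2002-01-01",
    until_date="2002-06-18",
    set_spec="physics",
)
print(url)
```

Omitting all three optional arguments yields an unrestricted harvest of the whole repository in the requested metadata format.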

  25. Summary: Protocol for Metadata Harvesting • Service Requests • Identify • ListMetadataFormats • ListSets • GetRecord • ListIdentifiers • ListRecords • Metadata Multiplicity • Date (and Time) Ranges • Resumption Tokens
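
Resumption tokens let a repository return a long list in several partial responses. The sketch below shows one way a harvester might follow them for ListIdentifiers. It assumes the OAI-PMH 2.0 XML namespace and response layout, and it replaces the network with a hypothetical `fake_fetch` function serving two canned pages, so the identifiers and token values are invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Assumption: responses use the OAI-PMH 2.0 namespace.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch: Callable[[Dict[str, str]], str]) -> List[str]:
    """Collect all identifiers, following resumption tokens until exhausted."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": "oai_dc"}
    identifiers: List[str] = []
    while True:
        root = ET.fromstring(fetch(params))
        for elem in root.iter(OAI_NS + "identifier"):
            identifiers.append((elem.text or "").strip())
        token = root.find(".//" + OAI_NS + "resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # no token, or an empty one: the list is complete
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token.text.strip()}
    return identifiers


# Canned two-page response standing in for a real repository (invented data).
_WRAP = "<OAI-PMH xmlns='http://www.openarchives.org/OAI/2.0/'><ListIdentifiers>{}</ListIdentifiers></OAI-PMH>"
_PAGES = {
    None: _WRAP.format(
        "<header><identifier>oai:demo:1</identifier></header>"
        "<header><identifier>oai:demo:2</identifier></header>"
        "<resumptionToken>page2</resumptionToken>"
    ),
    "page2": _WRAP.format(
        "<header><identifier>oai:demo:3</identifier></header>"
        "<resumptionToken/>"
    ),
}


def fake_fetch(params: Dict[str, str]) -> str:
    """Return the canned page matching the request's resumption token."""
    return _PAGES[params.get("resumptionToken")]


print(harvest_identifiers(fake_fetch))
```

Passing the fetch function in as a parameter keeps the paging logic separate from HTTP, so the same loop could be pointed at a real repository by supplying a function that issues the request.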

  26. Harvesting vs. Federation • Competing approaches to interoperability • Federation is when services are run remotely on remote data (e.g., federated searching) • Harvesting is when data/metadata is transferred from the remote source to the destination where the services are located (e.g., union catalogues) • Federation requires more effort at each remote source but is easier for the local system and vice versa for harvesting • OAI (currently) focuses on harvesting
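
The contrast between the two approaches can be made concrete with a toy sketch. Everything here is invented for illustration: the two in-memory "archives", their titles, and the function names. Federation sends the query to every remote source at search time; harvesting copies the metadata once into a local union catalogue and searches that copy.

```python
from typing import Dict, List

# Invented stand-ins for two remote archives and their metadata (titles only).
REMOTE_ARCHIVES: Dict[str, List[str]] = {
    "archive_a": ["Metadata harvesting basics", "Digital library components"],
    "archive_b": ["Federated search survey", "Harvesting electronic theses"],
}


def remote_search(archive: str, query: str) -> List[str]:
    """Search service that each remote source must run under federation."""
    return [t for t in REMOTE_ARCHIVES[archive] if query.lower() in t.lower()]


def federated_search(query: str) -> List[str]:
    """Federation: forward the query to every remote source at search time."""
    results: List[str] = []
    for archive in REMOTE_ARCHIVES:
        results.extend(remote_search(archive, query))
    return results


def harvest() -> List[str]:
    """Harvesting: transfer all metadata to the local system ahead of time."""
    return [title for titles in REMOTE_ARCHIVES.values() for title in titles]


def local_search(union_catalogue: List[str], query: str) -> List[str]:
    """Search the harvested union catalogue without contacting remote sources."""
    return [t for t in union_catalogue if query.lower() in t.lower()]


union_catalogue = harvest()
print(federated_search("harvesting"))
print(local_search(union_catalogue, "harvesting"))
```

In the federated path each archive needs its own search service, while in the harvested path the remote side only exposes its metadata and the local system does the indexing and searching.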

  27. Contents • Early history • Key concepts • Examples • ODL, XOAI • OAI Tools • Technical Plan • Conclusion

  28. Example 1: Union Collection of ETDs (Electronic Theses and Dissertations, for Networked Digital Library of Theses and Dissertations, NDLTD)

  29. Example 1: Details

  30. referenced items & collections referenced items & collections Special Databases Portals & Clients Portals & Clients Portals & Clients NSDL Services NSDL Services Other NSDL Services NSDL Collections NSDL Collections NSDL Collections Core Services: information retrieval CI Services browsing CI Services authentication Core Services: metadata gathering CI Services personalization Core Collection-Building Services protocols CI Services discussion Core Collection-Building Services harvesting CI Services annotation Example 2: NSDL Information Architecture Essentially as developed by the Technical Infrastructure Workgroup User Interfaces Core NSDL “Bus” Usage Enhancement Collection Building

  31. Example 2: CITIDEL -> NSDL • Computing and Information Technology Interactive Digital Education Library • A collection project in the National STEM (science, technology, engineering, and mathematics) education Digital Library – NSDL • www.nsdl.nsf.gov • www.nsdl.org

  32. Example 2: CITIDEL Distributed repository structure

  33. Example 2: NSDL Collections (themes relevant to our projects) • Discovery of content • Classification and cataloguing • Acquisition and/or linking; referencing • Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged • Software tool suites for analysis, modeling, simulation, or visualization • Reviewed commentary on pedagogy

  34. Contents • Early history • Key concepts • Examples • ODL, XOAI • OAI Tools • Technical Plan • Conclusion

  35. Open Digital Libraries XOAI-PMH • Dissertation work of Hussein Suleman (member of OAI technical committee) • Extending the OAI protocol • Supporting rapid development of DLs using networks of components • Demonstrated with NDLTD, CSTC • Described in Dec. 2001 D-Lib Magazine article, and article scheduled for publication

  36. Open Digital Libraries Components • Running now • XML-File (data provider from file system) • Union, search, browse, recent, filter • E-journal support system • Class projects • High performance multilingual search • Recommender • User rating • Others discussed • Classification/categorization and browsing

  37. Component System Approach • (Open) DL = Network of Extended OAs Data Input Resource Discovery Search Browse Recommend Local Archive Metadata Repository Remote Archive User Interface OAI/ODL archive OAI/ODL protocol legend

  38. Example Architecture (NDLTD) Virginia Tech User Interface PhysNet Humboldt Search Browse Recent Duisburg CalTech Union Catalog Dresden MIT Filter User Interface OAI/ODL archive OAI/ODL protocol legend MIT

  39. Contents • Early history • Key concepts • Examples • ODL, XOAI • OAI Tools • Technical Plan • Conclusion

  40. OAI Tools • Related resources, e.g., XML, Unicode • Submission / author support • XML Schema Validator • Servers and utilities, e.g., ARC, Kepler, EPrints • Repository Explorer • Interactive Browsing • Testing of parameters • Multiple views of data • Multilingual support • Automatic test suite

  41. Author's tools www.physik.uni-oldenburg.de/EPS/mmm

  42. XSV Schema Validator

  43. ARC (arc.cs.odu.edu)

  44. VT Tool: Repository Explorer • The Repository Explorer is a tool for browsing and testing Open Archives, by Hussein Suleman • You issue commands and see the results • You also can perform a sequence of automatic tests • http://purl.org/net/oai_explorer

  45. VT Tool: RE 1.3

  46. VT Tool: Request, Response

  47. Contents • Early history • Key concepts • Examples • ODL, XOAI • OAI Tools • Technical Plan • Conclusion

  48. What will central service look like? (1 of 2) • Harvesting from local sites • Rich content, drawn from all participating sites • Data management • Logging and reporting • Repository/preservation/mirroring • Adding/updating/deleting • User interface and support for digital librarians and data providers
