Digital Repository Preservation Service

Digital Dissemination Task Force, April 24, 2008. Meg Bellinger, AUL; Roy Lechich, Audrey Novak, ILTS. Yale Cyber Infrastructure Architecture: Infrastructure Framework, Protocols, Standards.


Presentation Transcript


  1. Digital Repository Preservation Service. Digital Dissemination Task Force, April 24, 2008. Meg Bellinger, AUL; Roy Lechich, Audrey Novak, ILTS

  2. Yale Cyber Infrastructure Architecture (based on a graphic created by Lorcan Dempsey)
     • Infrastructure Framework, Protocols, Standards: Web services, Z39.50, OAI-PMH, RSS, SRU/SRW, OAIS, Fedora
     • Common Services: Persistent identification, Authentication & Authorization, Registries, Rights Management
     • Content Provision: Services & Storage. For digital collections, preservation, metadata. From library, museums, research, academic and administrative departments
     • Users: Yale and global
     • Fusion: Services, Tools, Applications. Brokers, aggregators, indexes, catalogs, MetaLib, XSearch
     • Presentation: Interfaces. Yale uPortal, Classesv2, Google, Personal Information Environment, discipline specific, gallery, museum and library sites

  3. Content Sources Dissemination Yale University Library Digital Repository Service Google, MSN, Yahoo … Full Text Books Image Commons Collections Environment Finding Aids Classes*v2 (Sakai) Audio & Video Interface out University Portal Integration Services Images and Metadata Library, MetaLib E-Publishing (Institutional Repository) Preservation Archive Complex Objects Collections XSearch VITAL Research Data Personal Collections Content Metadata

  4. Outline
     • Introduction
     • Background
     • Digital Preservation Repository
     • Phase I
     • Additional Phases
     • Within the Larger Landscape

  5. Intro: What is Digital Preservation? “Digital preservation is the whole of the activities and processes involved in the physical and intellectual protection and technical stabilization of digital resources through time in order to reproduce authentic copies of these resources.” (YUL Digital Preservation Policy)

  6. Introduction: The Need. At an ever-accelerating pace, faculty, students, and staff (e.g., the Library) are creating, sharing, and storing digital information for teaching, learning, research, administrative, and creative purposes. Information in digital form is now integral to Yale's core mission.
     • Statistical Datasets
     • Mass Digitization
     • Images
     • Scientific & Biomedical Data
     • Audio, Video, Podcasts
     • Web Sites

  7. Introduction: The Need
     • Digital resources are fragile, and the preservation of these resources is complex.
     • Digital preservation is dynamic: responses to technological obsolescence or media decay must be taken quickly.
     • Digital preservation is proactive rather than reactive: the prospects for successfully preserving digital resources rest heavily upon decisions taken at each stage of their life cycle, starting with creation.

  8. Introduction: The Need. Digital Landscapes Committee, Cyberinfrastructure Survey (Oct 2006). Ranking from 19 survey questions posed to faculty:
     • #1 Easier electronic access to scholarly materials
     • #2 Providing students with digital access to research and instructional materials
     • #11 Ensuring the preservation of my scholarly digital output (e.g., datasets, research notes, e-prints)

  9. Introduction: The Need. “The coolest thing that will be done with your data someone else will do.” (Open Repositories 08)

  10. Background – YUL Related Initiatives
     • IAC Rescue Repository: 2004 - present
     • IAC Digital Preservation Committee: Nov 2004 - Jan 2007
     • IAC Metadata Committee: Nov 2004 - Feb 2007
     • PREMIS - Preservation Metadata Task Force: April - Oct 2006

  11. Rescue Repository (May 2004 Requirements Report)
     • “An increasing number of projects in the YUL are generating or acquiring digital content …”
     • “The digital masters for much of this material are in immediate danger of permanent loss through media decay, physical damage, technological obsolescence, or difficulties in archival management...”
     • “...in the interim, we propose a flexible and agile/quick short-term solution…”

  12. Rescue Repository Description
     • Managed, secure storage (disk-to-disk-to-tape).
     • Resources are organized according to owning library, collection, subcollection(s), file name.
     • Activity is managed by simple ingest and retrieval applications with basic file verification and validation.
     • A ~3 year temporary solution (May 2005 + 3 yrs).
     • Heavily used …
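The organization described in slide 12 (owning library, collection, subcollection(s), file name) could be modeled as a simple path scheme. The following is a minimal sketch, not the Rescue Repository's actual implementation: the function name `storage_path`, the validation rules, and all example values other than "BRBL" are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def storage_path(library, collection, subcollections, file_name):
    """Build a relative path: library / collection / subcollection(s) / file name.

    Hypothetical helper; the slide names the hierarchy but not how it is encoded.
    """
    parts = [library, collection, *subcollections, file_name]
    for part in parts:
        # Reject empty components and separators so a name cannot escape its level.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(*parts)


# Example values are invented, apart from the library code "BRBL".
print(storage_path("BRBL", "example-collection", ["box-01", "folder-03"], "page-0001.tif"))
```

Validating each component separately keeps the hierarchy levels unambiguous, which matters if the path is later used to recover the owning library or collection.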

  13. Users: BRBL, Div, E-Collections, Geo, LWL, MSS/A, Peabody, Preservation, SSL, VRC, YUAG

  14. Digital Preservation Committee
     • Preservation Policy: Defines digital preservation; establishes general principles about what is preserved; promulgates our commitment to standards.
     • Best Practices: A dynamic suite of documents that address current best practices for preservation-related issues such as format validation, registries, etc.

  15. Metadata Committee. Preservation Metadata Taskforce (PREMIS) Report
     • PREMIS (PREservation Metadata Implementation Strategies) defines the metadata needed to preserve digital information assets for the long term.

  16. Digital Preservation Need and Related Initiatives Summary
     • Demand for a Digital Preservation Repository from faculty, Rescue Repository users, and digitization operations and projects is heavy.
     • The Rescue Repository and work by the IAC Digital Preservation and Metadata/PREMIS Committees laid the foundation.
     • The Rescue Repository is reaching its planned end of life.

  17. Digital Preservation Repository: Phase I
     • $500,000 funding from the Provost to establish a Digital Preservation Repository prototype.
     • Provide mechanisms and services for preservation and access to the data.
     • Create the scalable hardware infrastructure.
     • Demonstrate an extensible repository service model.
     • Develop the resource (staff and economic) models.
     • Establish the collaborative campus partnerships.
     • Further the research and scholarship into digital preservation issues.

  18. Digital Preservation Repository: Phase I. Working from two Use Cases:
     • YPED (Yale Protein Expression Database)*: Protein profiling mass spectrometry data sets generated by the Keck Lab
     • Images from the Rescue Repository: Approximately 400,000 individual image files from the Art Gallery, Beinecke, Divinity Library, Lewis Walpole Library, Library Visual Resources Collection, and Manuscripts and Archives department.
     * Proteomics is the large-scale study of proteins and is often considered the next step in the study of biological systems, after genomics.

  19. Digital Preservation Repository: Phase I
     • Hardware Architecture
     • Software Design
     • Preservation Metadata
     • Use Case: YPED
     • Use Case: Images

  20. Phase I - Hardware
     • 20 TB: YPED and Images
     • 30 TB: Microsoft mass digitization
     • 10 TB: non-images (Rescue Repository)
     • 40 TB: annual growth with Library digitization projects
     • 250 TB: annual growth with Fortunoff video digitization project
     • 1000 TB (a petabyte) within 5 years
     • Others?
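The storage figures in slide 20 can be combined into a rough projection. The sketch below is only one possible reading of the slide: it assumes the first three figures are one-time initial holdings and that both growth figures recur every year, neither of which the slide states explicitly. The names `INITIAL_TB`, `ANNUAL_GROWTH_TB`, `projected_tb`, and `years_to_reach` are hypothetical.

```python
# Figures from the slide, in terabytes. Treating the first group as one-time
# holdings and the second group as recurring yearly growth is an assumption.
INITIAL_TB = {
    "YPED and Images": 20,
    "Microsoft mass digitization": 30,
    "non-images (Rescue Repository)": 10,
}
ANNUAL_GROWTH_TB = {
    "Library digitization projects": 40,
    "Fortunoff video digitization project": 250,
}


def projected_tb(years):
    """Total storage after a number of whole years of linear growth."""
    return sum(INITIAL_TB.values()) + years * sum(ANNUAL_GROWTH_TB.values())


def years_to_reach(target_tb):
    """Smallest whole number of years at which the projection meets the target."""
    years = 0
    while projected_tb(years) < target_tb:
        years += 1
    return years


for year in range(6):
    print(year, projected_tb(year))
print("years to reach 1000 TB:", years_to_reach(1000))
```

Under these assumptions the projection starts at 60 TB, grows by 290 TB per year, and passes 1000 TB during the fourth year, which is in line with the slide's "within 5 years".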

  21. Phase I - Hardware

  22. Yale Cyber Infrastructure Architecture (based on a graphic created by Lorcan Dempsey)
     • Infrastructure Framework, Protocols, Standards: Web services, OAI-PMH, OAIS, Fedora, METS, MVC, SOA
     • Common Services: Persistent identification, Authentication & Authorization, Registries, Rights Management
     • Content Provision: Services & Storage. For digital collections, preservation, metadata. From library, museums, research, academic and administrative departments
     • Users: Yale and global
     • Fusion: Services, Tools, Applications. Brokers, aggregators, indexes, catalogs, MetaLib, XSearch
     • Presentation: Interfaces. Yale uPortal, Classesv2, Google, Personal Information Environment, discipline specific, gallery, museum and library sites

  23. Software Design
     • Phase I - Core Preservation Functionality
       • Deposit, Normalization, Packaging, Validation, Ingest, Storage (multiple copies, geographic separation), Preservation Policy Management, Authorization, OAI-PMH, SRW/SRU, Retrieval
       • YPED and Image Use Case Requirements
     • Additional Phases - More Services
       • Preservation actions
       • All (or almost all) user-facing services
       • Enhanced access & delivery through applications

  24. Flexible
     • Accept Different Types of Data
     • Collect Data and Metadata Components
     • Normalize for Ingest Processing
     • Verify Integrity
     • Add Identifiers
     • Add Preservation Metadata
     • Continuous Integrity Checks
     • Format Migrations (e.g. TIFF to JPEG 2000)
     • Storage Migrations (to new or different type of physical media)
     • Logging
     • Reporting
     • Authorization
     • Validation
     • OAI-PMH
     • SRW/SRU
     • Indexing
     • Retrieval
     • Logging
     Diagram labels: Deposit / Ingest, Preservation / Storage, Access; SIP, AIP, DIP; Repository
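Several of the functions listed in slide 24 (verify integrity, add identifiers, continuous integrity checks, logging) can be illustrated together. This is a minimal sketch under stated assumptions, not the repository's design: the choice of SHA-256 checksums and UUID identifiers, the in-memory `ArchivedObject` class, and the function names `ingest` and `check_integrity` are all hypothetical.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ArchivedObject:
    """Hypothetical stored object: content plus minimal preservation metadata."""
    identifier: str
    content: bytes
    sha256: str
    events: list = field(default_factory=list)


def _log(obj, event):
    # Logging: append a timestamped event to the object's history.
    obj.events.append((datetime.now(timezone.utc).isoformat(), event))


def ingest(content, claimed_sha256):
    """Verify integrity against the depositor's checksum, then add an identifier."""
    actual = hashlib.sha256(content).hexdigest()
    if actual != claimed_sha256:
        raise ValueError("checksum mismatch at ingest")
    # UUIDs stand in for whatever persistent identifier scheme is really used.
    obj = ArchivedObject(identifier=str(uuid.uuid4()), content=content, sha256=actual)
    _log(obj, "ingest")
    return obj


def check_integrity(obj):
    """Recompute the checksum and compare it with the value recorded at ingest."""
    ok = hashlib.sha256(obj.content).hexdigest() == obj.sha256
    _log(obj, "fixity check passed" if ok else "fixity check FAILED")
    return ok


data = b"example image bytes"
stored = ingest(data, hashlib.sha256(data).hexdigest())
print(stored.identifier, check_integrity(stored))
```

Recording the checksum once at ingest and recomputing it on a schedule is what lets later checks detect silent corruption, for example after a storage migration.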

  25. Digital Preservation Repository – Phase I Summary. Build:
     • Hardware environment
     • Core preservation repository services
     • Project specific service components needed for YPED and to replace Rescue Repository
     • Migration of Rescue Repository image content

  26. Additional Phases. Examples:
     • Full Rescue Repository migration
     • More content (project/use cases)
     • Project specific ingest and access
     • More storage (950 TB)
     • Preservation actions (integrity checks, format migrations, etc.)
     • Reporting
     • Rights Management
     5 years, 6 FTE, ~7 million dollars

  27. Larger Landscape
     Peer Institutions:
     • Stanford, Harvard
     • Rutgers
     • DAITSS (Florida)
     • Michigan
     • Columbia
     Internationally:
     • European National Libraries
     • Australia & New Zealand

  28. Thank you. Q&A
