DRS 2 Orientation Harvard University Library September 30, 2010 DRS = Digital Repository Service
Agenda • DRS 2 Concepts (Andrea) • New metadata (Robin) • Overall schedule (Andrea) • BatchBuilder 2 demo (Vitaly) • Testing instructions (Vitaly) • Questions & comments
DRS 1: everything’s a file TIFF image file Text file ZIP file JP2 image file METS XML file TIFF image file METS XML file JPEG image file JP2 image file JP2 image file Text file PDF document file Text file METS XML file TIFF image file JP2 image file JPEG image file
File level is not a meaningful level for curatorial uses… • Which DRS files make up my digital manuscript? • HOLLIS number 009412949 • http://nrs.harvard.edu/urn-3:FHCL.HOUGH:1116980 • http://pds.lib.harvard.edu/pds/view/6522882
TIFF image file Text file ZIP file JP2 image file METS XML file TIFF image file METS XML file DRS file ID = 6522882 JPEG image file JP2 image file JP2 image file Text file PDF document file Text file METS XML file TIFF image file JP2 image file JPEG image file
TIFF image file JP2 image file METS XML file TIFF image file JP2 image file Object
page 1 TIFF image file JP2 image file METS XML file TIFF image file page 2 JP2 image file Object
Objects • Aggregations of files that together represent a coherent unit of content • All the files that make up a single digital book • All the master and use copies representing a single photograph • Useful for management, reporting and searching • “How many PDS document objects do I have in the DRS?”
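The object idea above can be sketched in a few lines of Python. This is only an illustration, not DRS 2 code: the class names `DrsFile` and `DrsObject`, their fields, and the sample values are all hypothetical. It shows how grouping files under objects makes a question like "how many PDS document objects do I have?" a simple count.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class DrsFile:
    """One stored file (hypothetical shape, not the DRS schema)."""
    file_id: int
    file_format: str  # e.g. "TIFF", "JP2", "METS XML"


@dataclass
class DrsObject:
    """An aggregation of files representing one coherent unit of content."""
    name: str
    content_model: str  # e.g. "PDS document", "Still image"
    files: List[DrsFile] = field(default_factory=list)


def count_by_content_model(objects):
    """Answer 'how many objects of each type do I have?'."""
    return Counter(obj.content_model for obj in objects)


# Sample data, invented for illustration only.
manuscript = DrsObject(
    "sample-manuscript",
    "PDS document",
    [DrsFile(1, "TIFF"), DrsFile(2, "JP2"), DrsFile(3, "METS XML")],
)
photograph = DrsObject(
    "sample-photograph",
    "Still image",
    [DrsFile(4, "TIFF"), DrsFile(5, "JPEG")],
)

counts = count_by_content_model([manuscript, photograph])
```

Here `counts["PDS document"]` answers the curatorial question directly, without inspecting individual files.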
Objects • New hook for metadata • Administrative categories (projects, exhibits, collections, etc.) • Descriptive metadata, catalog records Hollis # 009412949 Object Digital Medieval Manuscripts at Houghton Library Moralia in Job: manuscript
Content models • Object types • Define • valid file formats and relationships • known delivery and rendering applications • associated assessments and preservation plans • Enforce conformity - we know what we have in the DRS and can monitor & preserve it
DRS 2.1 content models – deposit & delivery • Still image • Image objects, delivered by IDS • PDS document • Page-turned documents, delivered by PDS • Document • Initially just PDF files, delivered by FDS • Opaque • Files in any format • Text • Text, XML, etc. delivered by FDS
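A content model that enforces conformity could be sketched as a lookup from object type to allowed file formats. The format sets below are illustrative assumptions loosely based on the examples in these slides; the actual DRS 2 rules are not given here, and the function name `invalid_formats` is hypothetical.

```python
# None marks a content model that accepts files in any format (Opaque).
ANY_FORMAT = None

# Assumed format sets, for illustration only.
CONTENT_MODELS = {
    "Still image": {"TIFF", "JPEG", "JP2"},
    "PDS document": {"TIFF", "JP2", "Text"},
    "Document": {"PDF"},
    "Opaque": ANY_FORMAT,
    "Text": {"Text", "XML"},
}


def invalid_formats(content_model, file_formats):
    """Return the sorted formats that the content model does not allow."""
    if content_model not in CONTENT_MODELS:
        raise ValueError(f"unknown content model: {content_model}")
    allowed = CONTENT_MODELS[content_model]
    if allowed is ANY_FORMAT:
        return []
    return sorted(set(file_formats) - allowed)


# A Document object may hold a PDF, but a TIFF would be rejected.
document_problems = invalid_formats("Document", ["PDF", "TIFF"])
# An Opaque object accepts anything.
opaque_problems = invalid_formats("Opaque", ["WordPerfect", "PDF"])
```

An empty list means the object conforms; anything else names the formats a deposit tool would need to reject or reassign.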
Still image CM – print Several derivative JPEG deliverables TIFF archival master Derivative JPEG thumbnail Pope Joan Series: Illustration from Philippus Bergomensis, De Claribus Mulieribus. Ferrara, Rossi Harvard Art Museum/Fogg Museum, Gift of Philip Hofer
PDS document CM - book JP2 archival master / deliverable images per page … Plain text files per page Zoeller, Karl William. Merchandising the plumbing business. Chicago : Domestic Engineering Co., c1921. Baker Library.
Document CM - report PDF deliverable Intergovernmental Panel on Climate Change (IPCC) WG1 Fourth Assessment Report, Environmental Science and Public Policy Archives Harvard College Library
Opaque content model • The contents of Judge Trager’s hard drive, Harvard Law School Library • WordPerfect files, text files, PDF documents, etc. • Plus documentation about the collection
Text CM – methodology Plain text file Processing methodology for Intergovernmental Panel on Climate Change (IPCC) documents, HCL Imaging Services.
Object descriptors • A METS metadata file per object on the file system alongside content files • Descriptive, administrative, preservation, technical and structural metadata • Describes the object, all its files and bitstreams and related significant events • Gives the metadata the same secure storage as the content files • Self-contained, portable objects
The move to standards • PREMIS -- for key preservation metadata, including • Events that affect content • Relationships that are not implicit • MODS -- for descriptive metadata • Form-specific schemas for technical metadata, including • MIX for images • textMD for text • DocumentMD for PDF and other document formats • More to come… • Supplemented by local administrative schemas
New local metadata • adminCategory • adminFlag • captions, phase 2 • Behavior, default, unit name, description for objects • content model identification • DRS URI • isFirstGenerationInDrs • Closest to original capture • isPreferredDeliverableSource
Changes to local metadata • OwnerSuppliedName • Required for objects, optional for files • Role • Repeatable for both objects and files • Processing • Instead of “purpose”; repeatable • Quality • Optional • Methodology • Now for objects and files of all types
Tracking changes • DRS 2 will keep track of • Changes that affect content • Troubleshooting content errors • Key administrative metadata • Three types: • Events • Administrative flags • “Versioned” metadata elements • Not tracking every metadata change
Events • Object • creation • deletion / recovery from deletion • ingest • merge • File • addition • deletion / recovery from deletion • integrity check confirmation • replacement • virus check confirmation
Other tracking Metadata where changes will be tracked: • Access Flag • Administrative Flag • Billing Code • Owner Code
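Selective tracking of this kind could look like the sketch below: every change is applied, but only changes to the four tracked elements are recorded in a history. The names `Change`, `update_metadata`, and `TRACKED_FIELDS` and the sample values are hypothetical; how DRS 2 actually stores versioned metadata is not described in these slides.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# The four metadata elements listed above as tracked.
TRACKED_FIELDS = {"Access Flag", "Administrative Flag", "Billing Code", "Owner Code"}


@dataclass
class Change:
    """One recorded change to a tracked metadata element."""
    field_name: str
    old_value: object
    new_value: object
    changed_at: datetime


def update_metadata(metadata, history, field_name, new_value):
    """Apply a change; record it only if the field is tracked."""
    old_value = metadata.get(field_name)
    if old_value == new_value:
        return  # nothing changed, nothing to record
    if field_name in TRACKED_FIELDS:
        history.append(
            Change(field_name, old_value, new_value, datetime.now(timezone.utc))
        )
    metadata[field_name] = new_value


# Sample values, invented for illustration only.
metadata = {"Owner Code": "OWNER.A", "Quality": "5"}
history = []
update_metadata(metadata, history, "Owner Code", "OWNER.B")  # tracked
update_metadata(metadata, history, "Quality", "6")           # applied, not tracked
```

After these two calls both values are updated, but `history` holds a single entry, for the owner code.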
What’s inside a descriptor? Descriptive Metadata MODS Administrative Metadata For the object: PREMIS (including relationships) DRS administrative metadata For each file: PREMIS (including relationships) Format-specific metadata DRS administrative metadata PREMIS Events Inventory of Files Structure Map
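To make the descriptor layout more concrete, here is a rough skeleton built with Python's standard `xml.etree.ElementTree`. It uses real METS element names (`dmdSec`, `amdSec`, `fileSec`, `fileGrp`, `file`, `structMap`, `div`, `fptr`) and the METS namespace, but the ID values and the function `build_descriptor_skeleton` are assumptions, the sections are left empty, and the result is not claimed to be schema-valid or to match the actual DRS 2 descriptor.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def mets(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_descriptor_skeleton(file_ids):
    """Build an empty descriptor outline for one object and its files."""
    root = ET.Element(mets("mets"))
    # Descriptive metadata (MODS would be wrapped in here).
    ET.SubElement(root, mets("dmdSec"), ID="dmd-object")
    # Administrative metadata for the object (PREMIS, DRS admin metadata).
    ET.SubElement(root, mets("amdSec"), ID="amd-object")
    # Administrative metadata for each file (PREMIS, format-specific, DRS admin).
    for file_id in file_ids:
        ET.SubElement(root, mets("amdSec"), ID=f"amd-{file_id}")
    # Inventory of files.
    file_grp = ET.SubElement(ET.SubElement(root, mets("fileSec")), mets("fileGrp"))
    # Structure map pointing back at the files.
    top_div = ET.SubElement(ET.SubElement(root, mets("structMap")), mets("div"))
    for file_id in file_ids:
        ET.SubElement(file_grp, mets("file"), ID=file_id, ADMID=f"amd-{file_id}")
        ET.SubElement(top_div, mets("fptr"), FILEID=file_id)
    return root


# Hypothetical file identifiers.
descriptor = build_descriptor_skeleton(["file-1", "file-2"])
xml_text = ET.tostring(descriptor, encoding="unicode")
```

Printing `xml_text` shows the outline; where PREMIS events would be placed is left out because the slides do not say.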
Overall schedule • Available now: first release of BatchBuilder 2 for depositor training and testing • Supports 5 content models • Fall 2010 – Summer 2011 • BatchBuilder 2 enhancements & bug fixes • Web Admin 2 development and testing • ~September 2011: BatchBuilder 2 and Web Admin 2 in production
BatchBuilder 2 • Will build batches of objects rather than batches of files • Will automatically determine most technical metadata (using FITS) • Will automatically create all object descriptors (using OTS)