The Metadata Landscape for Digital Video

The Metadata Landscape for Digital Video. Grace Agnew, August 1, 2005, National Library of Medicine. Definition of Metadata: data about data; data that describes, defines or manages data. “Pure” metadata has meaning only in relation to the primary data that is being described.

Presentation Transcript


  1. The Metadata Landscape for Digital Video. Grace Agnew, August 1, 2005, National Library of Medicine

  2. Definition of Metadata • Data about Data • Data that describes, defines or manages data “Pure” metadata has meaning only in relation to the primary data that is being described.

  3. METADATA MAY BE: • auto-generated • automatically harvested from the resource • human-created AUDIENCE MAY BE: • end user • metadata creator/manager • computer application/program

  4. Data Model: • Abstract characterization or “World View” of the data: -- relationships between objects in the model -- “living” data: events occur in the lifecycle of each object in the model -- context independent, so that any context can be supported

  5. ORGANIZATION’S INFORMATION MODEL • ENTITIES: Metadata - Educational Objects - Metadata Creators - Users • ATTRIBUTES: Identify, Define Entities • RELATIONSHIPS: One to one; One to many; Parent, child, sibling; Inheritance • MODEL: Relationships between Entities within a Domain

  6. The Structure of Information (IFLA) • Work: Distinct intellectual or artistic creation • Expression: Intellectual or artistic realization of a work (“interpretation”) • Manifestation: Physical manifestation of an expression. May differ in physical format, but not in content or interpretation • Item: Unique physical instance of a manifestation.

  7. [Diagram: levels of abstraction, from most abstract (Work) to least (Item)] • WORK (intellectual / artistic content): GONE WITH THE WIND • EXPRESSION (interpretation): Novel, Script, Movie • MANIFESTATION (physical recording of content): Paper, PDF, HTML, 70 MM Film, 35 MM Film, DVD, MPEG2 • ITEM (single physical representation of a recording): Copy in Blockbuster, Atlanta, GA; 24 Reels of film, MGM Archive
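
A minimal Python sketch of the Work / Expression / Manifestation / Item hierarchy from the two slides above. The class and field names are illustrative only, and the pairing of each Item with a particular Manifestation is an assumption, since the slide lists the values without saying which belongs to which.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    description: str  # one physical instance of a manifestation


@dataclass
class Manifestation:
    carrier: str  # physical or digital format, e.g. "DVD"
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    form: str  # the interpretation, e.g. "Movie"
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str  # the abstract intellectual / artistic creation
    expressions: List[Expression] = field(default_factory=list)


def build_example() -> Work:
    # Assumption: which item hangs off which manifestation is a guess.
    movie = Expression("Movie", [
        Manifestation("35 MM Film", [Item("24 reels of film, MGM Archive")]),
        Manifestation("DVD", [Item("Copy in Blockbuster, Atlanta, GA")]),
        Manifestation("MPEG2"),
    ])
    novel = Expression("Novel", [Manifestation("Paper"), Manifestation("PDF")])
    return Work("Gone with the Wind", [novel, movie])


def count_items(work: Work) -> int:
    # Walk Work -> Expression -> Manifestation -> Item.
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)
```

Each level owns a list of the level below it, so one Work can carry many Expressions, and a flat schema that records only a single format cannot represent this tree without help.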

  8. OAIS INFORMATION MODEL [Diagram labels: Producer; SIP; Ingest; Descriptive Info (DI); AIP; Archival Storage; Access/Dissemination; DIP; Consumer] OAIS - Reference Model for an Open Archival Information System. From: CCSDS 650.0-R-1: Reference Model for an Open Archival Information System (OAIS). Red Book. Issue 1. May 1999. PDF. Available at: http://ssdoo.gsfc.nasa.gov/nost/isoas/overview.html

  9. End-to-End Metadata Implementation [Diagram labels: Record Structure • Database Population • Data Element Registration • Data Model • Data interchange (other repositories) • Repository Design • Dissemination to Users]

  10. Inside the Digital Information Repository • Persistent Objects: manage objects through changes to hardware, software, players, search & retrieval systems, etc. • Persistent Metadata: manage metadata through schema and data element versioning changes, new metadata formats, I&R changes, hardware & database migrations.

  11. Key Issue for Preservation • Authenticity -- integrity: “digital document must be whole and undisturbed” -- provenance: must be tightly associated with its creator and act of creation • In the analog space: object in hand is compared with a conceptual (“canonical”) historical version. Source: Gladney and Bennett. What do we mean by authentic? http://www.dlib.org/dlib/july03/gladney/07gladney.html

  12. Authenticity • In the digital space -- Fidelity to the source artifact -- Identical (true/false) to the digital canonical master -- Accompanied by a “true” provenance statement -- Proof: digital signature verifying that canonical object is unchanged. Digital audit trail documenting provenance and any changes to artifact or chain of provenance
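
The true/false comparison against a canonical master can be sketched as a fixity check. This is only an illustration, under the assumption that a SHA-256 digest of the master was recorded at ingest. A bare digest is not a digital signature (a signature would additionally bind the digest to a signer's key), and the audit-trail record layout below is hypothetical.

```python
import hashlib
import hmac
from typing import Dict, List


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of the object's bytes."""
    return hashlib.sha256(data).hexdigest()


def is_identical_to_master(candidate: bytes, master_digest: str) -> bool:
    """True only if the candidate is bit-for-bit identical to the master."""
    return hmac.compare_digest(sha256_digest(candidate), master_digest)


def record_event(trail: List[Dict[str, str]], agent: str, action: str,
                 digest: str) -> None:
    """Append one provenance event to a (hypothetical) audit trail."""
    trail.append({"agent": agent, "action": action, "sha256": digest})


# Usage: register the master once, then verify later copies against it.
master = b"...bytes of the canonical digital master..."
trail: List[Dict[str, str]] = []
master_digest = sha256_digest(master)
record_event(trail, "repository", "ingest", master_digest)
copy_is_authentic = is_identical_to_master(master, master_digest)
```

Any single-bit change to the candidate produces a different digest, which is why the comparison is strictly true or false rather than a degree of similarity.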

  13. Metadata Managing the Resource • Administrative metadata: provenance, fixity, context, reference, and lifecycle management. Rights MD may be a subset. • Technical metadata: physical characteristics of the resource. Used to manage digital preservation and display of resource. May be a subset of Administrative MD. Also called Preservation Metadata. • Descriptive metadata: information to discover, identify, select and obtain the resource.

  14. Metadata Managing the Resource • Structural metadata: information about the structured relationship between components of a complex object. May be a subset of Administrative MD. • Meta metadata: metadata that describes and manages the metadata record. Can add “intelligence” to metadata. • Repository design concatenates all types of metadata to support preservation and access to objects in the repository.

  15. METADATA SCHEMA COMPONENTS • Data Element - Atomic Unit of Meaning- Community Defined • Attribute - Refines, Extends, Interprets data element • Value - Information unique to each data element instance • Constraint - Order imposed on data element expression for consistency; semantic viability • Label - contextual instance of data element name. “How the data element displays on the web for the end user.”
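
One way to picture how the five schema components fit together is the sketch below. All names are illustrative, and expressing a constraint as a regular expression is an assumption made for brevity, not something the slide prescribes.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DataElement:
    name: str                         # atomic, community-defined unit of meaning
    label: str                        # how the element displays to the end user
    constraint: Optional[str] = None  # assumed here to be a regular expression

    def validate(self, value: str) -> bool:
        # A value is acceptable if there is no constraint or it matches fully.
        return self.constraint is None or re.fullmatch(self.constraint, value) is not None


@dataclass
class ElementInstance:
    element: DataElement
    value: str                                                 # unique to this instance
    attributes: Dict[str, str] = field(default_factory=dict)  # refine or interpret

    def display(self) -> str:
        return f"{self.element.label}: {self.value}"


# Usage: a date element constrained to YYYY-MM-DD, refined by a "type" attribute.
date_element = DataElement("date", "Date created", r"\d{4}-\d{2}-\d{2}")
instance = ElementInstance(date_element, "2005-08-01", {"type": "created"})
```

Keeping the label on the element definition rather than on each instance means the display text can change for a new context without touching the stored values.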

  16. OAIS – Preservation and Access File Encoding and Transport • METS: Metadata Encoding & Transmission Standard • XML document format for encoding metadata for resource description and management. • “wrapper” that concatenates digital object(s) in multiple formats, metadata, a structure map documenting the organization of the digital object(s), as well as behaviors that act upon digital object(s) • standardized transmission of METS package between repositories and applications

  17. METS: Metadata Encoding & Transmission Standard METS Document has seven major sections: METS Header: minimal descriptive metadata about the METS document itself Descriptive Metadata: metadata describing the digital object, to enable discovery and evaluation. Administrative Metadata: metadata about the creation, use and provenance of the digital object(s). Includes four subtypes: technical, source, rights and digital provenance metadata

  18. METS: Metadata Encoding & Transmission Standard File Section: Includes one or more <fileGrp> elements, to group together related files, such as the different digital manifestations of a file, e.g., the uncompressed digital master, MPEG-4 and QuickTime access files, for a video title. Structural Map: Outlines hierarchical structure of a digital object and links the elements of that structure to relevant content files and metadata

  19. METS: Metadata Encoding & Transmission Standard Structural Links: Contains a single element, <smLink>. Used to record the existence of hyperlinks between items within the structural map. Behavior: Used to associate executable behaviors with content within the METS document. For example, a behavior could automatically launch a video player application when a digital video file is selected for display.
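
The seven sections listed on the three slides above can be sketched as an XML skeleton using Python's standard library. The element names and the METS namespace are the standard ones, but the IDs, file names and USE values are made up for illustration, and the result is not schema-valid as it stands (for example, a real `structLink` needs `smLink` children and the metadata sections need content).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets() -> ET.Element:
    root = ET.Element(q("mets"))
    ET.SubElement(root, q("metsHdr"))            # 1. METS header
    ET.SubElement(root, q("dmdSec"), ID="DMD1")  # 2. descriptive metadata
    ET.SubElement(root, q("amdSec"), ID="AMD1")  # 3. administrative metadata

    # 4. File section: one fileGrp holding the manifestations of one title.
    file_sec = ET.SubElement(root, q("fileSec"))
    group = ET.SubElement(file_sec, q("fileGrp"), ID="GRP1")
    for file_id, use, href in [            # hypothetical IDs and file names
        ("FILE_MASTER", "master", "video_master.avi"),
        ("FILE_MP4", "access", "video_access.mp4"),
        ("FILE_MOV", "access", "video_access.mov"),
    ]:
        f = ET.SubElement(group, q("file"), ID=file_id, USE=use)
        ET.SubElement(f, q("FLocat"),
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href})

    # 5. Structural map: links a division of the object to files and metadata.
    struct_map = ET.SubElement(root, q("structMap"))
    div = ET.SubElement(struct_map, q("div"), TYPE="video", DMDID="DMD1")
    ET.SubElement(div, q("fptr"), FILEID="FILE_MASTER")

    ET.SubElement(root, q("structLink"))   # 6. structural links (left empty)
    ET.SubElement(root, q("behaviorSec"))  # 7. behaviors (left empty)
    return root


mets_xml = ET.tostring(build_mets(), encoding="unicode")
```

The structural map refers to files by ID rather than by path, so an access copy can be replaced in the file section without rewriting the description of the object's structure.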

  20. FEDORA • Background: “Flexible, Extensible Digital Object Repository Architecture” • Developed by Cornell University and University of Virginia via a Mellon Foundation grant. • Utilizes METS (v 2.0: FOXML, interoperable with METS) • http://www.fedora.info/

  21. PREMIS Data Dictionary • Sponsored by OCLC and RLG • Defines a “core” set of preservation metadata elements • Provides a data dictionary supporting the preservation of digital information

  22. PREMIS Data Model [Diagram entities: Intellectual Entities • Rights • Agents • Objects • Events] http://www.oclc.org/research/projects/pmwg/

  23. MPEG-21 Multimedia Framework • Transparent management and use of digital multimedia resources, from creation through consumption. • Key concept is the Digital Item Declaration, which includes structure, resources and metadata bundled in the item. • Repository architecture: LANL’s aDORe, a modular digital object repository architecture modeled on MPEG-21. • http://public.lanl.gov/herbertv/papers/aDORe_20050128_submission.pdf

  24. MXF: Material Exchange Format • “Open file format targeted at the interchange of audiovisual material, with associated data and metadata.” • Intended to support file interoperability between content creation devices, servers and workstations. Supports integration of file-based and streaming resource formats. • Maintains the “documentation chain” for metadata about audiovisual essences throughout the resource lifecycle: creation, broadcast, storage, re-use

  25. MXF: Material Exchange Format • Example: Video footage of hurricane activity in the field has automatic GPS, date/time and duration capture as captions on the footage. MXF can maintain the essence and the metadata captured simultaneously by the camera for use in production, archiving and reuse, without the need to “recatalog” the information. • Example: Footage of a jaguar hunting in Brazil is captioned in the field and transferred with captions to a production facility, where it is packaged into a program, “The Vanishing Rainforest.” The footage is then licensed to a travelog production company. Footage of the jaguar on the DVD “This is Brazil” has online attribution to “The Vanishing Rainforest,” from metadata added in production, as well as attribution to the field cinematographer, location, date and time of capture, from the original captions, with no recreation of metadata.

  26. MXF: Material Exchange Format • File Header: Header partition pack, Header metadata • File Body: Essence Container • File Footer: Footer partition pack • Every item in an MXF file is KLV (Key Length Value) encoded: identified by a unique 16-byte key and by its length. Anything that is not understood or needed (unrecognized keys) can be ignored and skipped over.
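
The skip-what-you-do-not-recognize behavior follows directly from KLV framing, as the sketch below tries to show. It assumes the length field uses BER encoding (short form below 0x80, otherwise a byte count followed by a big-endian length), which is how KLV lengths are commonly encoded. The keys in the usage example are dummy 16-byte values, not real SMPTE labels, and `encode_klv` is a helper invented here to build test data.

```python
from typing import Iterator, List, Set, Tuple

KEY_LENGTH = 16  # every KLV key is 16 bytes


def read_ber_length(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a BER length at pos; return (length, position after it)."""
    first = buf[pos]
    if first < 0x80:                 # short form: the byte is the length
        return first, pos + 1
    count = first & 0x7F             # long form: number of length bytes
    if count == 0:
        raise ValueError("indefinite lengths are not handled in this sketch")
    end = pos + 1 + count
    if end > len(buf):
        raise ValueError("truncated length field")
    return int.from_bytes(buf[pos + 1:end], "big"), end


def iter_klv(buf: bytes) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (key, value) for each KLV triplet in buf."""
    pos = 0
    while pos < len(buf):
        key = buf[pos:pos + KEY_LENGTH]
        if len(key) < KEY_LENGTH:
            raise ValueError("truncated key")
        pos += KEY_LENGTH
        if pos >= len(buf):
            raise ValueError("missing length field")
        length, pos = read_ber_length(buf, pos)
        value = buf[pos:pos + length]
        if len(value) < length:
            raise ValueError("truncated value")
        pos += length
        yield key, value


def select_known(buf: bytes, known: Set[bytes]) -> List[Tuple[bytes, bytes]]:
    """Keep triplets whose key is recognized; silently skip the rest."""
    return [(k, v) for k, v in iter_klv(buf) if k in known]


def encode_klv(key: bytes, value: bytes) -> bytes:
    """Hypothetical helper: encode one triplet with a 3-byte long-form length."""
    return key + b"\x83" + len(value).to_bytes(3, "big") + value


# Usage with dummy keys: the reader understands KEY_A but not KEY_B.
KEY_A = bytes(range(16))
KEY_B = bytes([0xFF] * 16)
stream = encode_klv(KEY_A, b"hello") + encode_klv(KEY_B, b"x" * 200) + KEY_A + b"\x02ok"
recognized = select_known(stream, {KEY_A})
```

Because the length is read before the value is interpreted, a reader can step over a triplet it does not understand without knowing anything about its contents.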

  27. MXF: Material Exchange Format • Header Metadata: • Metadata (DMS-1 or other schema) • Timing and synchronization parameters • Synchronization and Description of the Essence through three packages: • Material Package: Output timeline of the file (tracks and sequence) • File Package: the essence itself • Source Package: Derivation of the essence (“source film stock” descriptions, etc.)

  28. DESCRIPTIVE METADATA SCHEMAS Dublin Core Every element is optional, repeatable, with rules for format and values From “Description of Dublin Core Elements” http://purl.oclc.org/metadata/dublin_core_elements

  29. DUBLIN CORE + • Provides a great deal of flexibility. • Easy to learn. • Ensures interoperability with other schemes. • Good transport protocol when expressed as XML. - • Lacks support for multiple formats • Lacks support for seriality • Technical description (formats, containers, extent, etc.) is weak and not standardized.
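
To illustrate "every element is optional, repeatable" and the XML transport point, here is a small sketch that serializes a flat list of Dublin Core fields. The namespace is the standard Dublin Core elements namespace; the `record` wrapper element and the sample format values are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields: List[Tuple[str, str]]) -> ET.Element:
    """Build a flat record; any element may be absent or repeated."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return record


# Usage: no element is required, and "format" appears twice.
dc_record = build_dc_record([
    ("title", "The Metadata Landscape for Digital Video"),
    ("creator", "Grace Agnew"),
    ("date", "2005-08-01"),
    ("format", "MPEG2"),   # sample value, not taken from the slides
    ("format", "DVD"),     # sample value, not taken from the slides
])
dc_xml = ET.tostring(dc_record, encoding="unicode")
```

The two `format` values sit side by side with nothing to say which identifier, rights statement or file each belongs to, which is the weakness listed in the minus column of the slide.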

  30. PBCore • Intended to address description, preservation and access needs of television, radio, and associated web activities. • Based on Dublin Core: qualifies and expands the 15 Dublin Core data elements. • 58 Data Elements (30 mandatory) • V 1.0 available free of charge for use, via the Corporation for Public Broadcasting. • Maps readily to other schema (Dublin Core, MPEG-7, MODS, etc.)

  31. PBCore

  32. PBCore • Data elements address descriptive and technical metadata for access and management • Simple “linear” data model is easy to apply • Like Dublin Core, does not address issue of “multiple manifestations” (Although both can be used within METS to address this issue).

  33. PBCore – “Qualified” Dublin Core for DV • PBCore: <FormatFileSize>296 MB</FormatFileSize> <FormatImageFrameRate>30 fps</FormatImageFrameRate> • DC “Dumb Down”: <format>296 MB</format> <format>30 fps</format>
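
The "dumb down" step on the slide above amounts to a many-to-one mapping. The sketch below covers only the two PBCore elements shown on the slide; the function name and the policy of dropping unmapped elements are assumptions.

```python
from typing import Dict, List, Tuple

# Only the two mappings shown on the slide; a real crosswalk would be longer.
PBCORE_TO_DC: Dict[str, str] = {
    "FormatFileSize": "format",
    "FormatImageFrameRate": "format",
}


def dumb_down(pbcore_fields: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Map qualified PBCore elements to their unqualified Dublin Core element.

    Elements with no known mapping are dropped (an assumption of this sketch).
    """
    return [
        (PBCORE_TO_DC[name], value)
        for name, value in pbcore_fields
        if name in PBCORE_TO_DC
    ]


# Usage: both values survive, but the qualifier that told them apart is lost.
dc_fields = dumb_down([
    ("FormatFileSize", "296 MB"),
    ("FormatImageFrameRate", "30 fps"),
])
```

The mapping cannot be reversed: once both values are labeled `format`, only the value text itself hints at whether it was a file size or a frame rate.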

  34. MPEG-7: Multimedia Content Description Interface • Synchronization between content and description • Textual indexing: creation information, subjects, concepts, media profiles. • Non-textual indexing: melody and speech recognition, color, shape, scene changes, etc. • Textual format and binary format are completely equivalent. You can use any functionality in textual or nontextual form.

  35. MPEG-7: Multimedia Content Description Interface • Does not support description of analog or textual resources • High-level textual description of component parts (“table of contents”) does not exist. • Some duplication of descriptive information across MPEG-7 description schemes • Documentation and examples are weak, and it is not widely adopted as a descriptive metadata standard.

  36. MPEG-7

  37. MPEG-7 Content Description: Low-level Audio Visual descriptors • Still regions: Color, Shape, Position, Texture • Video segments: Color, Camera motion, Motion activity, Mosaic • Moving regions: Color, Motion trajectory, Parametric motion, Spatio-temporal shape • Audio segments: Spoken content, Spectral characterization, Music (timbre, melody)

  38. MPEG-7 Description Tools Description Schemes (structure) and Descriptors (features)

  39. Dublin Core vs. MPEG-7 – The Challenges • MPEG-7 is a structured, hierarchical schema. • “Work” described in CreationInformation DS • Manifestation/Item described in MediaInformation and UsageInformation DSs • Dublin Core is a “flat” schema that mixes “work” or intellectual content with single manifestation/item description (“1:1 principle”)

  40. MANIFESTATION in DC and MPEG-7 [Diagram labels. Dublin Core: CREATOR, TITLE, SUBJECT, DATE, IDENTIFIER (twice), FORMAT (twice), RIGHTS (twice). MPEG-7: CreationInformation, MediaProfile (twice), MediaInstance, UsageAvailability (twice)]

  41. MODS: Metadata Object Description Schema • XML representation of MARC21 data, to enable seamless transfer of MARC data to XML. • Enables both original description of digital and analog resources and mapping of legacy metadata in MARC to MODS • MODS is represented in application profiles for METS Descriptive MD and OAI-PMH for data sharing and transport

  42. MXF DMS-1: Material Exchange Format – Descriptive Metadata Scheme-1 (SMPTE 380M-2004) • Utilizes SMPTE RP 210 – Metadata Dictionary Registry of Metadata Element Descriptions • Data model and core rules are taken from AAF, so that DMS-1 can be seen as an application of AAF. • Utilizes a collection of descriptive metadata frameworks. • Supports migration of DM from one MXF file to another when essence is migrated or reused.

  43. MXF DMS-1 • Frameworks: “grouping of related descriptive metadata properties and sets, which describe the contents of an MXF file body.” • Production framework: “provide[s] identification and ownership details of the audio-visual content in the file body.” “Applies to the complete input or output of the MXF file as a whole.” • Clip framework: “provide[s] capture and creation information about the individual ‘audio-visual’ clips in the file body.” “A ‘clip’ is a continuous essence element, or essence element interleave, in the essence container.”

  44. MXF DMS-1 • Scene framework: “describe[s] actions and events within individual scenes of the audio-visual content of the file body.” “Scene is an editorial concept and describes a continuous section of content in an MXF file.”

  45. MXF DMS-1 Production framework [Diagram labels: Metadata Server Locator • Publication • Event • Titles • Captions Description • Annotation • Identification • Classification • Group Relationship • Annotation • Cue Words • Branding • Setting/Period • Related Material Locator • Award • Contract • Participant • Picture Format • Rights • Project]

  46. MXF DMS-1 Clip framework [Diagram labels: Name-value • Metadata Server Locator • Classification • Annotation • Titles • Cue Words • Related Material Locator • Scripting • Captions Description • Shot • Picture Format • Scripting Locator • Processing • Contract • Device Parameters • Project • Key Point • Participant • Cue Words • Name-value • Rights]

  47. MXF DMS-1 Scene framework [Diagram labels: Name-Value • Metadata Server Locator • Classification • Titles • Cue Words • Annotation • Setting period • Related Material Locator • Shot • Participant • Contacts List • Key Point • Cue Words]
