
Update and Thoughts on Directions for Metadata Work

This presentation provides an update on ongoing metadata activities, user studies, and efforts to understand relevant standards for integrating metadata. The focus is on conceptual tasks related to identifying elements, attributes, and values, and practical tasks in finding and acquiring metadata content. The presentation also explores potential directions for further development.

cvroman

Presentation Transcript


  1. Update and Thoughts on Directions for Metadata Work Carol Hert March 17, 2003

  2. Our Metadata Activities • User study to understand the metadata necessary for integration tasks (we’re finding needs for metadata that is not available in agencies) • Ongoing efforts to understand DDI and ISO11179 for deployment in end-user tools • Identification of a host of other relevant standards (open archives, business XML, Z39.50, …) • Marked-up tables using DDI • Attempting to acquire particular metadata

  3. Metadata Aspects for GovStat • Conceptual Tasks • Determining elements and attributes to be used in wrapping data and contextual info (an XML DTD presumably) • User study et al. to determine appropriate content • “Thought” experiments with implementations related to elements, attributes, and their values • Developing a conceptual metadata model for SKN • Practical Tasks • Finding the actual metadata content to be “wrapped” via the elements • Finding data with metadata to port into tools
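
To make the idea of "wrapping" data and contextual info in elements and attributes concrete, here is a minimal sketch in Python. The element and attribute names (`datum`, `value`, `context`, `source`, `period`, `variable`, `unit`) are hypothetical and are not taken from DDI, ISO11179, or any GovStat DTD; the value and source are placeholders, not real statistics.

```python
import xml.etree.ElementTree as ET


def wrap_value(value, *, variable, unit, source, period):
    """Wrap one data value in a hypothetical XML element.

    Attributes carry short descriptors; child elements carry contextual info.
    """
    datum = ET.Element("datum", {"variable": variable, "unit": unit})
    ET.SubElement(datum, "value").text = str(value)
    context = ET.SubElement(datum, "context")
    ET.SubElement(context, "source").text = source
    ET.SubElement(context, "period").text = period
    return datum


# Placeholder content for illustration only.
wrapped = wrap_value(
    100.0,
    variable="gasoline_price_index",
    unit="index points",
    source="example agency",
    period="2003-02",
)
xml_text = ET.tostring(wrapped, encoding="unicode")
print(xml_text)
```

Which pieces belong in attributes and which in child elements is exactly the kind of decision the conceptual tasks would have to settle; the split shown here is arbitrary.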

  4. Today’s Presentation • Focus on the Conceptual Tasks • Status report on potentially relevant standards and projects • Considering the user tools and the public intermediary • Start strategizing on directions to pursue further

  5. Concept. Task 1: Identifying Elements, Attributes, and Values • Current Contenders for Elements, Attributes (and some values) • DDI (and its implementations) • ISO11179 (and its implementations) • Hybrids • Corporate Metadata Repository (CMR) from Oracle • Data cubes for Tables from NESSTAR, DDI

  6. DDI • Data set is the basic element • Data archives perspective: designed primarily for people who archive data sets and those who will retrieve and reuse those data sets • Does capture information on variables, values, etc. • Still actively working on specifications for tables (see Ryssevik memo 3/6/2003)
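
As a rough illustration of how DDI captures information on variables and values, the sketch below reads a small DDI-style fragment in Python. The fragment uses DDI codebook element names (`codeBook`, `dataDscr`, `var`, `labl`, `catgry`, `catValu`) but is heavily simplified: it omits namespaces and required study-level sections, is not schema-valid, and its variable and categories are invented for the example.

```python
import xml.etree.ElementTree as ET

# Simplified, invented fragment in the style of a DDI codebook.
DDI_FRAGMENT = """
<codeBook>
  <dataDscr>
    <var name="REGION">
      <labl>Region of residence</labl>
      <catgry><catValu>1</catValu><labl>Northeast</labl></catgry>
      <catgry><catValu>2</catValu><labl>South</labl></catgry>
    </var>
  </dataDscr>
</codeBook>
"""


def read_variables(xml_text):
    """Return {variable name: {"label": ..., "categories": {value: label}}}."""
    root = ET.fromstring(xml_text.strip())
    variables = {}
    for var in root.iter("var"):
        categories = {
            cat.findtext("catValu"): cat.findtext("labl")
            for cat in var.findall("catgry")
        }
        variables[var.get("name")] = {
            "label": var.findtext("labl"),
            "categories": categories,
        }
    return variables


variables = read_variables(DDI_FRAGMENT)
print(variables)
```

Note that everything here hangs off a single data set description, which reflects the data-set-centred view described on this slide.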

  7. DDI Issues • Doesn’t have a good mechanism for relating surveys and instances of those surveys: each data set is treated as stand-alone • Hard to compare across variables and time series • Elements for tables are still in development, and other data presentations (such as news releases, graphics) are not well developed • Currently working backwards to a conceptual model for the metadata

  8. DDI Implementations of Note • Counting California • Virtual Data Center (Harvard/MIT) • NESSTAR/FASTER • Developed CRISTAL datacubes and FasterCubes • Minnesota Population Center • Developed WendyCubes for data cubes • WendyCubes and FasterCubes being merged • DataFerrett (Census)

  9. ISO11179 • Written from the data producers’ perspective (Dan argues that it doesn’t take any perspective) • Able to relate survey instances, etc. • Isn’t capable of handling the full range of metadata we might need, nor can it handle data representations such as news releases, web pages, etc. (same problem as with DDI)

  10. ISO11179 Implementations • StatCanada • Dan G. has reservations about this implementation and feels it doesn’t meet the standard (more to come as I understand the problem better)

  11. Is CMR the answer? • CMR is a registry that describes data, data processes, and data quality, and that links to datasets and data • CMR incorporates all of ISO11179 and DDI, and in addition can support a variety of metadata types (such as those news releases) • CMR is not open source; cost unknown (software cost and Oracle consultants) • Two good contacts for us • Dan has gotten it for BLS • Sarah Nusser is acquiring it for Iowa State

  12. Segue to Conceptual Task 2 • My original goal was to determine what metadata elements would be necessary for a given end-user tool (e.g. the SIG) and determine which standard(s) could provide the necessary functionality (enabling metadata to get from agencies to the user tools) • I started by looking at the SIG and also at DDI implementations to see what functionalities we could acquire

  13. The Plot Thickens • Two new questions emerged from these activities • What functions/information (data & metadata) would be necessary in SKN? • What other standards efforts should be considered in creating the SKN?

  14. The SKN Architecture

  15. Standards, projects, and their functions • Internal to agencies (agency data production): CMR; proprietary metadata repositories; presentation formats (HTML, XML, PDF, etc.); database formats (ACCESS, ALMIS); DDI Datacubes NESSTAR/Faster CRISTAL; XML for Analysis; Common Warehouse Metadata Model; statistical disclosure (SDC in Nesstar); StatCan ISO imp. • Internal to agencies (data archives): DDI (and DDI for datacubes); NESSTAR/Faster CRISTAL • Public intermediary: middleware (whatever that includes); NEOOM from Nesstar/Faster; from Virtual Data Center (VDC): federated metadata harvesting, repository exchange and caching, federated authentication and authorization, naming; searching: Z39.50 • Possible SKN user tools/functions: data analysis, bookmarking, downloading datasets (Nesstar); cataloging, archiving functions (VDC); online search, data conversion, exploration, data analysis (VDC); Glossary (The Neuchatel Group); Statistical Interactive Glossary (SIG, our project); ontologies (ISI/Columbia for gas); relation browsers; online help • Transfers: Z39.50 (used by VDC); Open Archives (VDC); DC, MARC, DDI metadata import and export (VDC); SOAP; HTTP; RDF (Nesstar); ASN.1

  16. New Strategic Direction for Us? • Specification of metadata necessary throughout SKN? • Will require specification of interactions among components of SKN • And perhaps the specification of specific standards

  17. An example of a possible interaction • User, via the interface: “I want data on gasoline price indices in the state of MD” • Query is transferred to the intermediary • Intermediary query agent has a business rule requiring a check of terms, so it forwards the term “indices” to the SIG

  18. Example continued • SIG responds with 3 definitions of index (differing in specificity of definition) and multiple display options • Intermediary business rule says to take the most general definition and to use the term “index” in queries sent to agency data sources • Etc.
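
The interaction in this example can be sketched in Python as follows. This is only an illustration under stated assumptions: the class names (`Definition`, `Glossary`, `Intermediary`), the numeric `specificity` score, and the definition texts are all hypothetical stand-ins, not the SIG's actual interface or content, and display options are left out.

```python
from dataclasses import dataclass


@dataclass
class Definition:
    term: str         # preferred form of the term
    text: str         # placeholder definition text
    specificity: int  # assumed score: lower means more general


class Glossary:
    """Hypothetical stand-in for the SIG."""

    def __init__(self, entries, variants):
        self._entries = entries    # preferred term -> list of Definition
        self._variants = variants  # variant form -> preferred term

    def lookup(self, term):
        key = self._variants.get(term.lower(), term.lower())
        return self._entries.get(key, [])


class Intermediary:
    """Applies the business rule: check listed terms, take the most general definition."""

    def __init__(self, glossary, terms_to_check):
        self.glossary = glossary
        self.terms_to_check = terms_to_check

    def rewrite(self, query):
        rewritten = []
        for word in query.split():
            definitions = (
                self.glossary.lookup(word)
                if word.lower() in self.terms_to_check
                else []
            )
            if definitions:
                most_general = min(definitions, key=lambda d: d.specificity)
                rewritten.append(most_general.term)
            else:
                rewritten.append(word)
        return " ".join(rewritten)


sig = Glossary(
    entries={
        "index": [
            Definition("index", "placeholder: general definition", 1),
            Definition("index", "placeholder: intermediate definition", 2),
            Definition("index", "placeholder: most specific definition", 3),
        ]
    },
    variants={"indices": "index"},
)
intermediary = Intermediary(sig, terms_to_check={"indices"})
agency_query = intermediary.rewrite("gasoline price indices in the state of MD")
print(agency_query)  # gasoline price index in the state of MD
```

Even this toy version has to fix what a query looks like, what the glossary may return, and how the intermediary chooses among responses, which is the kind of specification the next slide argues for.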

  19. New Strategic Direction for Us? • Specification of functions (and related information) necessary throughout SKN? • Will require specification of interactions among components of SKN (possible queries, acceptable responses, bindings among agents, etc.) • And perhaps the specification of specific standards
