Dublin Core for Museums, Day 1 • Paul Miller, UK Office for Library & Information Networking, p.miller@ukoln.ac.uk • Thomas Hofmann, Australian Museums On-Line, thomash@amol.org.au • John Perkins, CIMI, jperkins@cimi.org
Overview for Thursday March 25 • Introduction to Metadata • Introducing the Dublin Core • CIMI DC Guidelines - Dublin Core for Museums • Break • DC for museums continued... • Lunch • Practicalities of Implementing DC • Break • Introduction to MICI
What’s the Problem? • Need to serve a Web audience • Demand for content • Uncertain quality • Expectations for rapid easy access • Need to be visible on the Web • Two million web sites • Half a billion addressable pages • Many communities with the same problem
What’s the Problem? • Manage and organise interconnected data • Different types • Different repositories • Packages • Interoperate with other communities • Interoperate with other applications • Need a way to: • Express meanings in rich and complex data • Express the structure of our data • Encode the transfer of data
What’s the Solution? • Communities address their own needs • Do so in a way that works across communities • Standards based • Collaborative
What is a Community? • A resource description community is characterised by agreed semantic, structural and syntactic conventions for exchange of descriptive information • Libraries: MARC, AACR2 • Museums: MICI, SPECTRUM • Based on a slide by Stu Weibel
Communities working together • [Diagram: Commerce, Home Pages, Libraries, Geo, Scientific Databases, Museums, Whatever... around an ‘Internet Commons’] • Based on a slide by Stu Weibel
Communities working together • [Diagram: Museums among other communities, each labelled ‘Metadata’] • Based on a slide by Stu Weibel
What is Metadata? • Meaningless jargon • or a fashionable term for what we’ve always done • or “a means of turning data into information” • and “data about data” • and the name of a film director (‘Luc Besson’) • and the title of a book (‘Lord of the Flies’).
What is Metadata? • Metadata exists for almost anything • People • Places • Objects • Concepts • Databases • Web pages
What is Metadata? • Metadata fulfils three main functions: • description of resource content • “What is it?” • description of resource form • “How is it constructed?” • description of issues behind resource use • “Can I afford it?”.
What is Metadata? • Many structures have evolved at different levels, and to meet different requirements... MICI
For human communication we need... • Semantic Interoperability: standardisation of content • “Let’s talk English” • “cat milk sat drank mat” • Structural Interoperability: standardisation of form • “Here’s how to make a sentence” • “Cat sat on mat. Drank milk.” • Syntactic Interoperability: standardisation of expression • “These are the rules of grammar” • “The cat sat on the mat. It drank some milk.”
Challenges and Opportunities • Many flavours of metadata • which one do I use? • Managing change • new varieties, and evolution of existing forms • Tension between functionality and simplicity, extensibility and interoperability • [Diagram labels: ‘Functions, features, and cool stuff’; ‘Simplicity and interoperability’]
Introducing the Dublin Core • An attempt to improve resource discovery on the Web • now adopted more broadly • Building an interdisciplinary consensus about a core element set for resource discovery • simple and intuitive • cross–disciplinary • international • flexible.
Introducing the Dublin Core • 15 elements of descriptive metadata • All elements optional • All elements repeatable • The whole is extensible • offering a starting point for semantically richer descriptions • Interdisciplinary • libraries, museums, government, education... • International • available in 20 languages, with more on the way.
Introducing the Dublin Core • Title • Creator • Subject • Description • Publisher • Contributor • Date • Type • Format • Identifier • Source • Language • Relation • Coverage • Rights http://purl.org/dc/
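The "all elements optional, all elements repeatable" rule can be pictured as a small data structure. The following is only a minimal sketch: the `DCRecord` class and its method names are hypothetical, invented here for illustration, and the sample values are not taken from any real catalogue.

```python
from collections import defaultdict

# The 15 Dublin Core elements listed on the slide above.
DC_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


class DCRecord:
    """Hypothetical container: every element is optional and repeatable."""

    def __init__(self):
        # Each element maps to a list, so it may hold zero, one or many values.
        self._values = defaultdict(list)

    def add(self, element, value):
        if element not in DC_ELEMENTS:
            raise ValueError(f"{element!r} is not one of the 15 DC elements")
        self._values[element].append(value)
        return self  # returning self allows chained calls

    def get(self, element):
        # Return a copy; an element that was never used yields an empty list.
        return list(self._values.get(element, []))


# Illustrative values only.
record = DCRecord()
record.add("Title", "Sutton Hoo helmet").add("Subject", "Helmets").add("Subject", "Anglo-Saxon")
print(record.get("Subject"))    # a repeated element
print(record.get("Publisher"))  # an omitted element
```

Storing every element as a list keeps "optional" and "repeatable" as one rule rather than two special cases.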
Extending DC (semantic refinement) • Improve descriptive precision by adding sub–structure (subelements and schemes) • Element qualifier • Value qualifier • Greater precision = lesser interoperability • Should ‘dumb down’ gracefully • [Diagram: Creator, with sub-elements First Name, Surname, Affiliation, Contact Info] • Based on a slide by Stu Weibel
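One way to read "dumb down gracefully" is that a consumer which does not understand a qualifier can drop it and still be left with a usable simple element. The sketch below assumes a dotted naming convention such as `Creator.Surname` (borrowed from the `DC.Date.Created` style used later in these slides); the `dumb_down` function and the sample values are hypothetical.

```python
def dumb_down(qualified):
    """Collapse qualified names such as 'Creator.Surname' to plain 'Creator'."""
    simple = {}
    for name, values in qualified.items():
        base = name.split(".", 1)[0]  # drop the qualifier, keep the element
        simple.setdefault(base, []).extend(values)
    return simple


# Hypothetical qualified record; the values are placeholders.
qualified_record = {
    "Creator.FirstName": ["John"],
    "Creator.Surname": ["Smith"],
    "Title": ["An example resource"],
}
print(dumb_down(qualified_record))
```

The result is less precise (first name and surname become two undifferentiated Creator values), which is the trade-off the slide names: greater precision means lesser interoperability.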
Extending DC (a modular approach) • Modular extensibility... • additional elements to support local needs • complementary packages of metadata • …but only if we get the building blocks right • [Diagram: packages labelled Description, Terms & Conditions, Archival Management] • Based on a slide by Stu Weibel
Extending DC? • DC offers a semantic framework • through use of further substructure, meaning can often be clarified • <Creator> “John”: John Inc.? John xyz? xyz John? • <Creator> <fore name> “John”: clarifies which of John Inc., John xyz and xyz John is meant
Extending DC? • DC offers a semantic framework • Use of domain–specific schemes greatly increases precision • <Coverage> “Washington”: Washington State? Washington DC? Washington monument? • <Coverage> <TGN> “Washington”: “North and Central America, United States, Washington” • http://gii.getty.edu/tgn_browser/
Dublin Core in the physical world • Dublin Core originally designed with electronic resources in mind • Physical resources are fundamentally different • Issues of surrogacy become more important • Genre, Type, and Format models vary greatly • Difficult to remember what is being described, and which characteristics of the resource and its surrogates are ‘correct’.
Introducing Physical Objects • Aspects of the real world are key to much of what museums do • Physical objects have dimensions • 23 x 46 cm • 12 x 52 x 18 in • 18.6 cm³ • 823 pages • Physical objects have a form • oil on canvas • Tadcaster limestone • stainless steel.
Introducing Physical Objects • Physical objects change over time • constructed between AD524 and 873 • repaired in AD1270 • incorporated into ornamental arch in AD1320 • Physical objects move • cast in Beijing • used in Shanghai • taken to Hong Kong • on display in Macau.
Introducing Physical Objects • Physical objects are associated with people • written by William Shakespeare • acquired by Lord Elgin • decreed by the Emperor Hadrian • associated with Prince Charles Edward Stuart • Physical objects are contextualised • fired at the Battle of Trafalgar • carried on Apollo 11 from the moon • printed on the first printing press • salvaged from the Titanic.
Introducing Collections • Museum objects, whether original or surrogate, are normally part of a collection • Collections may be ‘real’... • the Sutton Hoo hoard • the Terracotta Warriors • ...an aspect of the process by which objects enter the museum... • the Burrell Collection • Solomon Guggenheim’s art collection • …or simply practical • coins at the British Museum • the Tate Gallery’s collection of works by Da Vinci.
Introducing Surrogacy • Many of the resources we describe are, in reality, surrogates for something else • a photograph of King Tutankhamen’s death mask • a photograph of a statue of George Washington • a film of President Kennedy’s assassination • a sound recording of Neil Armstrong’s “One small step for man…” speech on the moon • a copy of the Mona Lisa • a model of the Great Wall of China • a reproduction of the Terracotta warriors.
Issues of Surrogacy • Many of the resources we describe are, in reality, surrogates for something else • we need to be clear whether we are describing the resource or its surrogate • the sculptor of a statue is often not the person who made its photographic surrogate • the model of the Forbidden City is unlikely to have been created at the same date as the Forbidden City itself • the format of a computer image of the Mona Lisa (image/jpeg?) is not the same as the format of the original painting (oil on canvas?).
Other Museum Issues • Museums need to describe real objects and surrogates in a similar manner • guidelines/standards therefore need to encompass both, despite their differences • Resource descriptions will often be drawn from existing collection management systems in the first instance, rather than created afresh • guidelines therefore need to respect existing practices within established systems • There is often no ‘right’ answer • so practices need to allow for approximate dates, multiple possible creators, etc.
Introducing the 1:1 Principle • The broader Dublin Core community is tackling some of the problems relevant to museums • Their work on the ‘1:1 Principle’ is especially useful in resolving museum issues over original versus surrogate and item versus collection: • each Dublin Core ‘record’ should describe only one resource, whether surrogate or original. Associated resources should be linked together by means of the Relation element in Dublin Core.
Introducing the 1:1 Principle • In a record describing a photo of the Mona Lisa on a web page, for example… • Leonardo da Vinci is not the creator of the image • The image was not created during the Renaissance • …but you might include these as Subject terms, and you could usefully provide a link to the record describing the real painting via Dublin Core’s Relation element • Equally, in describing the painting itself… • http://www.louvre.fr/…/monalisa.jpg is not the Identifier of the painting • but you might link to this image via Relation, just to show people what the painting looks like.
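The Mona Lisa example can be written out as two records, one per resource, joined through Relation. This is a sketch under stated assumptions: the `example:` identifiers, the `related` helper and the placeholder creator of the photograph are all hypothetical, and plain dictionaries stand in for whatever record format a museum actually uses.

```python
# One record per resource (the 1:1 Principle). All identifiers are hypothetical.
painting = {
    "Identifier": ["example:painting/mona-lisa"],
    "Title": ["Mona Lisa"],
    "Creator": ["Leonardo da Vinci"],
    "Format": ["oil on canvas"],
    "Relation": ["example:image/mona-lisa-jpeg"],
}
photo = {
    "Identifier": ["example:image/mona-lisa-jpeg"],
    "Title": ["Photograph of the Mona Lisa"],
    "Creator": ["(whoever made the image)"],          # not Leonardo
    "Format": ["image/jpeg"],
    "Subject": ["Leonardo da Vinci", "Renaissance"],  # about, not by
    "Relation": ["example:painting/mona-lisa"],
}


def related(record, catalogue):
    """Follow Relation values to other records held in the same catalogue."""
    index = {ident: rec for rec in catalogue for ident in rec["Identifier"]}
    return [index[target] for target in record.get("Relation", []) if target in index]


catalogue = [painting, photo]
for rec in related(photo, catalogue):
    print(rec["Title"][0], "by", rec["Creator"][0])
```

Keeping the artist out of the photograph's Creator element, while still listing him under Subject, means a search for works by Leonardo returns the painting and a search about Leonardo returns both.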
The primacy of ‘Type’ • In describing museum objects, it is often most useful to first decide what you are describing and why, rather than beginning with ‘who made it’ and ‘what is it called’, as is often the case with books • if you know you’re describing a surrogate of the Mona Lisa, then you know Leonardo da Vinci is not the Creator; whoever made the surrogate is • if you know you’re describing a collection of 20th century paintings, then you know that Picasso, Hockney et al are not the Creators; the collector is.
The primacy of ‘Type’ • if you know you’re describing the Sutton Hoo helmet, then the fact that it was added to a particular museum collection in 1939 perhaps doesn’t matter; that information is better placed in the collection record • if you know you’re describing a natural specimen, then perhaps it has no Creator; there may be a ‘creator’ associated with its identification or collection, though.
Dublin Core for Museums: Assumptions • In applying Dublin Core to museums, we are making certain basic assumptions, many of which were tested by CIMI • DC is appropriate for use in describing both physical and digital resources • DC is easy to learn and simple to use • Information can be meaningfully and efficiently extracted from existing museum systems in order to populate DC records • the creation of a DC record to describe a museum object is cost–effective, and aids the discovery of resources more than simply allowing access to the underlying Collection Management system might.
Practicalities of Implementing Dublin Core • Paul Miller, UK Office for Library & Information Networking, p.miller@ukoln.ac.uk • Thomas Hofmann, Australian Museums On-Line, thomash@amol.org.au
Overview • Creation and Maintenance • Harvesting and Distribution • Retrieval • Implementation Models • Case Study
Dublin Core - Refresher • 15 simple elements • Focus on Resource Discovery not Resource Description • One Dublin Core record per resource • Interoperable across communities • Can be easily populated from existing databases • Can be formatted in XML/RDF or HTML
When should I use Dublin Core? • You have a rich standard but need a simpler one • You want to disclose your data to other communities using commonly understood semantics • You want to provide unified access to databases with different underlying schemas • You need core description semantics and don’t feel compelled to invent them anew
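"Unified access to databases with different underlying schemas" usually comes down to a crosswalk: a per-system table mapping local field names onto Dublin Core elements. The sketch below is illustrative only. The two systems, their field names and the sample rows are invented, and `to_dublin_core` and `search` are hypothetical helpers, not part of any real tool.

```python
# Hypothetical field names for two collection systems; neither is a real product schema.
CROSSWALKS = {
    "system_a": {"object_name": "Title", "maker": "Creator", "made": "Date"},
    "system_b": {"label": "Title", "artist": "Creator", "materials": "Format"},
}


def to_dublin_core(source, row):
    """Map one local database row onto simple Dublin Core elements."""
    mapping = CROSSWALKS[source]
    record = {}
    for field, value in row.items():
        element = mapping.get(field)
        if element and value:  # skip unmapped fields and empty values
            record.setdefault(element, []).append(value)
    return record


def search(records, element, term):
    """Case-insensitive substring search over one DC element."""
    term = term.lower()
    return [r for r in records if any(term in v.lower() for v in r.get(element, []))]


# Invented sample rows, one from each system.
records = [
    to_dublin_core("system_a", {"object_name": "Bronze bell", "maker": "Unknown", "shelf": "B12"}),
    to_dublin_core("system_b", {"label": "Landscape study", "artist": "A. Painter", "materials": "oil on canvas"}),
]
print(search(records, "Creator", "painter"))
```

Local fields with no Dublin Core equivalent (the shelf location here) simply stay behind in the source system, which fits the slides' emphasis on resource discovery rather than full description.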
Considerations • Creation and Maintenance tools • Harvesting/Distribution tools • Retrieval tools • [Diagram labels: educate, consensus, interface design]
Encoding Dublin Core • HTML • Unqualified • Easy • Qualified • Overloaded Content (HTML 3.2) • Additional Attribute (HTML 4) • RDF • Based on XML • Sophisticated • More complex
Encoding Dublin Core - Unqualified
<HEAD>
<META NAME="DC.Title" CONTENT="My Web Page">
<META NAME="DC.Subject" CONTENT="Computers, Metadata">
</HEAD>
Encoding Dublin Core - Qualified (HTML 3.2)
<HEAD>
<META NAME="DC.Subject" CONTENT="(SCHEME=AAT)(LANG=EN)Statue, Granite">
</HEAD>
Encoding Dublin Core - Qualified (HTML 4)
<HEAD>
<META NAME="DC.Subject" SCHEME="AAT" LANG="EN" CONTENT="Statue, Granite">
</HEAD>
Encoding Dublin Core - Sub-Elements
<HEAD>
<META NAME="DC.Date.Created" CONTENT="(SCHEME=ISO8601) 1999-03-01">
<META NAME="DC.Date.Modified" SCHEME="ISO8601" CONTENT="1999-03-25">
</HEAD>
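Tags in the HTML 4 style shown on these slides are regular enough to generate from a database export. The following is a minimal sketch: `meta_tag` is a hypothetical helper written for this illustration, and the attribute order (NAME, SCHEME, LANG, CONTENT) simply copies the slides.

```python
from html import escape


def meta_tag(element, content, scheme=None, lang=None):
    """Build one HTML 4 style Dublin Core META tag (hypothetical helper)."""
    attrs = [("NAME", f"DC.{element}")]
    if scheme:
        attrs.append(("SCHEME", scheme))
    if lang:
        attrs.append(("LANG", lang))
    attrs.append(("CONTENT", content))
    # Escape attribute values so quotes or ampersands in the data cannot break the tag.
    rendered = " ".join(f'{name}="{escape(value, quote=True)}"' for name, value in attrs)
    return f"<META {rendered}>"


print(meta_tag("Title", "My Web Page"))
print(meta_tag("Subject", "Statue, Granite", scheme="AAT", lang="EN"))
print(meta_tag("Date.Modified", "1999-03-25", scheme="ISO8601"))
```

Escaping matters in practice because museum titles and descriptions often contain quotation marks.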
Encoding Dublin Core - RDF
...
<?xml:namespace href="http://iso.ch/8601/" as="ISO"?>
<RDF:RDF>
  <RDF:Description …>
    <DC:Date>
      <RDF:Description>
        <ISO:date>1999-03-25</ISO:date>
      </RDF:Description>
    </DC:Date>
  </RDF:Description>
</RDF:RDF>
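For comparison, a simple record can be serialised as RDF/XML with Python's standard library. Note the assumptions: this sketch uses the RDF and Dublin Core 1.1 namespace URIs as they were later standardised, not the draft namespace syntax shown on the slide, and `dc_to_rdf` and the `http://example.org/page` identifier are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the later RDF and Dublin Core 1.1 specifications
# (an assumption: the slide itself uses an earlier draft syntax).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def dc_to_rdf(about, record):
    """Serialise a simple DC record (element -> list of values) as RDF/XML."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for element, values in record.items():
        for value in values:  # repeatable elements become repeated child nodes
            child = ET.SubElement(description, f"{{{DC_NS}}}{element.lower()}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = dc_to_rdf(
    "http://example.org/page",  # hypothetical resource identifier
    {"Title": ["My Web Page"], "Date": ["1999-03-25"]},
)
print(xml_text)
```

Building the tree with a library rather than by string concatenation leaves the closing tags and escaping to the serialiser, which avoids the unbalanced-element mistakes that are easy to make by hand.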
Example Tool: DC Dot • http://www.ukoln.ac.uk/metadata/dcdot/ • Semi-automated generation of Dublin Core • Cut and paste into document • Conversions to HTML, SOIF, XML, WHOIS++, USMARC, GILS
Example Tool: DC Dot Screenshot of http://www.ukoln.ac.uk/metadata/dc-dot/
Example Tool: DC Dot Screenshots of DC Dot output
Example Tool: Reggie • http://metadata.net • Generic creation tool for any metadata schema published to metadata.net • Currently supports: Dublin Core in 5 languages • Syntax: HTML META tags (V3.2 and 4.0), RDF