Linked Open Data: Opportunities & Barriers for Archives. Archives 360, Society of American Archivists, Chicago, USA, 26th August 2011. Adrian Stevenson, LOCAH Project Manager, UKOLN, University of Bath, UK.
The goal of Linked Data is to enable people to share structured data on the Web as easily as they can share documents today. Bizer/Cyganiak/Heath Linked Data Tutorial, linkeddata.org
Linked Data Design Issues • URIs • LD Design Issues • Triples http://www.w3.org/DesignIssues/LinkedData.html
Triples • Triple statements • ‘Things’ have ‘properties’ with ‘values’ • Subject – Predicate – Object • Triples are the basis of RDF and Linked Data • Examples from the diagram: Keith Richards – Is Member Of – The Rolling Stones; Repository – Provides Access To – ArchivalResource
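The subject, predicate, object pattern on this slide can be sketched with plain Python tuples. This is only an illustration: the labels are taken from the slide's diagram (the pairing of labels is read from that diagram), real Linked Data would use URIs rather than plain strings, and the helper name `objects_of` is hypothetical.

```python
# Each triple is (subject, predicate, object).
# Plain strings stand in for the URIs a real RDF dataset would use.
triples = [
    ("Keith Richards", "isMemberOf", "The Rolling Stones"),
    ("Repository", "providesAccessTo", "ArchivalResource"),
]


def objects_of(triples, subject, predicate):
    """Return every object (value) stated for a subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects_of(triples, "Keith Richards", "isMemberOf"))
```

Because every statement has the same three-part shape, statements from different datasets can be pooled into one list and queried in the same way.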
LOCAH Project • Linked Open Copac and Archives Hub • Funded by #JiscEXPO 2/10 ‘Expose’ call • 1 year project. Started August 2010 • Partners & Consultants: • UKOLN, Mimas, Eduserv, Talis, OCLC, Ed Summers • http://blogs.ukoln.ac.uk/locah/
What is LOCAH Doing? • Part 1: Exposing Archives Hub & Copac data as Linked Data • Part 2: Creating a prototype visualisation • Part 3: Reporting on opportunities and barriers
Archives Hub Model [diagram] • Entities: Repository (Agent), Finding Aid, EAD Document, ArchivalResource, Biographical History, Level, Language, Extent, Creation, TemporalEntity, Place, PostcodeUnit, Concept, ConceptScheme, Agent, Person, Family, Organisation, Object, Book, Genre, Function, Birth, Death • Paired relationships: administeredBy/administers, maintainedBy/maintains, encodedAs/encodes, hasPart/partOf, accessProvidedBy/providesAccessTo, hasBiogHist/isBiogHistFor, topic/page • Other relationships: in, level, language, origination, extent, inScheme, associatedWith, representedBy, product of, participates in, at time, foaf:focus, Is-a
We’re Linking Data! • If something is identified, it can be linked to • We take items from our datasets and link them to items from other datasets BBC Copac VIAF DBPedia GeoNames Archives Hub
Enhancing our data • Already have some links: • Time - reference.data.gov.uk URIs • Location - UK Postcodes URIs and Ordnance Survey URIs • Names - Virtual International Authority File • VIAF matches and links widely-used authority files - http://viaf.org/ • Names - DBPedia • Also looking at: • Subjects - Library of Congress Subject Headings and DBPedia • Open Calais for entity extraction – from ‘bioghist’ field
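A link between datasets is itself a triple whose subject and object are URIs from different sources. The sketch below assumes `owl:sameAs` as the linking predicate (the slides do not say which predicate LOCAH uses) and writes the links as N-Triples lines. The Archives Hub URI is the one shown in this presentation; the VIAF and DBPedia targets are placeholders, not real identifiers, and `to_ntriples` is a hypothetical helper.

```python
# Assumption: owl:sameAs is used to say two URIs identify the same thing.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Archives Hub person URI as shown in the presentation.
hub_person = (
    "http://data.archiveshub.ac.uk/id/person/nra/"
    "webbmarthabeatrice1858-1943socialreformer"
)

# Placeholder link targets: NOT real identifiers.
links = [
    (hub_person, OWL_SAME_AS, "http://viaf.org/viaf/PLACEHOLDER-ID"),
    (hub_person, OWL_SAME_AS, "http://dbpedia.org/resource/PLACEHOLDER-NAME"),
]


def to_ntriples(triples):
    """Serialise URI-only triples, one N-Triples statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


print(to_ntriples(links))
```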
http://data.archiveshub.ac.uk/id/person/nra/webbmarthabeatrice1858-1943socialreformer
Visualisation Prototype • Using Timemap (Google Maps and Simile) • http://code.google.com/p/timemap/ • Early stages with this • Will give location and ‘extent’ of archive • Will link through to Archives Hub
Key Benefit of Linked Data • API-based mashups work against a fixed set of data sources • Hand crafted by humans • Don’t integrate well • Linked Data promises an unbound global data space • Easy dataset integration • Generic ‘mesh-up’ tools
Linked Open Data • Data can be open or closed • Linked Data can be open or closed • Most benefit gained when data is open
Data Modelling • Steep learning curve • RDF terminology “confusing” • Lack of archival examples • Complexity • Archival description is hierarchical and multi-level • ‘Dirty’ Data
Sustainability • Can you rely on data sources long-term? • Ed Summers at the Library of Congress created http://lcsh.info • Linked Data interface for LOC subject headings • People started using it
Scalability / Provenance • Same issue with attribution • Solutions: Named graphs? Quads? • Best Practice Example by Bradley Allen, Elsevier at LOD LAM Summit, SF, USA, June 2011
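The named graph or quad idea can be sketched by adding a fourth element to each triple that names the graph the statement came from, so statements stay attributable after datasets are merged. This is a minimal illustration, not the approach LOCAH adopted: the graph URIs and resource names below are hypothetical `example.org` values, and `statements_by_graph` is a hypothetical helper.

```python
from collections import defaultdict

# Each quad is (subject, predicate, object, graph).
# The graph element records which source made the statement.
# All URIs below are hypothetical example.org values.
quads = [
    ("http://example.org/id/person/1", "name", "Beatrice Webb",
     "http://example.org/graph/archives-hub"),
    ("http://example.org/id/person/1", "birthYear", "1858",
     "http://example.org/graph/archives-hub"),
    ("http://example.org/id/person/1", "name", "Webb, Beatrice",
     "http://example.org/graph/other-source"),
]


def statements_by_graph(quads):
    """Group triples under the named graph (source) they came from."""
    grouped = defaultdict(list)
    for s, p, o, g in quads:
        grouped[g].append((s, p, o))
    return dict(grouped)


for graph, statements in statements_by_graph(quads).items():
    print(graph, len(statements))
```

Grouping by the fourth element shows which source asserted each statement, which is the information needed for attribution.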
Licensing • Ownership of data often not clear • Hard to track attribution • CC0 for Archives Hub and Copac data
Is Linked Data the Way? • Enables ‘straightforward’ integration of a wide variety of data sources • Archival data can ‘work harder’ • New channels into your data • Researchers are more likely to discover sources • ‘Hidden’ archive collections become part of the Web
Attribution and CC License • Sections of this presentation adapted from materials created by other members of the LOCAH Project • This presentation is available under Creative Commons Non Commercial-Share Alike: http://creativecommons.org/licenses/by-nc/2.0/uk/