
Introduction to linked data

Gordon Dunsire. Presented at the Cataloguing and Indexing Group Scotland seminar “Linked data and the Semantic Web: what have libraries got to do with it?”, Edinburgh, National Library of Scotland, 17 June 2011.

Presentation Transcript


  1. Introduction to linked data Gordon Dunsire Presented at the Cataloguing and Indexing Group Scotland seminar “Linked data and the Semantic Web: what have libraries got to do with it?”, Edinburgh, National Library of Scotland, 17 June 2011

  2. Overview • Relational records • Influence of RDA vocabularies • Disaggregated, distributed “records” • Logical conclusion: simple metadata statement • RDF • Triples, etc. • Linked data • Chains, clusters

  3. Bibliographic record: 12345
     • Title: Cataloguing is fun!
     • Author: Mary MacDonald (8765)
     • Content type: text (1234)
     • Carrier type: microfiche (5432)
     • LCSH: Cataloging (5432)
     Name authority record: 8765
     • Heading: MacDonald, Mary
     • Place of birth: Edinburgh (9876)
     LCSH authority record: 5432
     • Heading: Cataloging
     • See also: Books (65443)
     RDA content type record: 1234
     • Term: text
     • Definition: Content expressed through a form of notation for language intended to be perceived visually.
     RDA carrier type record: 5432
     • Term: microfiche
     • Definition: A sheet of film bearing a number of microimages in a two-dimensional array.

  4. Bibliographic record: 12345
     • Title: Cataloguing is fun!
     • Author: 8765
     • Content type: 1234
     • Carrier type: 5432
     • LCSH: 5432
     Name authority record: 8765
     • Heading: MacDonald, Mary
     • Place of birth: 9876
     Callouts: “Stop! Ambiguous: link not safe.” “Identifier: ok to link.”
     Statements:
     • 12345 – Author – 8765
     • 8765 – Heading – “MacDonald, Mary”
     • 8765 – Place of birth – 9876
     • 9876 – Name – “Edinburgh”
     • 9876 – Country – 4567
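
The statements on this slide can be held as simple three-part tuples and followed from one identifier to the next. The sketch below is only an illustration: the record numbers are taken from the slide, the pairing of “Heading” and “Name” with their values is our reading of the diagram, and the helper names `value_of` and `follow` are hypothetical.

```python
# Each statement from the slide as a (subject, predicate, object) tuple.
# Record numbers act as identifiers; the other values are plain strings.
TRIPLES = [
    ("12345", "Author", "8765"),
    ("8765", "Heading", "MacDonald, Mary"),
    ("8765", "Place of birth", "9876"),
    ("9876", "Name", "Edinburgh"),
    ("9876", "Country", "4567"),
]


def value_of(subject, predicate, triples=TRIPLES):
    """Return the first object stated for this subject and predicate, or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def follow(start, predicates, triples=TRIPLES):
    """Follow a chain of predicates, starting from one identifier."""
    current = start
    for predicate in predicates:
        current = value_of(current, predicate, triples)
        if current is None:
            return None  # the chain is broken: no such statement
    return current


# Book 12345 -> its author -> the author's place of birth -> that place's name.
birthplace = follow("12345", ["Author", "Place of birth", "Name"])
print(birthplace)  # Edinburgh
```

Nothing about the birthplace is stored with the book itself; it is reached by following links between separate statements.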

  5. Linked data is not a new idea! • It extends concepts of authority control • “Preferred” labels • Change once; link many times • Re-use of metadata • More than one “attribute” associated with a “heading” • E.g. Place of birth of person with name heading • Concepts can be applied to authority records • As well as bibliographic description records • Full extension leads to “record” disaggregation • All “records” in bibliographic control systems

  6. Linked data and RDF • Resource Description Framework (RDF) • Designed for machine-processing of metadata at global scale • 24/7/365 • Trillions of operations per second • Everything must be disambiguated • Machines are dumb • Simplicity helps! • Machine-readable identifiers

  7. RDF triple • Metadata expressed as “atomic” statements • A simple, single, irreducible statement • The title of this book is “Cataloguing is fun!” • Constructed in 3 parts • “Triple” • Subject of the statement = Subject: This book • Nature of the statement = Predicate: has title • Value of the statement = Object: “Cataloguing is fun!” • This book – has title – “Cataloguing is fun!” • subject – predicate – object
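
As a rough illustration of the three-part structure, the statement from this slide can be modelled as a small record type. The `Triple` class and `as_sentence` helper are hypothetical names, and human-readable labels are used here only for clarity.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One atomic metadata statement in three parts."""
    subject: str    # what the statement is about
    predicate: str  # the nature of the statement
    object: str     # the value of the statement


statement = Triple(
    subject="This book",
    predicate="has title",
    object="Cataloguing is fun!",
)


def as_sentence(triple):
    """Render a triple in the 'subject - predicate - object' pattern."""
    return f'{triple.subject} - {triple.predicate} - "{triple.object}"'


print(as_sentence(statement))  # This book - has title - "Cataloguing is fun!"
```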

  8. Identifiers • Need unambiguous way of identifying each part of the triple for efficient machine-processing • Human labels (“This book”, “has title”) no good • Same thing, different labels; different things, same label • Exploit the utility of the URL • Machine-readable, regular syntax, unambiguous • Uniform Resource Identifier (URI)

  9. Uniform Resource Identifier • Can be any unique combination of numbers and letters • No intrinsic meaning; it’s just an identifying label • Can look like a URL • http://iflastandards.info/ns/isbd/elements/P1001 • But does not lead to a Web page (in principle ...) • RDF requires the subject and predicate of a triple to be URIs • Object can be a URI, or a literal string (“Cataloguing is fun!”)
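
The rule that subject and predicate must be URIs, while the object may be a URI or a literal, can be sketched as a simple check. This is a minimal sketch under stated assumptions: `is_uri` is a deliberately rough test (a scheme and no spaces), not a full URI validator, and the `http://example.org/...` identifiers are made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical identifiers, invented for this example only.
BOOK = "http://example.org/book/12345"
HAS_TITLE = "http://example.org/terms/hasTitle"


def is_uri(value):
    """Rough check: has a scheme such as 'http' and contains no spaces."""
    return " " not in value and bool(urlparse(value).scheme)


def make_triple(subject, predicate, obj):
    """Build a triple, rejecting human labels as subject or predicate."""
    if not is_uri(subject):
        raise ValueError(f"subject is not a URI: {subject!r}")
    if not is_uri(predicate):
        raise ValueError(f"predicate is not a URI: {predicate!r}")
    # The object may be either a URI or a literal string, so it is not checked.
    return (subject, predicate, obj)


triple = make_triple(BOOK, HAS_TITLE, "Cataloguing is fun!")
print(triple)
```

Passing a label such as "This book" as the subject raises `ValueError`, which mirrors the point that human labels are no good as identifiers.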

  10. Namespaces • URI can be constructed from a base plus a unique, identifying suffix • http://iflastandards.info/ns/isbd/elements/ • + P1001 • Base is known as a namespace • Can be abbreviated by human programmer • “isbd” = http://iflastandards.info/ns/isbd/elements/ • isbd:P1001 • Machine expands abbreviation for processing
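
The expansion step can be sketched in a few lines. The `isbd` prefix and base come from this slide; the `expand` and `abbreviate` function names are hypothetical, and real RDF toolkits handle prefixes with more care than this.

```python
# Prefix table: abbreviation -> namespace base (taken from the slide).
PREFIXES = {
    "isbd": "http://iflastandards.info/ns/isbd/elements/",
}


def expand(abbreviated, prefixes=PREFIXES):
    """Turn 'isbd:P1001' into the full URI by looking up the prefix."""
    prefix, separator, suffix = abbreviated.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {abbreviated!r}")
    return prefixes[prefix] + suffix


def abbreviate(uri, prefixes=PREFIXES):
    """Shorten a full URI if it starts with a known namespace base."""
    for prefix, base in prefixes.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    return uri  # no known namespace: leave the URI unchanged


full = expand("isbd:P1001")
print(full)              # http://iflastandards.info/ns/isbd/elements/P1001
print(abbreviate(full))  # isbd:P1001
```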

  11. Everything as triples in RDF • Every aspect of the metadata must be expressed in RDF to be machine-processable • Metadata about real-world objects (books, people, etc.) • Metadata about the predicates (definition, label, scope, etc.) • Common predicates apply to many types of thing (human-readable label, etc.) • High-level RDF namespaces (rdfs, owl, skos) • RDF is expressed in RDF (“bootstrap”)

  12. RDF properties • Predicates are called properties in RDF • “Verbal” part of the metadata statement • E.g. “A has title ...”, “B is author of C”, “D is embodiment of E” • Properties link specific instances of two things • A = a specific book, B = a specific person, etc. • ... = a specific label, character string, annotation • => a “literal” • Properties are the links in linked data, the pathways through the Semantic Web

  13. Domains and ranges • A property can specify the types of thing it links • E.g. Bibliographic resources, Persons, Places, etc. • Types of thing are RDF classes • A domain is the class of the subject of the property • E.g. The domain of “is embodiment of” is Manifestation (FRBR) • A range is the class of the object of the property • E.g. The range of “is embodiment of” is Expression (FRBR)

  14. Inferencing • RDF enables semantic inferencing • Deducing additional, unstated triples from an existing statement or set of statements • E.g. “D is embodiment of E” + “(is embodiment of) has domain Manifestation” => “D is a Manifestation” • And “D is embodiment of E” + “(is embodiment of) has range Expression” => “E is an Expression”
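
Domain and range inferencing can be sketched with a different property, “is author of” from the earlier slide. The `ex:` identifiers, the choice of Person as domain and Work as range, and the `infer_types` function are all assumptions made for this illustration; `rdf:type` is the standard RDF way of saying “is a”.

```python
# Assumed declarations for a hypothetical property ex:isAuthorOf.
DOMAIN = {"ex:isAuthorOf": "ex:Person"}  # class of the subject
RANGE = {"ex:isAuthorOf": "ex:Work"}     # class of the object


def infer_types(triples):
    """Deduce unstated 'is a' triples from property domains and ranges."""
    inferred = set()
    for subject, predicate, obj in triples:
        if predicate in DOMAIN:
            inferred.add((subject, "rdf:type", DOMAIN[predicate]))
        if predicate in RANGE:
            inferred.add((obj, "rdf:type", RANGE[predicate]))
    # Report only triples that were not already stated.
    return inferred - set(triples)


stated = [("ex:B", "ex:isAuthorOf", "ex:C")]  # "B is author of C"
for triple in sorted(infer_types(stated)):
    print(triple)
# ('ex:B', 'rdf:type', 'ex:Person')
# ('ex:C', 'rdf:type', 'ex:Work')
```

A single stated triple yields two new ones, neither of which anyone entered by hand.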

  15. The truth • There is no test of veracity for a single triple in RDF • Anybody can say Anything about Anything (AAA) • Inferencing only tests for logical inconsistency • E.g. If it results in “E is a Manifestation” + “E is not a Manifestation” • Library linked data must choose and apply its properties/links with care • To maintain our reputation for reliability, quality, etc. • In a web of user-, machine-, and politically-generated metadata
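
The consistency test can be pictured with a toy check. This is only a sketch: the `(thing, class, holds)` representation and the `contradictions` function are hypothetical, and a real reasoner detects inconsistency by more involved means.

```python
def contradictions(assertions):
    """Find (thing, class) pairs that are both asserted and denied.

    Each assertion is (thing, class_name, holds), where holds is True for
    "thing is a class_name" and False for "thing is not a class_name".
    """
    positive = {(thing, cls) for thing, cls, holds in assertions if holds}
    negative = {(thing, cls) for thing, cls, holds in assertions if not holds}
    return sorted(positive & negative)


assertions = [
    ("E", "Manifestation", True),   # "E is a Manifestation"
    ("E", "Manifestation", False),  # "E is not a Manifestation"
    ("D", "Expression", True),
]
print(contradictions(assertions))  # [('E', 'Manifestation')]
```

The check reports that two statements clash, but it cannot say which of them is true.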

  16. Thank you • To be continued ...
