Linked Data at the National Széchényi Library: road to the publication. SWIB10: Semantic Web in Bibliotheken, Cologne, 29–30 November 2010. Ádám Horváth, National Széchényi Library
Contents • Why I am here • Background information on NSZL • Road to the publication • Current developments and future plans 2Linked Data at the National Széchényi Library- Ádám Horváth - NSZL
The news • The National Széchényi Library (NSZL) has recently published its entire OPAC and Digital Library, together with the corresponding authority data, as Linked Open Data (20 April 2010).
The news • The vocabularies used are • RDFDC for bibliographic data • FOAF for names • SKOS for subject terms and geographical names
The news • NSZL uses Cool URIs • Every resource has both an RDF and an HTML representation • Our RDFDC, FOAF and SKOS statements are linked together • Our name authority is matched with the DBpedia name files • URI aliases are handled as owl:sameAs statements • NSZL also supports HTML link auto-discovery
Information infrastructure of NSZL • Integrated library system • Amicus • Consortium system • Views • Oracle based • Authority handling • Does not handle all thesaurus relation types • Products module • Z39.50 server
Information infrastructure of NSZL • OPAC • LibriVision • HTML • XML and XSLT based • Z39.50 client • Session based
Information infrastructure of NSZL • Thesaurus handling • Relex • Contains general terms and geographical names • It uses all relation types available in ISO 2788 • It also contains UDC equivalents and coordinates • Relex can produce MARC output • Relex uses the descriptors themselves as identifiers • Subject terms in Amicus are based on Relex
Information infrastructure of NSZL • The MARC format is HUNMARC • MARC21 based • No punctuation marks in records • More subfields • Punctuation is program generated • MARC21 tools and utilities can’t be used
Information infrastructure of NSZL • The IT department consists of • Programmers • System librarians • Maintenance staff
Information infrastructure of NSZL • The IT department is responsible for • The integrated library system • Developing the digital library • Developing other utilities • Maintaining the whole IT infrastructure • It is not responsible for • Digitisation • The homepage
Road to the publication
Motivation of our semantic web developments • Personal interest • Interested colleagues • Kornél Horváth (horvath.kornel@oszk.hu) • Zsolt Zachár (zachar.zsolt@oszk.hu) • Semantic web is a cool thing • My friends are also publishing their data • Our role is to provide data
Carrying out the development • There was no specific project • Small developments pointing in the same direction • We worked on it when time permitted
The very first steps • ONE2 project • Interoperability project • Z39.50 • SRU emerged by the end of the project • Semantic web was also mentioned
SRU interface for Amicus • TEL-ME-MOR project • To make NSZL searchable on the TEL portal via SRU • YAZ Proxy was used • SRU/Z39.50 gateway • User-defined XSLT • The result set follows the TEL Application Profile
SRU interface for Amicus • The important results • URL based search • XML result set • TELAP • MARCXML • RDFDC is provided via YAZ Proxy
Development of LibriUrl • The requirement was to • make LibriVision OpenURL compatible • provide access to OPAC records via URL • The problem was that LibriVision is • session based and requires login • LibriUrl • URL based search interface for the OPAC • Developed by NSZL on the basis of the vendor's software code
Development of LibriUrl • LibriUrl search: http://link.oszk.hu/libriurl.php?LN=en&DB=any&SRY=an&SRE=2616972 • LibriUrl side effects • A search for an Amicus number is a link to a specific record • Our records became bookmarkable, linkable and OpenSearchable
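A minimal sketch of how such a record link can be built and read back. Only the base URL and the four parameter names come from the example above. The meaning given to each parameter in the comments is a guess from that single example, and the function names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

LIBRIURL_BASE = "http://link.oszk.hu/libriurl.php"  # taken from the slide


def record_link(amicus_number, language="en"):
    """Build a LibriUrl link to one record (hypothetical helper)."""
    params = {
        "LN": language,             # presumably the interface language
        "DB": "any",                # value copied from the slide's example
        "SRY": "an",                # presumably "search by Amicus number"
        "SRE": str(amicus_number),  # the search expression
    }
    return LIBRIURL_BASE + "?" + urlencode(params)


def amicus_number_from_link(link):
    """Read the Amicus number back out of a LibriUrl link."""
    return parse_qs(urlsplit(link).query)["SRE"][0]


link = record_link(2616972)
print(link)
print(amicus_number_from_link(link))
```

Because the whole search is carried in the query string, the link needs no session, which is what makes the record bookmarkable.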
How does it work? [Diagram slides; the images are not preserved in this transcript.]
Demonstration
Development of LibriUrl • The importance of LibriUrl • It is behind our Cool URIs: http://link.oszk.hu/libriurl.php?LN=en&DB=any&SRY=an&SRE=2616972
SKOSifying the thesaurus • TelPlus project • Thesaurus in SKOS for search refinement • SKOS conversion • SKOS is converted from the MARC output of Relex • Not every thesaurus relation type is converted • UDC equivalents and coordinates are not included either
SKOSifying the thesaurus • The concept identifier is an IRI-compatible form of the descriptor: • 150 ** a abszurd dráma • <skos:Concept rdf:about="http://nektar.oszk.hu/auth/abszurd_dráma">
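A small sketch of the descriptor-to-IRI step. The slides show only one example, so the rule used here (join the words with underscores and keep non-ASCII letters) is an assumption inferred from it; the real conversion rules may cover more cases. Both function names are hypothetical.

```python
from urllib.parse import quote

AUTH_BASE = "http://nektar.oszk.hu/auth/"  # prefix seen in the slide's example


def descriptor_to_iri(descriptor):
    """Turn a thesaurus descriptor into a concept IRI (assumed rule)."""
    # Assumption: whitespace runs become single underscores, nothing else changes.
    return AUTH_BASE + "_".join(descriptor.split())


def iri_to_uri(iri):
    """Percent-encode the non-ASCII characters for clients that need a plain URI."""
    return quote(iri, safe=":/")


iri = descriptor_to_iri("abszurd dráma")
print(iri)
print(iri_to_uri(iri))
```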
SKOSifying the thesaurus • Serving SKOS to TelPlus • The SKOS XML file is indexed by Zebra and served via SRU • http://193.6.201.195:9996/skos?version=1.1&operation=searchRetrieve&query=%22Gravenhage%22&startRecord=1&maximumRecords=10 • It is not RDF/XML
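The SRU request above is an ordinary URL with a handful of query parameters. The sketch below rebuilds it from its parts; the parameter names and values are those of the slide's example, while the function name is hypothetical.

```python
from urllib.parse import urlencode, quote

SKOS_SRU_BASE = "http://193.6.201.195:9996/skos"  # endpoint shown on the slide


def sru_search_url(term, start_record=1, maximum_records=10):
    """Build an SRU 1.1 searchRetrieve URL for a quoted search term."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": f'"{term}"',  # the double quotes are encoded as %22
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return SKOS_SRU_BASE + "?" + urlencode(params, quote_via=quote)


url = sru_search_url("Gravenhage")
print(url)
```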
SKOSifying the thesaurus • Importance of the SKOS development • We now have a conversion tool for creating SKOS records
LIBRIS as an example • ELAG, 2008, Wageningen • LIBRIS’ linked open data was presented • Content negotiation • Cool URI • Link rel
Focusing on the publication • We realised that we had almost everything needed to publish our data as LOD • We had • SKOS • LibriUrl (accessing OPAC records via URL) • YAZ Proxy with SRU (URL based search in Amicus) • LIBRIS as an example
Focusing on the publication • What was missing • Naming convention for resources • Identifiers • Content negotiation • RDFDC • RDF database • FOAF • Link rel metatags in the OPAC header • Creating links
Naming convention • Resource name for documents • /resource/manifestation/2645471 • Name for the RDF representation • /data/manifestation/2645471 • Names for the HTML representation • /hu/manifestation/2645471 • /en/manifestation/2645471
Naming convention • Resource name for authority records • /resource/auth/33589 • Name for the RDF representation • /data/auth/33589 • Name for the HTML representation • /auth/33589
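The naming convention can be expressed as a simple path mapping. This is a sketch built only from the four example paths on the two slides; the function name and the shape of the returned dictionary are hypothetical.

```python
RESOURCE_PREFIX = "/resource/"


def representation_paths(resource_path):
    """Map a /resource/... path to its RDF and HTML representation paths."""
    if not resource_path.startswith(RESOURCE_PREFIX):
        raise ValueError("not a resource path: " + resource_path)
    rest = resource_path[len(RESOURCE_PREFIX):]  # e.g. "manifestation/2645471"
    kind = rest.split("/", 1)[0]
    if kind == "manifestation":
        # Documents have one HTML page per interface language.
        html = {"hu": "/hu/" + rest, "en": "/en/" + rest}
    else:
        # Authority records have a single HTML page.
        html = {"default": "/" + rest}
    return {"rdf": "/data/" + rest, "html": html}


print(representation_paths("/resource/manifestation/2645471"))
print(representation_paths("/resource/auth/33589"))
```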
Identifiers • Documents • Amicus number (MARC 001) • Subject authority (thesaurus) • The descriptor itself with some conversion rules • Names • Special number stored in the Amicus database
Content negotiation • Implementation of content negotiation • 303 redirection was chosen
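A sketch of the decision behind a 303 redirect for an authority record. It is not NSZL's implementation: the Accept-header parsing is deliberately simplified, the fallback to HTML when nothing matches is an assumption, and both function names are hypothetical. The target paths follow the naming convention for authority records given earlier in the talk.

```python
def parse_accept(header):
    """Return the media types of an Accept header, highest q-value first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            ranked.append((-q, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def negotiate(resource_path, accept):
    """Return (status, Location) for a request to /resource/auth/<id>."""
    prefix = "/resource/"
    if not resource_path.startswith(prefix):
        raise ValueError("not a resource path: " + resource_path)
    rest = resource_path[len(prefix):]
    for media_type in parse_accept(accept):
        if media_type == "application/rdf+xml":
            return 303, "/data/" + rest
        if media_type in ("text/html", "application/xhtml+xml", "*/*"):
            return 303, "/" + rest
    return 303, "/" + rest  # assumption: fall back to the HTML page


print(negotiate("/resource/auth/33589", "application/rdf+xml"))
print(negotiate("/resource/auth/33589", "text/html,*/*;q=0.8"))
```

The resource URI itself never returns a document. It only points the client at the representation that best matches the request.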
Creating RDF for catalogue records (creating "RDFDC") • XSLT does the job • It is a MARCXML to RDF/XML conversion • It is a modification of the MARC to TEL Application Profile conversion • It creates links to subjects and names • Vocabularies used • Dublin Core • BIBO
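To illustrate the kind of mapping such a conversion performs, here is a rough Python sketch. NSZL uses XSLT for this step, so the code below is a stand-in technique, not their stylesheet. The sample record is made up, the use of field 245 $a for the title follows MARC21 and is assumed to hold for HUNMARC too, and the host name in the resource URI is taken from other slides of this talk.

```python
import xml.etree.ElementTree as ET

MARC = "http://www.loc.gov/MARC21/slim"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
RESOURCE_BASE = "http://nektar.oszk.hu/resource/manifestation/"  # assumed host

# Made-up sample record: only the Amicus number appears in the slides.
SAMPLE_MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">2645471</controlfield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Example title</subfield>
  </datafield>
</record>"""


def marcxml_to_rdf(marcxml):
    """Map MARC 001 to the resource URI and 245 $a to dc:title."""
    record = ET.fromstring(marcxml)
    amicus_number = record.findtext(f"{{{MARC}}}controlfield[@tag='001']")
    title = record.findtext(
        f"{{{MARC}}}datafield[@tag='245']/{{{MARC}}}subfield[@code='a']"
    )
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)
    root = ET.Element(f"{{{RDF}}}RDF")
    description = ET.SubElement(
        root,
        f"{{{RDF}}}Description",
        {f"{{{RDF}}}about": RESOURCE_BASE + amicus_number},
    )
    ET.SubElement(description, f"{{{DC}}}title").text = title
    return ET.tostring(root, encoding="unicode")


rdf_xml = marcxml_to_rdf(SAMPLE_MARCXML)
print(rdf_xml)
```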
Installation of RDF database • Jena • Joseki SPARQL endpoint
Creating FOAF for names • Batch process • The name index of Amicus is used • FOAF is stored in and served from Jena • The entire FOAF dataset is rebuilt on every update
Contents of Jena • Names (FOAF) • Subject authority (SKOS) • It is still available from Zebra via SRU • Catalogue records (RDFDC) • All of our linked data can be searched via SPARQL
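As an illustration of what a SPARQL search over this data could look like, the sketch below builds a query for one name authority record and the GET URL that would carry it. The endpoint address is a placeholder, since the slides do not give one. The `query` parameter is the one defined by the SPARQL protocol, and the resource URI is the one used in the examples of this talk.

```python
from urllib.parse import urlencode

SPARQL_ENDPOINT = "http://example.org/sparql"  # placeholder, not the real endpoint

QUERY = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?name ?same
WHERE {
  <http://nektar.oszk.hu/resource/auth/33589> foaf:name ?name .
  OPTIONAL { <http://nektar.oszk.hu/resource/auth/33589> owl:sameAs ?same }
}"""


def sparql_get_url(endpoint, query):
    """Build a SPARQL protocol GET request URL for the given query."""
    return endpoint + "?" + urlencode({"query": query})


print(sparql_get_url(SPARQL_ENDPOINT, QUERY))
```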
Creating HTML link auto-discovery • In the head of our OPAC extended view pages: <link rel="meta" type="application/rdf+xml" title="RDF Version" href="http://nektar.oszk.hu/data/manifestation/2645471" />
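A client can use this link element to find the RDF version of a page. The sketch below shows one way to do that with Python's standard HTML parser. The surrounding page markup is a made-up minimal wrapper around the slide's link element, and accepting rel="alternate" alongside rel="meta" is a choice made here, not something the slides state.

```python
from html.parser import HTMLParser

PAGE = """<html><head>
<title>OPAC record</title>
<link rel="meta" type="application/rdf+xml" title="RDF Version"
      href="http://nektar.oszk.hu/data/manifestation/2645471" />
</head><body></body></html>"""


class RdfLinkFinder(HTMLParser):
    """Collect the href of every <link> that advertises an RDF/XML document."""

    def __init__(self):
        super().__init__()
        self.rdf_links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (
            tag == "link"
            and attributes.get("rel") in ("meta", "alternate")
            and attributes.get("type") == "application/rdf+xml"
            and attributes.get("href")
        ):
            self.rdf_links.append(attributes["href"])


finder = RdfLinkFinder()
finder.feed(PAGE)
print(finder.rdf_links)
```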
Creating links • Links to NSZL resources • Links from RDFDC to names and subjects: <dcterms:creator rdf:resource="http://nektar.oszk.hu/resource/auth/33589"/> <dc:creator>Jókai Mór (1825-1904)</dc:creator>
Creating links • Links to external resources • Links from the name authority records to DBpedia: <foaf:Person rdf:about="http://nektar.oszk.hu/resource/auth/33589"> <foaf:name>Jókai Mór (1825-1904)</foaf:name> <owl:sameAs rdf:resource="http://dbpedia.org/resource/M%C3%B3r_J%C3%B3kai"/> </foaf:Person>
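A consumer of the FOAF data can follow these owl:sameAs links programmatically. The sketch below wraps the slide's fragment in an rdf:RDF element with namespace declarations (added here so that it parses on its own) and extracts the links; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
OWL = "http://www.w3.org/2002/07/owl#"

FOAF_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <foaf:Person rdf:about="http://nektar.oszk.hu/resource/auth/33589">
    <foaf:name>Jókai Mór (1825-1904)</foaf:name>
    <owl:sameAs rdf:resource="http://dbpedia.org/resource/M%C3%B3r_J%C3%B3kai"/>
  </foaf:Person>
</rdf:RDF>"""


def same_as_links(rdf_xml):
    """Map each foaf:Person URI to the list of its owl:sameAs targets."""
    root = ET.fromstring(rdf_xml)
    links = {}
    for person in root.iter(f"{{{FOAF}}}Person"):
        about = person.get(f"{{{RDF}}}about")
        links[about] = [
            same.get(f"{{{RDF}}}resource")
            for same in person.findall(f"{{{OWL}}}sameAs")
        ]
    return links


print(same_as_links(FOAF_DOC))
```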
Resolving the URL of documents • The URL of the document • /resource/manifestation/2645471 • RDF is requested • Redirection to /data/manifestation/2645471 • A PHP program gathers • MARCXML • Name ids • XSLT creates RDF/XML with links
Resolving the URL of documents • Link generation to SKOS • Automatic conversion • from the literal • <dcterms:subject>abszurd dráma</dcterms:subject> • to the Concept • <dcterms:subject rdf:resource="http://nektar.oszk.hu/resource/auth/abszurd_dráma"/>
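The literal-to-concept conversion can be sketched as a small tree rewrite. NSZL performs it inside the XSLT, so the Python below is only an illustration of the idea. The underscore rule is inferred from the single example above, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"
RDF_RESOURCE = f"{{{RDF}}}resource"
CONCEPT_BASE = "http://nektar.oszk.hu/resource/auth/"  # prefix from the slide

RECORD = """<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:dcterms="http://purl.org/dc/terms/">
  <dcterms:subject>abszurd dráma</dcterms:subject>
</rdf:Description>"""


def link_subjects(description):
    """Replace literal dcterms:subject values with links to SKOS concepts."""
    for subject in description.findall(f"{{{DCTERMS}}}subject"):
        literal = (subject.text or "").strip()
        if literal and subject.get(RDF_RESOURCE) is None:
            # Assumed rule: join the words of the descriptor with underscores.
            subject.set(RDF_RESOURCE, CONCEPT_BASE + "_".join(literal.split()))
            subject.text = None
    return description


linked = link_subjects(ET.fromstring(RECORD))
print(ET.tostring(linked, encoding="unicode"))
```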
Resolving the URL of documents • Link generation to FOAF • The XSLT gets the name ids as parameters and creates the links: <dc:creator>Jókai Mór (1825-1904)</dc:creator> <dcterms:creator rdf:resource="http://nektar.oszk.hu/resource/auth/33589"/>