Persistent Identifiers

felix-noble

Presentation Transcript


  1. Persistent Identifiers: Solving a number of problems through a simplistic mechanism

  2. Agenda: What are Persistent Identifiers (PIDs)? Extended PIDs. How to use them.

  3. What are PIDs?

  4. Persistent identifiers come in various formats: 10876/abc123, 10.1594/WDCC/CMIP5.NCCNMpc, ark:/13030/tf5p30086k, http://purl.org/dc/elements/1.1/, urn:lsid:ubio.org:namebank:11815

  5. Why do we need PIDs? DRS Syntax, Tracking ID, CIM ID, ...
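The formats listed on slide 4 can be told apart by their surface syntax. The following is a minimal sketch, not part of the talk: the function name `classify_identifier` and the matching rules are assumptions chosen for illustration, and the heuristics only cover the five example shapes shown on the slide.

```python
import re


def classify_identifier(identifier: str) -> str:
    """Guess the PID scheme from surface syntax (heuristic, illustration only)."""
    text = identifier.strip()
    lowered = text.lower()
    if lowered.startswith("ark:/"):
        return "ARK"
    if lowered.startswith("urn:lsid:"):
        return "LSID"
    if lowered.startswith("urn:"):
        return "URN"
    if re.match(r"https?://purl\.", lowered):
        return "PURL"
    # DOIs are handles whose prefix starts with "10."
    if re.match(r"10\.\d+/\S+$", text):
        return "DOI"
    # Generic handle: numeric (possibly dotted) prefix, slash, suffix
    if re.match(r"[0-9][0-9.]*/\S+$", text):
        return "Handle"
    return "unknown"


for example in ("10876/abc123", "ark:/13030/tf5p30086k"):
    print(example, "->", classify_identifier(example))
```

A real system would rely on the resolver of each scheme rather than on pattern matching; the sketch only shows that the schemes are syntactically distinct.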

  6. PIDs point to resources: 10876/abc123 → resolution service → http://example.com/xyz567 → 101010101010101

  7. The resource is a black box (101010101010101): metadata? data? software code? document?

  8. PIDs are globally unique. [Figure: 10876/abc123 and 101010101010101, each shown twice]

  9. URLs are not persistent over time ("link rot"). [Figure: http://example.com at three points in time (today, 2015, 2020); two lookups return 101010101010101, one returns 404 Not Found]

  10. PIDs are persistent over time. [Figure: 10876/abc123 and 101010101010101 shown for today, 2015, and 2020]

  11. PIDs establish a redirection layer. [Figure: stable: 10876/abc123; unstable: http://..., http://..., http://...; data: 101010101010101]

  12. Operations on a PID: create PID; update the URL the PID points to; (delete PID)
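The redirection layer and the three operations from slide 12 can be sketched as a small in-memory registry. This is an illustration under stated assumptions, not the API of any real PID system: the class name `PidRegistry`, its method names, and the URL `http://example.com/moved` are hypothetical.

```python
class PidRegistry:
    """Toy redirection layer: stable PIDs mapped to changeable URLs."""

    def __init__(self) -> None:
        self._targets: dict = {}

    def create(self, pid: str, url: str) -> None:
        # PIDs must be unique, so creating an existing PID is an error.
        if pid in self._targets:
            raise ValueError(f"PID already exists: {pid}")
        self._targets[pid] = url

    def update(self, pid: str, url: str) -> None:
        # Only the target changes; the PID itself stays the same.
        if pid not in self._targets:
            raise KeyError(pid)
        self._targets[pid] = url

    def delete(self, pid: str) -> None:
        # Listed in parentheses on the slide: deletion is the exception.
        del self._targets[pid]

    def resolve(self, pid: str) -> str:
        return self._targets[pid]


registry = PidRegistry()
registry.create("10876/abc123", "http://example.com/xyz567")
# The resource moves; users keep citing the same PID.
registry.update("10876/abc123", "http://example.com/moved")  # hypothetical URL
print(registry.resolve("10876/abc123"))
```

The point of the design is that only `update` has to be called when a resource moves; every reference that uses the PID remains valid.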

  13. There are many PID systems / infrastructures: Handle System, Archival Resource Key (ARK), Life Science Identifier (LSID), Persistent URL (PURL), Uniform Resource Name (URN), ...

  14. How do PID infrastructures differ?

  15. Extended PIDs: We need to go beyond the simple redirection view

  16. Some information must be stored persistently. [Figure: today, 10876/A has checksum 7D01E436 and data 101010101010101; in 2015, 10876/A still has checksum 7D01E436 but the data reads 101010101110101, so the "Verify..." step raises a warning (!)]
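The verification step on slide 16 can be sketched as follows. The slide does not say which checksum algorithm is used; CRC-32 via Python's `zlib.crc32` is an assumption made here only because the example value has eight hexadecimal digits. The function names are hypothetical, and the byte strings simply reuse the bit patterns from the figure as text.

```python
import zlib


def crc32_hex(data: bytes) -> str:
    """Return the CRC-32 of data as eight uppercase hex digits."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08X")


def verify(data: bytes, stored_checksum: str) -> bool:
    """Compare the data's current checksum with the one stored with the PID."""
    return crc32_hex(data) == stored_checksum.upper()


original = b"101010101010101"   # data at PID creation time
later = b"101010101110101"      # data found at a later time

stored = crc32_hex(original)    # kept persistently alongside the PID
print(verify(original, stored))  # unchanged data passes
print(verify(later, stored))     # altered data fails
```

Because the checksum lives with the PID rather than with the data, a change to the data cannot silently update the value it is checked against.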

  17. Build more complex information structures: A: 1, B: 4, C: 3, D: 7; [1, 5, 13, 9, 12]

  18. Collections of PIDs are required for our use cases. [Figure: 10876/collection1 with 10876/A and 10876/B (shown four times)]

  19. Graphs or trees of PIDs are required as well. [Figure: 10876/A, 10876/B, 10876/C]

  20. The graph nodes and edges may be typed. [Figure: data object 10876/A, data object 10876/B, metadata object 10876/C; edge types "older version" and "has metadata"]

  21. The graph structure must be stored persistently. [Figure: data object 10876/B "has metadata" metadata object 10876/C; one combined entity; 101010101010101, 101010101010101]

  22. Collections can be realized through graphs. [Figure: 10876/collection1, 10876/B, 10876/B, 10876/A]
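Slides 18 to 22 describe typed nodes, typed edges, and collections expressed as graph links. A minimal sketch of such a structure is given below. The class `PidGraph`, its methods, and the relation name "has member" are assumptions for illustration; only "has metadata" and "older version" appear on the slides.

```python
from collections import defaultdict


class PidGraph:
    """Toy typed graph: nodes are PIDs with a type, edges carry a relation name."""

    def __init__(self) -> None:
        self.node_types: dict = {}
        self._edges = defaultdict(set)  # (source PID, relation) -> target PIDs

    def add_node(self, pid: str, node_type: str) -> None:
        self.node_types[pid] = node_type

    def link(self, source: str, relation: str, target: str) -> None:
        # Both ends must be registered PIDs before they can be linked.
        for pid in (source, target):
            if pid not in self.node_types:
                raise KeyError(pid)
        self._edges[(source, relation)].add(target)

    def targets(self, source: str, relation: str) -> list:
        return sorted(self._edges.get((source, relation), ()))


graph = PidGraph()
graph.add_node("10876/B", "data object")
graph.add_node("10876/C", "metadata object")
graph.add_node("10876/collection1", "collection")
graph.link("10876/B", "has metadata", "10876/C")
# A collection is just a node whose outgoing links list its members.
graph.link("10876/collection1", "has member", "10876/B")
print(graph.targets("10876/collection1", "has member"))
```

Treating membership as one more edge type means collections need no separate storage mechanism, which is the idea behind slide 22.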

  23. What must be stored persistently? Minimal metadata (key-metadata): static: checksum, PID creation time stamp; dynamic: graph structure (links), collection membership
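The key-metadata listed on slide 23 could be modelled as a record with an immutable part and a mutable part. This sketch assumes that "static" covers the checksum and creation time stamp and "dynamic" covers links and collection membership; all class, field, and function names are hypothetical, and the checksum value is the example from slide 16.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Set


@dataclass(frozen=True)
class StaticMetadata:
    """Fixed at PID creation; frozen so it cannot be reassigned later."""
    checksum: str
    created: datetime


@dataclass
class KeyMetadata:
    """Minimal metadata stored with a PID."""
    static: StaticMetadata
    links: Dict[str, Set[str]] = field(default_factory=dict)  # relation -> PIDs
    collections: Set[str] = field(default_factory=set)        # collection PIDs


def new_record(checksum: str) -> KeyMetadata:
    """Create a record stamped with the current UTC time."""
    return KeyMetadata(StaticMetadata(checksum, datetime.now(timezone.utc)))


record = new_record("7D01E436")
record.links.setdefault("has metadata", set()).add("10876/C")
record.collections.add("10876/collection1")
print(record.static.checksum, sorted(record.collections))
```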

  24. Levels of preservation. [Figure: minimal metadata, 10876/abc123, primary level of preservation, secondary level of preservation, 101010101010101]

  25. PIDs are a topic for international collaboration. Relation types must be standardized. Research Data Alliance: WG 'PID Information Types', WG 'Type Registry'. Collections? [Figure: 10876/collection1, 10876/A, 10876/B, 10876/C]

  26. Usage scenario: provenance as a DAG. [Figure: data object, PID, link "was derived from", t, cdo]
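Provenance stored as "was derived from" links forms a directed acyclic graph that can be walked back to the original sources. The sketch below assumes the links are available as a plain dictionary from each PID to the PIDs it was derived from; the function name `ancestors` and the three-object example are made up for illustration.

```python
def ancestors(derived_from: dict, pid: str) -> set:
    """Collect every PID reachable by following 'was derived from' links."""
    seen = set()
    stack = list(derived_from.get(pid, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path through the DAG
        seen.add(current)
        stack.extend(derived_from.get(current, ()))
    return seen


# Hypothetical chain: C was derived from B, which was derived from A.
links = {"10876/C": ["10876/B"], "10876/B": ["10876/A"]}
print(sorted(ancestors(links, "10876/C")))
```

The `seen` set keeps the walk linear in the number of links even when several derivation paths share the same source.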

  27. Software: How do we actually use PIDs?

  28. I am biased towards the Handle System. For technical reasons: key-metadata is a unique feature that is required for PID graphs. For practical reasons (examples): ARKs and URNs lack wide adoption and support; PURL maintenance is not clear; LSIDs in older literature are not persistent; the Handle System has an operational perspective.

  29. What is the Handle System? Developed by CNRI (Corporation for National Research Initiatives); registered trademark; fee for registering new prefixes (e.g. 10876); customers include, e.g., the US military and the International DOI Foundation.

  30. How does the Handle System work? [Figure: handle 10876/100; record 100 with URL: www.dkrz.de and Checksum: ...; prefix 10876; prefix DB with entries 1234 and 1001; central resolution service]
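One reading of the diagram on slide 30 is a two-stage lookup: a central service maps the prefix to the server responsible for it, and that server holds the record for the suffix. The sketch below follows that reading with in-memory dictionaries. It is an assumption about the figure, not the Handle System's actual protocol or API, and the names `PREFIX_DB`, `LOCAL_SERVERS`, `resolve_handle`, and the server label are hypothetical.

```python
# Stage 1: central prefix database (prefix -> responsible server, hypothetical label)
PREFIX_DB = {"10876": "server-for-10876"}

# Stage 2: each server stores the records for the suffixes under its prefix
LOCAL_SERVERS = {
    "server-for-10876": {
        "100": {"URL": "www.dkrz.de"},
    },
}


def resolve_handle(handle: str) -> dict:
    """Split a handle into prefix/suffix and look it up in two stages."""
    prefix, separator, suffix = handle.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    server = PREFIX_DB[prefix]            # central resolution step
    return LOCAL_SERVERS[server][suffix]  # lookup at the responsible server


print(resolve_handle("10876/100")["URL"])
```

Splitting the lookup this way lets each prefix owner run its own storage while clients only need to know the central service.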

  31. What are Digital Objects? http://example.com/xyz789, 10876/abc123

  32. Build a stack of lightweight components: LAPIS, API for Persistent Identifier Services (on GitHub)

  33. Further reading: Weigel et al.: "A framework for extended persistent identification of scientific assets" (submitted to the Data Science Journal); Duerr et al., doi:10.1007/s12145-011-0083-6

  34. Thank you. All slides available here: redmine.dkrz.de/seminar

  35. The greater plan: LTA application: Q4 2012, Q1 2013; EUDAT integration: 2014; CMIP6+: 2014