
Biodiversity literature mark-up: Compelling use cases for Natural History Collections

Dr Dimitris Koureas, Natural History Museum, London (@DimitrisKoureas). Workshop on mark-up of biodiversity literature, Berlin, 10-11 February 2014.


Presentation Transcript


  1. Biodiversity literature mark-up: Compelling use cases for Natural History Collections. Dr Dimitris Koureas, Natural History Museum, London (@DimitrisKoureas). Workshop on mark-up of biodiversity literature, Berlin, 10-11 February 2014.

  2. Introduction. Significant research effort has been invested. Literature markup could have industry-wide applications with significant impact, but: Who are the current stakeholders? Is there support from societal actors? What are the direct societal benefits? So we need to demonstrate compelling use cases that will engage stakeholders. Natural History Museums can be key players. Use case 1: Assisted label transcription. Use case 2: Measuring the impact of collections. > 260 million specimens.

  3. Use case 1: Assisted label transcription. Legacy literature markup of specimen records can facilitate the label transcription process. Digital NH Museums: • Going digital is a strategic decision for NH museums • Challenge 1 in the Science Strategy of NHM • Collection digitisation is prioritised in all major museums • NHM allocated c. £750k for the next three years (not including capital expenditure) • Label transcription is important but challenging

  4. Use case 1: Assisted label transcription. Different approaches for label transcription: manual transcription of label elements; (semi-)automatic OCR/markup; curators; crowdsourcing; manual transcription of semantic units in the label. Hybrid models are currently in use.

  5. Use case 1: Assisted label transcription. Current approaches for label transcription. Label suitable for OCR and markup: high resolution, typewritten, well-defined structure and semantic units. Label not suitable for OCR: low resolution, handwritten, no proper structure.

  6. Use case 1: Assisted label transcription. Current approaches for label transcription. In-house: manual or semi-automated; slow and cost-ineffective; not suitable for large collections. Crowdsourcing: unpredictable outcome; data cleaning needed. We can enhance current approaches by introducing literature-assisted transcription: use literature markup to identify specimen records and match them against the physical object.

  7. Use case 1: Assisted label transcription. Label transcription: Don't do the job twice! Most labels have already been transcribed in taxonomic literature. Example: catalogue number ATHU 3638, published in 2012. The basic OCR output for this label is almost entirely unreadable; only fragments such as "HERB. ORPHANIDEUM" are recognisable. Create a link between specimen and literature.

  8. Use case 1: Assisted label transcription. Label transcription: Don't do the job twice! Most labels have already been transcribed in taxonomic literature. Literature-assisted transcription: transcription of specimen labels has been crowdsourced for the last 250 years; minimal need for data cleaning; specimen data from small collections around the world; specimen labels transcribed several times.
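The idea of literature-assisted transcription in slides 6 to 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow used by the author: the catalogue-number pattern (a short upper-case collection code followed by digits, as in "ATHU 3638"), the `Citation` record and the function names are all hypothetical, and the passage text used in the usage note is invented.

```python
import re
from dataclasses import dataclass

# Assumed pattern: an upper-case collection code followed by digits,
# e.g. "ATHU 3638". Real catalogue numbers vary between institutions.
CATALOGUE_RE = re.compile(r"\b([A-Z]{2,6})\s?(\d{1,7})\b")


@dataclass
class Citation:
    catalogue_number: str  # normalised as "<CODE> <number>"
    source: str            # hypothetical identifier of the publication
    text: str              # published passage that cites the specimen


def extract_citations(source: str, passages: list[str]) -> list[Citation]:
    """Find specimen citations in already marked-up literature passages."""
    found = []
    for passage in passages:
        for code, number in CATALOGUE_RE.findall(passage):
            found.append(Citation(f"{code} {number}", source, passage))
    return found


def link_to_specimens(citations: list[Citation], catalogue: set[str]) -> dict[str, list[Citation]]:
    """Keep only citations whose catalogue number exists in the collection."""
    links: dict[str, list[Citation]] = {}
    for citation in citations:
        if citation.catalogue_number in catalogue:
            links.setdefault(citation.catalogue_number, []).append(citation)
    return links
```

Calling `extract_citations("pub-1", ["Example locality, leg. X (ATHU 3638)."])` and passing the result to `link_to_specimens` together with the set of catalogue numbers held by a collection yields, for each matched specimen, the published passages that could pre-fill its label transcription for a curator to verify.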

  9. Use case 2: Measuring NH collections impact. Natural History Collections: value in itself; value through utilisation. Traditional activities of repositories: preservation, curation. Digitisation, data extraction, openness. Establishing value by measuring the scientific and societal impact of collections. McAlpine (1986): 12.7% of papers used collections & 44.4% made collections.

  10. Use case 2: Measuring NH collections impact. Components: specimen metadata; born-digital literature; specimen identifiers; legacy literature markup; specimen citation metrics webservice; collection assessment.

  11. Use case 2: Measuring NH collections impact. Tracking specimen citations in literature can: highlight important collections; promote the value of smaller repositories; steer digitisation efforts; help in collection gap analysis; attract more funding.
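A minimal sketch of the kind of specimen citation metrics mentioned in slides 10 and 11, assuming specimen citations have already been extracted from the literature as (catalogue number, publication) pairs. The function name, the input shape and the convention that the collection code is the first token of the catalogue number are assumptions made for illustration; the slides do not specify how such a webservice would be built.

```python
from collections import Counter, defaultdict


def citation_metrics(citations):
    """Summarise specimen citations.

    citations: iterable of (catalogue_number, publication_id) pairs,
    where catalogue_number looks like "<CODE> <number>" (an assumption).
    Returns (publications per collection code, citations per specimen).
    """
    publications_by_collection = defaultdict(set)
    citations_per_specimen = Counter()
    seen = set()
    for catalogue_number, publication_id in citations:
        pair = (catalogue_number, publication_id)
        if pair in seen:
            # Count a specimen at most once per publication.
            continue
        seen.add(pair)
        collection_code = catalogue_number.split()[0]
        publications_by_collection[collection_code].add(publication_id)
        citations_per_specimen[catalogue_number] += 1
    per_collection = {
        code: len(pubs) for code, pubs in publications_by_collection.items()
    }
    return per_collection, citations_per_specimen
```

Counting distinct publications per collection, rather than raw mentions, is one possible design choice: it keeps a single paper that cites the same specimen repeatedly from inflating a collection's score.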

  12. Use case 2: Measuring NH collections impact. Some concerns: The use of persistent identifiers would help NH collection curators to track the scientific impact of their collections, but tracking specimen records in literature means tracking references to physical objects. DOIs could be the easiest way, but we cannot assign DOIs to physical objects unless museums quickly proceed to create comprehensive collection data portals and assign unique identifiers (UIs) to all records.

  13. Biodiversity literature mark-up: Beyond taxonomic names. Compelling use cases for Natural History Collections. Thank you. Workshop on mark-up of biodiversity literature, Berlin, 10-11 February 2014. @DimitrisKoureas
