Brief Notes from Kew

Presentation Transcript


  1. Brief Notes from Kew • Mark Jackson, Software Applications Manager

  2. Focussing on... • Herbarium digitisation • electronic Plant Information Centre

  3. Kew Herbarium • Guesstimated • 7 million specimens • 250,000 types • Less than 5% specimens databased • A variety of personal databases

  4. Preparation for Digitisation • Computerise transactions • Agree and document policy and procedures • Establish core fields (HISPID pending ABCD) • Develop hardware and software infrastructure (e.g. catalogue database, mass storage)

  5. Digitisation Strategy • Curators to barcode, database and image types for loan • Repatriation & research projects: to use infrastructure and core fields; data to be imported into Catalogue (eventually) • Pursue digitisation projects • www.kew.org/data/repatbr

  6. Specimen imaging • Decision to try to match Cibachrome prints in terms of quality (e.g. suitable for many diagnostic purposes) • 600 dpi delivers 200MB images • Stored as uncompressed (but bzipped) TIFFs • Acquisition of mass storage
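The 200MB figure can be sanity-checked with a little arithmetic. The sketch below assumes an A3 scan area (taken from the HerbScan slide) and 24-bit RGB with no compression; neither is stated on this slide. The storage estimate at the end further assumes one master image per type specimen, which is an illustration and not a figure from the talk.

```python
MM_PER_INCH = 25.4


def uncompressed_size_bytes(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Size of an uncompressed raster scan; 3 bytes per pixel assumes 24-bit RGB."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel


# Assumption: A3 sheet (297 x 420 mm) scanned at 600 dpi.
a3_bytes = uncompressed_size_bytes(297, 420, 600)
a3_mib = a3_bytes / 2**20

# Assumption: one master image for each of the 250,000 types, before bzip compression.
types_total_tb = 250_000 * a3_bytes / 1e12

print(f"one A3 scan: {a3_bytes / 1e6:.0f} MB ({a3_mib:.0f} MiB)")
print(f"250,000 types: {types_total_tb:.1f} TB uncompressed")
```

Under these assumptions a single scan comes out at roughly 209 MB (199 MiB), consistent with the 200MB quoted, and the types alone would need on the order of 50 TB before compression, which is why mass storage appears on the list.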

  7. HerbScan • A3 flatbed scanner, inverted • Cradle for specimens • Distributed throughout Herbarium

  8. Pros and cons • 200 MB master images (600 dpi scans), based on capturing the level of detail of Cibachromes • Camera: £30-40,000; 200MB images barely achievable; 1 image per minute; fixed; versatile • HerbScan: £7,500; 200MB images easily achievable; 10 images per hour; some mobility; suited to flat items

  9. [Diagram: Client, Image Server and HerbCat; labelled flows: HerbCat enquiries, image enquiries, images, metadata]

  10. Focussing on... • Herbarium digitisation • electronic Plant Information Centre

  11. UK government funding for delivery of services electronically • Resource-discovery interface to multiple Kew data sources (not necessarily at Kew) • Data sources are heterogeneous • Simple interface overlaying other systems • [Diagram: ePIC Interface and four data sources]

  12. Architecture • Interface (Java servlet)/JSPs • Requests and results exchanged with a multi-threaded Java server • Request queue • Handlers: one per data source, one for logging, one for spell-checking • Data sources • Configuration files (XML)
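The request-queue-plus-handlers pattern on this slide can be sketched roughly as follows. This is not the ePIC code: the real server is Java and configured through XML files, whereas this is a stdlib Python illustration. All class names, the in-memory "data sources" and the vocabulary list are hypothetical.

```python
import difflib
import queue
import threading
from concurrent.futures import Future


class DataSourceHandler:
    """Hypothetical handler wrapping one data source (here just a list of strings)."""

    def __init__(self, name, records):
        self.name = name
        self.records = records

    def handle(self, query):
        return [r for r in self.records if query.lower() in r.lower()]


class LoggingHandler:
    """Hypothetical handler that records every query passing through."""

    name = "log"

    def __init__(self):
        self.seen = []

    def handle(self, query):
        self.seen.append(query)
        return []


class SpellCheckHandler:
    """Hypothetical handler suggesting close matches from a known vocabulary."""

    name = "spelling"

    def __init__(self, vocabulary):
        self.vocabulary = vocabulary

    def handle(self, query):
        if query.lower() in self.vocabulary:
            return []
        return difflib.get_close_matches(query.lower(), self.vocabulary, n=3, cutoff=0.6)


class QueryServer:
    """Worker threads drain a shared request queue and pass each query to every handler."""

    def __init__(self, handlers, workers=2):
        self.handlers = handlers
        self.requests = queue.Queue()
        self.threads = [threading.Thread(target=self._work, daemon=True) for _ in range(workers)]
        for thread in self.threads:
            thread.start()

    def submit(self, query):
        """Queue a query and return a Future that will hold results keyed by handler name."""
        future = Future()
        self.requests.put((query, future))
        return future

    def _work(self):
        while True:
            item = self.requests.get()
            if item is None:  # sentinel: stop this worker
                break
            query, future = item
            try:
                future.set_result({h.name: h.handle(query) for h in self.handlers})
            except Exception as exc:
                future.set_exception(exc)

    def shutdown(self):
        for _ in self.threads:
            self.requests.put(None)
        for thread in self.threads:
            thread.join()
```

A caller would build the server from a list of handlers, call `submit("quercus")`, and read `.result()` on the returned future. Keeping logging and spell-checking as ordinary handlers means the interface layer only ever deals with one kind of object, which seems to be the point of the design on the slide.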

  13. Texts • Web documents indexed using Lucene • Flora Zambesiaca digitised and marked up in XML • Experimentation with options for query and output via Java servlet: using XSL to output selections; using Lucene to index the XML; importing the XML into a database • Other texts: jury still out, but Lucene route looks promising
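To make the "index the XML" option concrete, here is a toy inverted index over XML-marked-up text. It is explicitly not Lucene, only a stdlib Python stand-in for the same idea. The element names (`flora`, `taxon`, `name`, `description`) and the sample descriptions are invented for illustration; the actual Flora Zambesiaca markup is not shown in these notes.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample markup and text, not the real Flora Zambesiaca schema or content.
SAMPLE = """
<flora>
  <taxon id="t1"><name>Acacia nilotica</name>
    <description>Tree with yellow flowers and straight spines.</description></taxon>
  <taxon id="t2"><name>Acacia karroo</name>
    <description>Shrub or tree; flowers yellow, in round heads.</description></taxon>
  <taxon id="t3"><name>Brachystegia boehmii</name>
    <description>Woodland tree with pinnate leaves.</description></taxon>
</flora>
"""


def tokenize(text):
    """Lower-case the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(xml_text):
    """Map each term to the set of taxon ids whose text contains it."""
    index = defaultdict(set)
    names = {}
    for taxon in ET.fromstring(xml_text.strip()).iter("taxon"):
        taxon_id = taxon.get("id")
        names[taxon_id] = taxon.findtext("name")
        for term in tokenize(" ".join(taxon.itertext())):
            index[term].add(taxon_id)
    return index, names


def search(index, names, query):
    """Return the names of taxa containing every term in the query (AND search)."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(names[hit] for hit in hits)


index, names = build_index(SAMPLE)
print(search(index, names, "yellow flowers"))
```

A real Lucene index adds analyzers, ranking and fielded queries on top of this, but the basic trade-off against the XSL and database options is visible even here: the index answers free-text queries quickly, while the XML structure is flattened unless fields are indexed separately.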

  14. Feedback • Email mechanisms • Web usability testing/focus groups • Logging • Quantitative success: levels of usage, patterns & trends (beware: crawlers, testing & development staff, harvesters); referring URLs, Google link: popularity of site; country, domain • Qualitative success: success of queries, esp. zero hits (spelling, common names, families); performance & system monitoring; number of queries per session, return visits; results pages viewed
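The logging points above (exclude crawlers and in-house traffic, then look at zero-hit queries) might be prototyped along these lines. The record layout, the bot markers and the internal address prefix are all assumptions made for the sketch; the real ePIC log format is not described in these notes.

```python
from collections import Counter

# Hypothetical log records: (user_agent, client_ip, query, hit_count).
LOG = [
    ("Mozilla/5.0", "203.0.113.7", "quercus", 12),
    ("Googlebot/2.1", "66.249.66.1", "rosa", 40),
    ("Mozilla/5.0", "203.0.113.7", "quercas", 0),
    ("Mozilla/5.0", "10.0.0.5", "test", 0),
    ("Mozilla/5.0", "198.51.100.9", "bluebell", 0),
    ("Mozilla/5.0", "198.51.100.9", "quercas", 0),
]

BOT_MARKERS = ("bot", "crawler", "spider", "harvest")  # assumed user-agent substrings
INTERNAL_PREFIXES = ("10.",)  # assumed address range for testing & development staff


def is_external_human(user_agent, client_ip):
    """Reject crawlers/harvesters by user agent and in-house traffic by address."""
    agent = user_agent.lower()
    if any(marker in agent for marker in BOT_MARKERS):
        return False
    return not client_ip.startswith(INTERNAL_PREFIXES)


def summarise(log):
    """Count genuine queries and list the zero-hit ones, most frequent first."""
    kept = [record for record in log if is_external_human(record[0], record[1])]
    zero_hits = Counter(query for _, _, query, hits in kept if hits == 0)
    zero_total = sum(zero_hits.values())
    return {
        "queries": len(kept),
        "zero_hit_rate": zero_total / len(kept) if kept else 0.0,
        "zero_hit_queries": zero_hits.most_common(),
    }


summary = summarise(LOG)
print(summary)
```

In this made-up sample the filter drops the crawler and the in-house test query, leaving four genuine queries of which three return nothing. The repeated misspelling at the top of the zero-hit list is the kind of signal that would justify a spell-checking step or common-name lookup.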

  15. World distribution of queries

  16. Future • More data sources, including texts and images • Hierarchical browsing front-end based around revamped Brummitt Families & Genera with phylogenetic classification • Looking forward to: using the GBIF Names Service…; links with DiGIR/BioCASE resources... • www.kew.org/epic
