Capturing Botanical Descriptions for Taxonomy
Napier University School of Computing

Prometheus I: Taxonomic Database (POET OODB)
Jessie Kennedy, Cédric Rageneaud, Mark Watson, Martin Pullan, Mark Newman, Peter Barclay

Prometheus Ir (Oracle RDB)
Gordon Russell, Alan Cumming

Prometheus II, with Character Descriptions (Oracle RDB)
Sarah McDonald, Kate Armstrong, Trevor Paterson, Alan Cannon

Classical Linnaean Taxonomy
• Divide organisms into a hierarchical classification
• Based on shared ‘characteristics’
Taxonomic Characters
• Classical: morphology, lifestyle, habit (e.g. flower structure, leaf shape, sexual mechanisms, fruit type)
• Modern: palaeontology, genetics/DNA, biochemistry (e.g. genetic distance, enzymology, evolutionary relationships)

The Taxonomic (Revision) Process
• Collect specimens for the group/taxon of interest
• Perform an initial inspection of specimens and any previous descriptions and classifications
• Decide which ‘characters’ will be interesting or useful for segregating specimens into subtaxa
• Score ‘characters’ for each specimen on a paper proforma
• Use shared ‘characters’ to sort specimens into groups (taxa), e.g. species, genus, family

Some Problems with Taxonomy
• Labour intensive
• ‘Characters’ are poorly defined
• A taxon revision is often the work of one individual and highly idiosyncratic
• Only characters of interest (to this revision) are recorded
• Raw data (the proforma) is often discarded
• Character data is not easily compared between proforma sets, as definitions are not captured

Proposals to ‘Improve’ Taxonomy
• Use standardized, defined ‘terms’ to record character descriptions
• Encourage the scoring of ‘quantitative characters’ (discourage ‘qualitative characters’)
• Store description data in electronic/database form
• Facilitate meaningful comparisons between character descriptions
• Store character descriptions within a taxonomic database, allowing taxon descriptions to be retrieved/generated

Approach
• Model the process of data capture as character descriptions
• Develop an ‘ontology’ of ‘defined terms’ to use in character descriptions
• Provide a database and interface for creating the ontology (terms, definitions, relationships between terms)
• Extend the Prometheus hierarchical taxonomic database and interface for recording specimen character descriptions

The Prometheus II Character model
What are Characters? Description Elements (DEs)
(i) Quantitative DE: angle, diameter, length, width, density, height, number (count)
(ii) Qualitative DE: lifecycle, habit, orientation, shape, symmetry, texture, sex, colour, etc.

[Diagram: the description data model. Labels shown: DESCRIPTION (Specimen Description, Taxon Description); DESCRIPTION ELEMENT (Structure Term, Property Term, Value, Unit, State Term, Frequency, Modifier); RELATIVE MODIFIER (source, destination, Modifier Term, Value, Unit, Statement).]

[Diagram labels: DESCRIPTION, DESCRIPTION UNIT, DESCRIPTION ELEMENT]
• Description Units (DUs) may be a useful collection container for sets of DEs recording data about a particular structure
• DUs might not be necessary if we have a rich structural ontology
• DUs may provide a useful mechanism for duplicating instances of a structure

(i) Quantitative Characters
(ii) Qualitative Characters

COMPLICATIONS!
• Need to define quantitative properties, e.g. length of leaf, from apex to petiole
• Need to capture ranges of values, e.g. leaf length 20 to 50 mm, or less than 20 mm
• Need to relate DEs to other DEs, e.g. leaf length is twice leaf width
• Need to locate structures, e.g. the hairs on the upper surface of a leaf at the apex
• Need to represent multiple instances of a structure on the same specimen, e.g. leaf (1) green and hairy, leaf (2) yellow and waxy

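The range and "relative" complications can be made concrete with a minimal Python sketch. It is only an illustration under stated assumptions: the class names `QuantitativeDE` and `RelativeModifier`, their fields, and the wording of `describe()` are hypothetical and are not taken from the Prometheus II schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QuantitativeDE:
    """Hypothetical quantitative Description Element: a property of a structure."""
    structure: str                 # structure term, e.g. "leaf"
    prop: str                      # property term, e.g. "length"
    unit: str                      # unit term, e.g. "mm"
    low: Optional[float] = None    # None means no lower bound was scored
    high: Optional[float] = None   # None means no upper bound was scored

    def label(self) -> str:
        return f"{self.structure} {self.prop}"

    def describe(self) -> str:
        if self.low is not None and self.high is not None:
            return f"{self.label()} {self.low:g} to {self.high:g} {self.unit}"
        if self.high is not None:
            return f"{self.label()} less than {self.high:g} {self.unit}"
        if self.low is not None:
            return f"{self.label()} more than {self.low:g} {self.unit}"
        return f"{self.label()} (not scored)"


@dataclass(frozen=True)
class RelativeModifier:
    """Hypothetical link stating that one DE is a multiple of another."""
    source: QuantitativeDE
    destination: QuantitativeDE
    factor: float

    def describe(self) -> str:
        return f"{self.source.label()} is {self.factor:g} x {self.destination.label()}"


length = QuantitativeDE("leaf", "length", "mm", low=20, high=50)
short = QuantitativeDE("leaf", "length", "mm", high=20)
width = QuantitativeDE("leaf", "width", "mm")
twice = RelativeModifier(source=length, destination=width, factor=2)

print(length.describe())
print(short.describe())
print(twice.describe())
```

Keeping the relation as its own object, rather than as a field on the DE, mirrors the slide's point that a DE may need to refer to another DE; whether the real model stores it this way is not stated in the slides.
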
The Prometheus II Character model
Provide Consistency and Comparability: Defined Terms
• A Defined Term might be a:
  • STRUCTURE TERM: leaf, hair, apex
  • PROPERTY TERM: texture
  • STATE TERM: glabrous
  • MODIFIER TERM: before, more than, ..
  • UNIT TERM
• A Defined Term has a:
  • TERM: leaf
  • DEFINITION: big flappy thing
  • AUTHOR: Kennedy
  • CITATION: ‘Oor Wullie Annual’ 2003
  • (ID in Database)

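As a rough illustration of the Defined Term record listed above, the following Python sketch stores the five term kinds and the term, definition, author and citation fields. The names `TermKind`, `DefinedTerm` and `TermStore`, and the sequential ID scheme, are assumptions made for the example, not the project's database design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class TermKind(Enum):
    STRUCTURE = "structure"
    PROPERTY = "property"
    STATE = "state"
    MODIFIER = "modifier"
    UNIT = "unit"


@dataclass(frozen=True)
class DefinedTerm:
    term_id: int        # stands in for the "ID in Database"
    kind: TermKind
    term: str
    definition: str
    author: str
    citation: str


class TermStore:
    """Hypothetical in-memory store; a real system would use database tables."""

    def __init__(self) -> None:
        self._terms: Dict[int, DefinedTerm] = {}

    def add(self, kind: TermKind, term: str, definition: str,
            author: str, citation: str) -> DefinedTerm:
        term_id = len(self._terms) + 1   # simple sequential ID (assumption)
        record = DefinedTerm(term_id, kind, term, definition, author, citation)
        self._terms[term_id] = record
        return record

    def find(self, term: str, kind: Optional[TermKind] = None) -> List[DefinedTerm]:
        # The same word may carry several definitions, so return every match.
        return [t for t in self._terms.values()
                if t.term == term and (kind is None or t.kind == kind)]


store = TermStore()
leaf = store.add(TermKind.STRUCTURE, "leaf", "big flappy thing",
                 "Kennedy", "'Oor Wullie Annual' 2003")
glabrous = store.add(TermKind.STATE, "glabrous", "(definition not given in the slides)",
                     "(unknown)", "(unknown)")
print(store.find("leaf"))
```

Descriptions would then refer to `term_id` rather than to the bare word, which is what lets two proforma sets be compared even when authors define the same word differently.
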
An Ontology of Defined Terms and Relations
[Diagram: the entities STATE, STRUCTURE, PROPERTY, EXCLUSIVE STATE SET and Structure Subtype, linked by relations labelled ‘has’, ‘has implied’, ‘belongs to’ and ‘is a part of’.]

Ontology: Structural Hierarchy
[Diagram: a tree of seven numbered nodes (1 to 7) labelled with the structures A to E.]
1. Define PartOf Relations: B po A; C po A; E po A; D po B; E po B; D po C
3. Nodes define Unique Paths: A, BA, DBA, EBA, CA, DCA, EA

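The step from PartOf relations to unique paths can be sketched in a few lines of Python. This is an illustrative depth-first walk, assuming the relations form an acyclic graph with a single root; the function name `unique_paths` is hypothetical, and the slides do not say how Prometheus II actually computes or stores paths.

```python
from collections import defaultdict

# (part, whole) pairs taken from the slide: "B po A" means B is part of A.
PART_OF = [("B", "A"), ("C", "A"), ("E", "A"),
           ("D", "B"), ("E", "B"), ("D", "C")]


def unique_paths(part_of, root):
    """Return one path per node, written from the node back up to the root."""
    children = defaultdict(list)
    for part, whole in part_of:
        children[whole].append(part)

    paths = []

    def walk(node, path_to_root):
        path = node + path_to_root      # e.g. "D" + "BA" gives "DBA"
        paths.append(path)
        for child in children[node]:
            walk(child, path)

    walk(root, "")
    return paths


paths = unique_paths(PART_OF, "A")
print(paths)  # ['A', 'BA', 'DBA', 'EBA', 'CA', 'DCA', 'EA']
```

Structures D and E each occur more than once because they can be part of several parents; the path, not the structure name, is what identifies a node.
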
Ontology: Structural Hierarchy
• All possible Part_Of relations should be represented
• Some small Structures can appear on so many other structures that it is useful to consider them as Generic Structures and exclude them from the ontology hierarchy (e.g. hairs, glands, stomata)
• Regions are also excluded from the ontology, as they could be added anywhere (e.g. apex, base, margin, surface)

TYPES IN THE ONTOLOGY
• Botanists and taxonomists frequently refer to structures as Types_Of another structure
  • e.g. berries and capsules are types of fruit
  • the types share all identifying features of the supertype
  • but can be distinguished by possession of a collection of states that are always true
  • and might have a restricted set of potential subparts
  • e.g. berries are always soft and fleshy; capsules are dry and dehiscent
• It is not clear how Types_Of can easily be accommodated in the Part_Of hierarchy, but they are potentially very useful for generalizing queries

PROBLEMS WITH TYPES IN THE ONTOLOGY
• Should a subtype inherit all of its supertype’s potential Part_Of relations, or can these be restricted, and can it participate in type-specific Part_Of relations?
• Should alternate types be exclusive?
• Can you have type hierarchies?
• e.g. bract is a type of leaf, and is part of an inflorescence
• leaf is not included in the ontology as potentially part of inflorescence
• the taxonomists can readily identify a bract as a type of leaf, but do not conceptualize a leaf as a potential part of an inflorescence
• Can we expect taxonomists to be ontologically rigorous?

DEALING WITH TYPES IN THE ONTOLOGY
• Allow everything: create a rich type hierarchy, and allow types to behave identically to structures in the part_of hierarchy
  RESULT: Horribly complicated network to handle in a relational database! Likely to be ontologically inconsistent
• Allow a rich type_of hierarchy, separate from the part_of hierarchy
  RESULT: very useful for query expansions, e.g. look up instances of a type and all its supertypes
• Allow limited, rigorously defined types:
  • types form a set of exclusive alternatives
  • types represent the supertype structure plus more than one state that is always true
  • a type must be substitutable by its supertype

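The query expansion mentioned for a separate type_of hierarchy might look like the following Python sketch. The type pairs come from the examples in these slides (berry, capsule, bract); the specimen records, the single-supertype assumption and the function names are hypothetical.

```python
# type -> supertype, kept apart from the part_of hierarchy (single supertype assumed).
TYPE_OF = {"berry": "fruit", "capsule": "fruit", "bract": "leaf"}

# Hypothetical scored data: (specimen id, structure term that was recorded).
RECORDS = [("specimen-1", "bract"), ("specimen-2", "leaf"), ("specimen-3", "berry")]


def with_supertypes(term, type_of=TYPE_OF):
    """Return the term followed by its chain of supertypes."""
    chain = [term]
    while chain[-1] in type_of:
        parent = type_of[chain[-1]]
        if parent in chain:
            raise ValueError(f"cycle in type_of hierarchy at {parent!r}")
        chain.append(parent)
    return chain


def find_instances(term, records=RECORDS):
    """Look up records scored against the term or any of its supertypes."""
    wanted = set(with_supertypes(term))
    return [specimen for specimen, structure in records if structure in wanted]


print(with_supertypes("berry"))
print(find_instances("bract"))
```

A search for "bract" therefore also returns specimens scored only as "leaf", which is the substitutability the last option on the slide asks for.
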
PROFORMA ONTOLOGY
• for describing a set of specimens
• includes all the defined terms necessary to describe the ‘characters’ of interest
• created from the parent ontology
  • a subset of the Structures and State Terms
  • expanded by instantiating Regions and Generic Structures
• the paths of the nodes inherited from the parent ontology are the same
• but we now need to be able to describe the paths of generic structures and regions relative to these

[Diagram: the parent ONTOLOGY tree (nodes 1 to 7, structures A to E) shown beside the PROFORMA ONTOLOGY, which adds nodes numbered 8 to 15 for instantiated Regions (centre, apex, base, lower surface, upper surface) and Generic Structures (hair, spine).]

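One way to picture the proforma ontology is as a list of paths: a chosen subset of the parent ontology's paths, plus regions and generic structures attached relative to them. The Python sketch below is an assumption-laden illustration; the tuple representation, the function `build_proforma` and the particular attachments are hypothetical and do not reproduce the diagram's node numbering.

```python
# Parent ontology paths, each written from the node back up to the root.
PARENT_PATHS = [("A",), ("B", "A"), ("D", "B", "A"), ("E", "B", "A"),
                ("C", "A"), ("D", "C", "A"), ("E", "A")]


def build_proforma(parent_paths, keep, attachments):
    """Keep a subset of parent paths and attach regions/generic structures.

    keep:        paths selected from the parent ontology (left unchanged)
    attachments: (extra, host) pairs, where extra is a tuple of region or
                 generic-structure names, innermost first, and host is a kept path
    """
    proforma = []
    for path in keep:
        if path not in parent_paths:
            raise ValueError(f"{path} is not in the parent ontology")
        proforma.append(path)
    for extra, host in attachments:
        if host not in keep:
            raise ValueError(f"host path {host} was not selected")
        proforma.append(tuple(extra) + host)   # path is relative to the host
    return proforma


proforma = build_proforma(
    PARENT_PATHS,
    keep=[("A",), ("B", "A"), ("D", "B", "A")],
    attachments=[(("apex",), ("B", "A")),
                 (("hair", "apex"), ("B", "A"))],
)
print(proforma)
```

The inherited paths are copied verbatim, so data scored against them stays comparable with the parent ontology, while "hair at the apex of B" exists only in this proforma.
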
DUPLICATED STRUCTURES IN THE PROFORMA ONTOLOGY
• A specimen description would frequently need to describe different sets and instances of a structure
  • e.g. several different ‘types of’ (sic) leaf
  • e.g. if it is clear that the basal leaves are different from the apical leaves
• These leaves are ontologically identical, but need to be distinguished in the proforma so that their features are recorded independently
• Furthermore, the taxonomist might want to score multiple instances of a structure if they have a range of values (e.g. length)

CLONED STRUCTURES
[Diagram: the proforma ontology tree with the Leaf subtree (B, D, E; nodes 2, 3, 4) duplicated as ‘Leaf’ #1 and ‘Leaf’ #2.]
• The path of the Leaf structures B, D and E in the proforma ontology has to include its ‘clone’ identity.
• When actually scoring specimens we might want to record data for multiple instances of each Leaf.
• Description Units could be one mechanism to allow this.
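A small Python sketch of how a clone identity could be carried in a path follows. The `B#1` / `B#2` notation and the function `clone_structure` are hypothetical; the slides only state that the path must include the clone identity, not how it is encoded.

```python
# Paths written from the node back up to the root; B stands for 'Leaf'.
PATHS = [("A",), ("B", "A"), ("D", "B", "A"), ("E", "B", "A"),
         ("C", "A"), ("D", "C", "A"), ("E", "A")]


def clone_structure(paths, structure, count):
    """Duplicate every path that passes through `structure`, once per clone."""
    result = []
    for path in paths:
        if structure in path:
            for i in range(1, count + 1):
                # Tag the cloned segment so each copy has a distinct path.
                result.append(tuple(f"{s}#{i}" if s == structure else s
                                    for s in path))
        else:
            result.append(path)
    return result


cloned = clone_structure(PATHS, "B", 2)
for path in cloned:
    print(" > ".join(path))
```

Only the paths through B are duplicated, so D as part of C remains a single node while D as part of each Leaf clone can be scored independently.
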