1 / 20

On standards bodies

Promoting Coherent Minimum Reporting Guidelines for Biological & Biomedical Investigations: The MIBBI Project Chris Taylor, EMBL-EBI & NEBC chris.taylor@ebi.ac.uk MIBBI [www.mibbi.org] HUPO Proteomics Standards Initiative [psidev.sf.net]

Download Presentation

On standards bodies

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Promoting Coherent Minimum Reporting Guidelines for Biological & Biomedical Investigations: The MIBBI Project Chris Taylor, EMBL-EBI & NEBCchris.taylor@ebi.ac.uk MIBBI [www.mibbi.org] HUPO Proteomics Standards Initiative [psidev.sf.net] Research Information Network [www.rin.ac.uk]

  2. On standards bodies What defines a standards-generating body? • A beer and an airline (Zappa) • Formats, reporting guidelines, controlled vocabularies • Regular open attendance meetings, discussion lists, etc. • e.g., MGED (transcriptomics), PSI (proteomics), GSC (genomics) Hugely dependent on their respective communities • Requirements gathering (What are we doing and why?) • Development (By the people, for the people) • Testing (No it isn’t finished, but yes I’d like you to use it…) • Uptake by stakeholders • Publishers, funders, vendors, tool/database developers • The user community (capture, store, search, analyse)

  3. Modelling the biosciences Biologically-delineated views of the worldA: plant biology B: epidemiology C: microbiology …and… Generic features (‘common core’) — Description of source biomaterial — Experimental design components Technologically-delineated views of the worldA: transcriptomics B: proteomics C: metabolomics …and… MS MS Gels NMR Arrays Columns FTIR Scanning Arrays &Scanning Columns

  4. Modelling the biosciences (slightly differently)

  5. Multiple all that by three (kinds of standard)

  6. What biologists need

  7. Well-oiled cogs meshing perfectly (would be nice) “Publicly-funded research data are a public good, produced in the public interest” “Publicly-funded research data should be openly available to the maximum extent possible.” How well are things working? • Cue the Tower of Babel analogy… • Situation is improving with respect to standards • But few tools, fewer carrots (though some sticks) Why do we care about that..? • Data exchange • Comprehensibility of work • Scope for reuse (parallel or orthogonal)

  8. Investigation / Study / Assay (ISA) Infrastructure http://isatab.sourceforge.net/ Ontology of Biomedical Investigations (OBI) http://obi.sourceforge.net/ Functional Genomics Experiment (FuGE) http://fuge.sourceforge.net/ Rise of the Metaprojects

  9. Reporting guidelines — a case in point • MIAME, MIAPE, MIAPA, MIACA, MIARE, MIFACE, MISFISHIE, MIGS, MIMIx, MIQAS, MIRIAM, (MIAFGE, MIAO), My Goodness… • ‘MI’ checklists usually developed independently, by groups working within particular biological or technological domains • Difficult to obtain an overview of the full range of checklists • Tracking the evolution of single checklists is non-trivial • Checklists are inevitably partially redundant one against another • Where they overlap arbitrary decisions on wording and sub structuring make integration difficult • Significant difficulties for those who routinely combine information from multiple biological domains and technology platforms • Example: An investigation looking at the impact of toxins on a sentinel species using proteomics (‘eco-toxico-proteomics’) • What reporting standard(s) should they be using?

  10. The MIBBI Project (mibbi.org) • International collaboration between communities developing ‘Minimum Information’ (MI) checklists • Two distinct goals (Portal and Foundry) • Raise awareness of various minimum reporting specifications • Promote gradual integration of checklists • Lots of enthusiasm (drafters, users, funders, journals) • 31 projects committed (to the portal) to date, including: • MIGS, MINSEQE & MINIMESS (genomics, sequencing) • MIAME (μarrays), MIAPE (proteomics), CIMR (metabolomics) • MIGen & MIQAS (genotyping), MIARE (RNAi), MISFISHIE (in situ)

  11. Nature Biotechnol 26(8), 889–896 (2008) http://dx.doi.org/10.1038/nbt.1411

  12. The MIBBI Project (www.mibbi.org)

  13. The MIBBI Project (www.mibbi.org)

  14. The MIBBI Project (www.mibbi.org) Interaction graph for projects (line thickness & colour saturation show similarity)

  15. The MIBBI Project (www.mibbi.org)

  16. MICheckout: Supporting Users

  17. Why should I dedicate resources to providing data to others? Pro bono arguments have no impact ‘Sticks’ from funders and publishers get the bare minimum This is just a ‘make work’ scheme for bioinformaticians Bioinformaticians get a buzz out of having big databases Bioinformaticians benefitting from others’ work I don’t trust anyone else’s data — I’d rather repeat work Problems of quality, which are justified to an extent But what of people lacking resource for this, or people who want to refer to proteomics data but don’t do proteomics How on earth am I supposed to do this anyway..? Perception that there is no money to pay for this No mature free tools — Excel sheets are no good for HT Worries about vendor support, legacy systems (business models) The objections to fuller reporting

  18. Data sharing is more or less a given now, and tools are emerging Lots of sticks, but they only get the bare minimum How to get the best out of data generators? Only meaningful credit will work Need central registries of data sets that can record reuse Well-presented, detailed papers get cited more frequently The same principle should apply to data sets So, OpenIDs for people, DOIs for data? Side-benefits, challenges Would also clear up problems around paper authorship Would enable other kinds of credit (training, curation, etc.) May have to be self-policing — researchers ‘own’ their credit portfolio (though an enforcement body would also be useful) Problem of ‘micro data sets’ and legacy data Credit where credit’s due

More Related