
TRD 2 Update: An annotation scheme to foster reproducible NMR data analysis



Presentation Transcript


  1. TRD 2 Update: An annotation scheme to foster reproducible NMR data analysis
     Matt Fenwick, Eldon Ulrich, Michael Gryk

  2. Overview of NMR spectral analysis
     - peak-picking: distinguishing signal from noise, true positives from false positives
     - resonance assignment
     - NOESY peak assignment
     semi-automated
     - software tools
     - human intervention required
     human uses a deductive process of reasoning
     - small set of rules/expectations (library)
     - deductions may be logically dependent on each other
     [figure: labels L10, A5]

  3. Problem: Missing Data -> Irreproducible
     Much intermediate data is not saved / deposited
     - step order
     - logical dependencies
     - deductive reasoning
     - peculiarities found and their resolutions (unexpected, missing, extra peaks)
     final data
     - resonances, spin systems
     - extraneous data: contaminants, noise, artifacts, anomalies ...

  4. Missing Data: Spin Systems & Resonances
     NMR experiments are designed to exploit networks of coupled spins (spin systems).
     The assignment process has two steps:
     (1) assign resonances to spin systems
     (2) assign spin systems to residues
     Resonances and spin systems are not deposited.
     Images are from Protein NMR: A Practical Guide (http://www.protein-nmr.org.uk/)
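The two-step assignment on slide 4 could be modeled roughly as follows. This is a minimal sketch, not the authors' data model: the class names, field names, and helper functions (`Resonance`, `SpinSystem`, `assign_resonance`, `assign_residue`, `residue_of`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Resonance:
    """A single resonance; all field names are illustrative assumptions."""
    res_id: int
    shift_ppm: float
    spin_system_id: Optional[int] = None  # filled in by step 1


@dataclass
class SpinSystem:
    """A network of coupled spins, not yet tied to a residue."""
    ss_id: int
    residue: Optional[str] = None  # filled in by step 2, e.g. "A5"


def assign_resonance(res: Resonance, ss: SpinSystem) -> None:
    """Step 1: assign a resonance to a spin system."""
    res.spin_system_id = ss.ss_id


def assign_residue(ss: SpinSystem, residue: str) -> None:
    """Step 2: assign a spin system to a residue."""
    ss.residue = residue


def residue_of(res: Resonance, systems: Dict[int, SpinSystem]) -> Optional[str]:
    """Resolve a resonance to a residue, or None if either step is missing."""
    if res.spin_system_id is None:
        return None
    return systems[res.spin_system_id].residue
```

Keeping the intermediate spin-system link explicit, rather than storing only the final resonance-to-residue result, is what would allow the intermediate objects to be saved and deposited.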

  5. Solution
     1. capture process of reasoning
        - version control: capture intermediate states
        - model of commonly used deductive reasons
        - annotate changeset with deductive reasons
     2. capture complete final data set
        - model for identifying problems
        - model for extraneous data
        - deposit full results

  6. Solution, part 1: version control (snapshots, commit messages)
     - snapshots of intermediate states: enable backtracking and inspection of past states
     - commit messages describe the difference between consecutive snapshots: summary, purpose, justification, questions, uncertainties
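The snapshot idea on slide 6 can be sketched with a small in-memory history. This is an illustration under stated assumptions, not the tooling the authors built: `Commit`, `History`, and the choice of a plain dict (peak ID to assignment) as the analysis state are hypothetical.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Commit:
    """One snapshot plus the annotation describing the change."""
    state: Dict[int, Any]
    summary: str
    justification: str = ""
    uncertainties: List[str] = field(default_factory=list)


class History:
    """Append-only list of snapshots (hypothetical helper class)."""

    def __init__(self) -> None:
        self.commits: List[Commit] = []

    def commit(self, state: Dict[int, Any], summary: str,
               justification: str = "",
               uncertainties: Optional[List[str]] = None) -> int:
        # Deep-copy so later edits to `state` cannot alter the snapshot.
        self.commits.append(Commit(copy.deepcopy(state), summary,
                                   justification, list(uncertainties or [])))
        return len(self.commits) - 1

    def checkout(self, index: int) -> Dict[int, Any]:
        """Return a copy of a past state (backtracking / inspection)."""
        return copy.deepcopy(self.commits[index].state)

    def diff(self, older: int, newer: int) -> Dict[int, Tuple[Any, Any]]:
        """Map each changed key to its (old, new) values."""
        a, b = self.commits[older].state, self.commits[newer].state
        return {k: (a.get(k), b.get(k))
                for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}
```

In practice a general-purpose version control system would likely play this role; the sketch only shows which information each snapshot and its message would carry.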

  7. Solution, part 1: model of NMR deductive reasoning
     - start with CCPN data model
     - augment with library of common deductive reasons
     - use deductive reasons to annotate commits

  8. Solution, part 2: model for identifying problems
     - distinguishing signal from noise; true positives, false positives, false negatives
     - facilitates re-interpretation, if additional data is collected, by pointing out trouble spots
     - examples: unassigned signal peak; missing CB peaks of Gln sidechain
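One way to picture the "trouble spots" of slide 8 is a pass over classified peaks that reports unassigned signal peaks and expected-but-missing assignments. The sketch below is an assumption-laden illustration: `PeakKind`, `Peak`, `trouble_spots`, and the use of free-text assignment labels are hypothetical and not part of the proposed model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List, Optional


class PeakKind(Enum):
    SIGNAL = "signal"
    NOISE = "noise"
    ARTIFACT = "artifact"


@dataclass
class Peak:
    peak_id: int
    kind: PeakKind
    assignment: Optional[str] = None  # free-text label, an assumption


def trouble_spots(peaks: Iterable[Peak],
                  expected: Iterable[str]) -> Dict[str, List]:
    """Report unassigned signal peaks and expected assignments never observed."""
    peaks = list(peaks)
    unassigned = [p.peak_id for p in peaks
                  if p.kind is PeakKind.SIGNAL and p.assignment is None]
    observed = {p.assignment for p in peaks if p.assignment is not None}
    missing = sorted(set(expected) - observed)
    return {"unassigned_signal": unassigned, "missing_expected": missing}
```

Recording noise and artifact peaks explicitly, instead of deleting them, keeps the full picking result available if the spectra are later re-interpreted.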

  9. Solution, part 2: extraneous data, full results
     - collaborate with BMRB: deposit full data sets
     - extend NMR-Star data dictionary
     - extend Sparky assignment program
     - noise & artifact peaks, unassigned spin systems, contaminants, anomalies, ...

  10. Review: Solution
      1. process of reasoning
         - version control: capture intermediate states
         - model of commonly used deductive reasons
         - annotate changeset with deductive reasons
      2. final data
         - model for identifying problems
         - model for extraneous data
         - deposit full results

  11. Challenges?
      - human/computer optimization
        - simple enough for users to apply properly, vs. detailed enough that a program can understand the complete context of an annotation
        - separate layers: use more/less detail as needed
        - (future) tools can increase level of detail without bogging humans down
      - future compatibility
        - library of annotations provides “guidance”; extensions can be trivially added by augmenting the library
        - if there’s a problem with the library of annotations, can fix by extending (providing a new, similar annotation)
      - tooling
        - Sparky

  12. Annotation Mock up (STAR-like format)

      data_example
      save_assign
        loop_ # tags
          _Tag.ID
          _Tag.Parent_ID
          ...
          ...
          24 23
        stop_
        loop_ # reasons used
          _Tag_Reason.ID
          _Tag_Reason.Tag_ID
          _Tag_Reason.Name
          ...
          ...
          73 24 "BMRB statistics"
          74 24 "chemical shift grouping"
        stop_
        loop_ # spin-system/amino-acid-type assignment
          _SSAA_Assn.ID
          _SSAA_Assn.SS_ID
          _SSAA_Assn.AA_ID
          ...
          ...
          101 52 Alanine
        stop_
        loop_ # peak/spin-system assignment
          _Peak_SS_Assn.ID
          _Peak_SS_Assn.SS_ID
          _Peak_SS_Assn.Peak_ID
          _Peak_SS_Assn.Peak_Spectrum
          ...
          ...
          175 52 124 HNCACB
          176 52 125 HNCACB
          177 52 126 HNCACB
          178 52 127 HNCACB
        stop_
      save_
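A loop like those in the mock-up above could be emitted from in-memory rows with a few lines of code. This is a minimal sketch that only mimics the layout of the mock-up; it is not a conforming NMR-STAR writer, and `format_loop` and `_quote` are hypothetical helper names.

```python
from typing import Iterable, Sequence


def _quote(value: object) -> str:
    """Wrap values containing whitespace in double quotes, as in the mock-up."""
    text = str(value)
    return f'"{text}"' if any(ch.isspace() for ch in text) else text


def format_loop(category: str, columns: Sequence[str],
                rows: Iterable[Sequence[object]], comment: str = "") -> str:
    """Render one STAR-like loop_ ... stop_ block as a string."""
    header = "loop_" + (f" # {comment}" if comment else "")
    lines = [header]
    # One tag per line, in the form _Category.Column.
    lines += [f"  _{category}.{col}" for col in columns]
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match column count")
        lines.append("  " + " ".join(_quote(v) for v in row))
    lines.append("stop_")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_loop(
        "Tag_Reason", ["ID", "Tag_ID", "Name"],
        [(73, 24, "BMRB statistics"), (74, 24, "chemical shift grouping")],
        comment="reasons used",
    ))
```

Generating the loops from structured rows means each annotation stays tied to the tag ID it explains, which is the link the mock-up uses to attach deductive reasons to a change.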

  13. Impact
      - reproducibility
      - error detection
      - error correction
      - collaboration
      - sharing
      - learning
      - analysis quality
      - amenability to future analysis

  14. Appendix: NMR phenomena: grouping resonances based on chemical shift

  15. Appendix: extraneous data: processing artifacts, spurious peaks

  16. Appendix: Library examples
      - Asn sidechain
      - Ala backbone
      - sequential spin systems
