Data Analysis Summary

This summary discusses the challenges and opportunities in data analysis for medical sequencing and other "-omics" in clinical settings. It highlights the need for reproducible analytical methods, versioning of data and analysis pipelines, and a reference database of clinically actionable variants. It also emphasizes data sharing and the standards and protocols that sharing requires.

candicep

Presentation Transcript


  1. Data Analysis Summary

  2. Elephant in the room

  3. General Comments • General understanding that informatics is integral in medical sequencing and other –omics in clinical settings • About 80% of attendees were actively involved in data analysis • Clinical practitioners also present • Talking about data analysis is difficult • We do not yet have language to do so • Complicated • Not clear what details are important

  4. Overview • Analytical Validity • Enhance Clinical Utility • Data Sharing • Messages to NCI

  5. Analytical Validity • Definition of “mutation”, both at the level of variant calling and with “established” calls, remains unclear • Reproducible analytical methods in both research and clinical practice needed • Versioning of raw data and processed data • Versioning of data analysis pipelines • Versioning of auxiliary data (gene models, sequences, etc.)
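The three versioning bullets on slide 5 could be tied together in a single provenance record. What follows is a minimal sketch, not a method described in the presentation: `build_manifest`, its field names, and the example version strings are all hypothetical, and content hashing is just one plausible way to pin down exactly which data an analysis used.

```python
import hashlib
import json


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(raw_data: bytes, processed_data: bytes,
                   pipeline_version: str, auxiliary: dict) -> dict:
    """Record which data, pipeline, and auxiliary resources produced a result.

    All field names here are illustrative assumptions, not a standard.
    """
    manifest = {
        "raw_data_sha256": sha256_of(raw_data),
        "processed_data_sha256": sha256_of(processed_data),
        "pipeline_version": pipeline_version,
        # Auxiliary data: gene models, reference sequences, and so on.
        "auxiliary": dict(sorted(auxiliary.items())),
    }
    # Hash the canonical JSON form so that any change to the inputs,
    # the pipeline version, or the auxiliary data yields a new identifier.
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    manifest["manifest_id"] = sha256_of(canonical)
    return manifest


if __name__ == "__main__":
    example = build_manifest(
        raw_data=b"placeholder raw reads",
        processed_data=b"placeholder variant calls",
        pipeline_version="hypothetical-pipeline-1.2.0",
        auxiliary={"gene_model": "hypothetical-v1",
                   "reference_sequence": "hypothetical-build"},
    )
    print(json.dumps(example, indent=2))
```

Two runs with identical inputs produce the same `manifest_id`, while changing any one component (for example, the pipeline version) produces a different one, which is the property a reproducibility check would rely on.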

  6. Analytical Validity • Understanding that algorithms that use all the information (tumor/normal or multiple tumor samples) yield higher sensitivity and specificity • Data archiving and sharing becomes important • Archiving blocks is not enough; best to archive data as well • Need to provide confidence associated with variants since no test is 100% accurate

  7. Analytical Validity • “Reference” genome • A computational reference is important to allow communication of findings • Lack of adequate ground-truth datasets precludes rigorous evaluation of analysis, particularly in quantifying false-negative rate • Tumor heterogeneity, tissue heterogeneity, and even stochastic sampling at the sample level remain challenges in establishing analytical validity
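To make the terms on slides 6 and 7 concrete, here is a minimal sketch of how sensitivity, specificity, and false-negative rate would be computed if a ground-truth dataset were available. The function `evaluate_calls` and the idea of representing variants as set members over a fixed set of assessed sites are assumptions for illustration only; the presentation does not specify an evaluation procedure.

```python
from typing import Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def evaluate_calls(called: set, truth: set, assessed: set) -> dict:
    """Compare variant calls against a ground-truth set.

    called   -- sites the pipeline reported as variant
    truth    -- sites known to be variant (hypothetical ground truth)
    assessed -- every site that was examined; must contain both sets above
    """
    if not (called <= assessed and truth <= assessed):
        raise ValueError("called and truth must be subsets of assessed")

    tp = len(called & truth)             # true positives
    fp = len(called - truth)             # false positives
    fn = len(truth - called)             # false negatives (missed variants)
    tn = len(assessed - called - truth)  # true negatives

    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "false_negative_rate": _ratio(fn, tp + fn),
    }


if __name__ == "__main__":
    sites = set(range(1, 11))  # ten assessed sites, purely illustrative
    print(evaluate_calls(called={1, 2, 3, 5}, truth={1, 2, 3, 4}, assessed=sites))
```

The false-negative rate depends entirely on `truth`: without a trusted list of variants that should have been found, missed calls cannot be counted, which is why the lack of ground-truth datasets blocks this kind of evaluation.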

  8. Enhance Clinical Utility • Lacking large (tens of thousands of patients), well-annotated databases of normal or disease patients • Definition of clinically actionable remains unclear • Reference database of clinically actionable variants • Does not exist • Will be challenging to update and maintain • Incorporating clinical context is difficult but probably necessary if one is to truly achieve precision medicine

  9. Enhance Clinical Utility • Establish methods of reporting that empower the clinician • Enough detail to be helpful • Not so much detail as to be unintelligible • Integrate with online databases and knowledge

  10. Data Sharing • Need to establish standards for data sharing in both research and clinical venues (think Myriad and BRCA1 testing) • Protocols, both computational and laboratory • Controlled vocabularies • Clinical data • The data themselves • Consider incentivizing data sharing • Pay-to-play sharing • NCI mandate

  11. Data Sharing • What constitutes de-identified data? • Need to respect rights of patients, including protecting AND sharing data • Need some way to feed clinical information back into the informatics pipeline • Clinicians need actionable information with as much interpretation with regard to literature and knowledgebase as possible

  12. Messages to NCI • Critical need to establish ground truth datasets and biologics • Fix TCGA! • NCI should collect and maintain knowledgebase of “clinically actionable” information (variants, genes, pathways) • Start by collecting and updating lists from large medical centers • Enhance PDQ database to include computable information on molecular targets under study

  13. Messages to NCI • More input is needed when NCI is planning bioinformatics, computational biology, and biomedical informatics • Granting mechanism? • Less top-down approach to informatics • Establish and ENFORCE rational data sharing mechanisms for NCI-sponsored clinical trials • SRA is not the answer…

  14. Patient and Population Characteristics • Gene Expression • Gene Copy Number • Transcriptional Regulation • Phenotype • DNA Methylation • Chromatin Structure and Function • Sequence Variation

  15. Questions
