
Reanalysis of Petricoin et al. Ovarian Cancer Data Set 3

Russ Wolfinger and Geoff Mann, SAS Institute Inc. NISS Proteomics Workshop, March 6, 2003. Ovarian cancer mass spectrometry data from http://clinicalproteomics.steem.com: 91 normals, 162 cancers.



Presentation Transcript


  1. Reanalysis of Petricoin et al. Ovarian Cancer Data Set 3. Russ Wolfinger and Geoff Mann, SAS Institute Inc. NISS Proteomics Workshop, March 6, 2003

  2. Ovarian Cancer Mass Spec Data, from http://clinicalproteomics.steem.com: 91 Normals, 162 Cancers

  3. What We’d Love to See

  4. What We Are Seeing. [Figure] Green: Cancer, Red: Normal. Left: Green in Front; Right: Red in Front

  5. New Paper from MD Anderson. Baggerly, K.A., Morris, J.S., and Coombes, K.R. (2003). Cautions about Reproducibility in Mass Spectrometry Patterns: Joint Analysis of Several Proteomic Data Sets. Email: kabagg@mdanderson.org • Reanalyses of all three ovarian cancer data sets • For data set 3, they note that two pairs of m/z values provide perfect discrimination: 435.46 & 465.57, and 2.79 & 245.2 • Easy to find with simple t-tests; a genetic algorithm is unnecessary
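The "simple t-tests" remark on slide 5 can be illustrated with a minimal sketch. It assumes each spectrum is a list of intensities on a shared m/z grid and ranks single features by the absolute Welch t statistic. The data below are synthetic stand-ins (not the real spectra), and the function names are hypothetical. The slide reports pairs of m/z values; this sketch only ranks individual features, which would be a natural first screening step.

```python
import math
import random
from statistics import mean, variance

def welch_t(a, b):
    """Welch two-sample t statistic for two lists of intensities."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def rank_features(cancer, normal):
    """Return feature indices sorted by |t|, largest first.

    cancer and normal are lists of spectra; each spectrum is a list of
    intensities aligned on the same m/z grid (an assumption of this sketch).
    """
    n_features = len(cancer[0])
    scores = []
    for j in range(n_features):
        t = welch_t([s[j] for s in cancer], [s[j] for s in normal])
        scores.append((abs(t), j))
    return [j for _, j in sorted(scores, reverse=True)]

# Synthetic stand-in data (NOT the real spectra): 162 "cancer" and 91
# "normal" spectra with 50 features; only feature 7 differs between groups.
rng = random.Random(0)

def fake_spectrum(shift):
    s = [rng.gauss(0.0, 1.0) for _ in range(50)]
    s[7] += shift
    return s

cancer = [fake_spectrum(3.0) for _ in range(162)]
normal = [fake_spectrum(0.0) for _ in range(91)]
ranking = rank_features(cancer, normal)
print("top-ranked feature index:", ranking[0])
```

On real spectra the top-ranked indices would be mapped back to their m/z values and inspected, singly or in pairs.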

  6. First Pair: 435.46 and 465.57 Da. [Figure] Green: Cancer, Red: Normal. Left: Green in Front; Right: Red in Front

  7. Second Pair: 2.79 and 245.2 Da. [Figure] Green: Cancer, Red: Normal. Left: Green in Front; Right: Red in Front

  8. Questions • What’s going on here? • Are discriminators <500 Da generalizable? • How about >500 Da?

  9. Going Small: 435 Da • At least 100 peptide fragments (including permutations) add up to 435, e.g. AFY, SMY, PPW, KNH, GGGAC, SSGGG • 30 hits from ChemFinder.com, including sphingosylphosphocholine, a lipid molecule • Similar kind of story for 465 Da
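The mass arithmetic on slide 9 can be sketched as a small enumeration. The slide's examples (AFY, SMY, PPW, KNH, GGGAC, SSGGG) land near 435 Da when the average masses of the free amino acids are summed directly, so that is what this sketch assumes. A bonded peptide would weigh about 18 Da less per peptide bond (one water lost), which is not modelled here. The mass table holds approximate average molecular weights, and the 2 Da tolerance is an assumption chosen so that all six examples fall inside the window.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial

# Approximate average molecular weights (Da) of the 20 free amino acids.
FREE_AA_MASS = {
    "G": 75.07, "A": 89.09, "S": 105.09, "P": 115.13, "V": 117.15,
    "T": 119.12, "C": 121.16, "L": 131.17, "I": 131.17, "N": 132.12,
    "D": 133.10, "Q": 146.15, "K": 146.19, "E": 147.13, "M": 149.21,
    "H": 155.16, "F": 165.19, "R": 174.20, "Y": 181.19, "W": 204.23,
}

def compositions_near(target, tol, max_len=5):
    """List amino acid compositions (sorted tuples) whose summed
    free-residue masses fall within tol of target."""
    letters = sorted(FREE_AA_MASS)
    hits = []
    for n in range(1, max_len + 1):
        for combo in combinations_with_replacement(letters, n):
            total = sum(FREE_AA_MASS[a] for a in combo)
            if abs(total - target) <= tol:
                hits.append(combo)
    return hits

def orderings(combo):
    """Number of distinct sequences (permutations) of one composition."""
    count = factorial(len(combo))
    for k in Counter(combo).values():
        count //= factorial(k)
    return count

# Six glycines already exceed the window, so five residues is the maximum.
hits = compositions_near(435.0, 2.0)
n_sequences = sum(orderings(c) for c in hits)
print(len(hits), "compositions,", n_sequences, "ordered sequences")
```

Counting ordered sequences rather than compositions corresponds to the slide's "including permutations".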

  10. Going Large: Cross-Validated Stepwise Discriminant Analysis • Subtract baselines and determine 330 most prominent peak areas, all with m/z > 600. • Form 500 random partitions of the 253 spectra, with a 33% stratified holdout sample in each. • Perform stepwise discriminant analysis on each partition, using entry p = 0.05, exit p = 0.20, and max variables = 5. • Compute misclassification rate on each trial.
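The partitioning scheme on slide 10 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the stepwise discriminant analysis step is replaced by a nearest-centroid classifier as a stand-in, the data are synthetic, and all function names are hypothetical. Only the repeated stratified holdout and the per-trial misclassification rate follow the slide.

```python
import random

def stratified_holdout(labels, holdout_frac, rng):
    """Split indices into (train, test), holding out the same
    fraction of each class."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(holdout_frac * len(idx))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test

def centroid_classifier(X, y, train):
    """Stand-in classifier (nearest class centroid). This is NOT
    stepwise discriminant analysis; it only fills the model slot."""
    centroids = {}
    for cls in set(y):
        rows = [X[i] for i in train if y[i] == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )
    return predict

def cv_error_rates(X, y, n_partitions=500, holdout_frac=0.33, seed=0):
    """Holdout misclassification rate for each random partition."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_partitions):
        train, test = stratified_holdout(y, holdout_frac, rng)
        predict = centroid_classifier(X, y, train)
        errors = sum(predict(X[i]) != y[i] for i in test)
        rates.append(errors / len(test))
    return rates

# Synthetic stand-in data with the slide's class sizes (91 normal, 162 cancer).
data_rng = random.Random(1)
y = ["normal"] * 91 + ["cancer"] * 162
X = [[data_rng.gauss(2.0 if lab == "cancer" else 0.0, 1.0) for _ in range(5)]
     for lab in y]
rates = cv_error_rates(X, y, n_partitions=20)  # 20 trials to keep the demo fast
print("mean holdout misclassification rate:", sum(rates) / len(rates))
```

Fitting the model only on the training indices of each partition, including any variable selection, is what keeps the holdout error an honest estimate.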

  11. Results of Cross-Validated Stepwise Discriminant Analysis • Always picked 5 variables • Misclassification rate = 5% • Most common discriminators (m/z): 681, appeared in 100% of selected quintuples; 7379, in 63%; 869, in 54%; 4004, in 44%
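Selection frequencies like those on slide 11 can be tallied with a counter over the variables chosen in each trial. The sketch below uses four hypothetical quintuples purely for illustration; they are not the workshop results, and the function name is an assumption.

```python
from collections import Counter

def selection_frequency(selected_sets):
    """Fraction of trials in which each variable was selected."""
    counts = Counter(v for s in selected_sets for v in set(s))
    n_trials = len(selected_sets)
    return {v: c / n_trials for v, c in counts.most_common()}

# Hypothetical selections from four trials (illustrative values only).
trials = [
    (681, 7379, 869, 4004, 1200),
    (681, 7379, 869, 2500, 3100),
    (681, 7379, 4004, 1500, 900),
    (681, 869, 4004, 2200, 6100),
]
freq = selection_frequency(trials)
for mz, share in freq.items():
    print(mz, f"{share:.0%}")
```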

  12. Partial Least Squares on the Same 330 Peak Areas

  13. Parting Shots • Statistical discrimination is relatively easy for these data, but what are the real explanations for the clear differences in data set 3? • Can statisticians overcome their biases and win the day? • Is this kind of approach a red herring or a red snapper?
