
Bayesian Factor Regression Models in the “Large p, Small n” Paradigm Mike West, Duke University


Presentation Transcript


  1. Bayesian Factor Regression Models in the “Large p, Small n” Paradigm. Mike West, Duke University. Presented by: John Paisley, Duke University

  2. Outline • Empirical Factor Regression (SVD) • Latent Factor Regression • Sparse Factor Regression

  3. Linear Regression & Empirical Factor Regression • Linear Regression • SVD Regression, where D is the diagonal matrix of singular values of X (see the sketch below)
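For reference, a sketch of the two models this slide refers to, written in a standard notation that is assumed here rather than taken from the slides (X is the p × n predictor matrix, x_i its i-th column, and k factors are retained):

    \begin{align*}
      y_i &= \beta' x_i + \nu_i, \quad \nu_i \sim N(0, \sigma^2)
          && \text{(linear regression)} \\
      X   &= A\,D\,F', \quad x_i = A\,D\,f_i
          && \text{(SVD; $D$ diagonal of singular values)} \\
      y_i &= \theta' f_i + \nu_i, \quad \theta = D\,A'\beta
          && \text{(the same regression on the SVD factors)}
    \end{align*}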

  4. Empirical Factor Regression • By definition, θ = DA′β • Regression is now done in factor space using generalized shrinkage (ridge regression) priors on θ, e.g. an RVM • Problem of inversion: the mapping from β to θ is many-to-one • β = AD⁻¹θ is the canonical “least-norm” inverse
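A minimal numerical sketch of the idea, under stated assumptions: plain ridge shrinkage in factor space stands in here for the generalized shrinkage (RVM) priors mentioned on the slide, and the function and variable names are illustrative rather than from the paper.

    import numpy as np

    def svd_factor_regression(X, y, k, ridge=1.0):
        """X: p x n predictors (p variables, n samples); y: length-n response."""
        # Thin SVD: X = A D F', with D the diagonal matrix of singular values
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        A, d, F = U[:, :k], s[:k], Vt[:k, :].T        # keep the top-k factors
        # Regression in factor space: y_i = theta' f_i + noise, with theta = D A' beta;
        # ridge shrinkage stands in for the generalized shrinkage (RVM) prior on theta
        theta = np.linalg.solve(F.T @ F + ridge * np.eye(k), F.T @ y)
        # Canonical least-norm inverse back to predictor space: beta = A D^{-1} theta
        beta = A @ (theta / d)
        return theta, beta

    # Toy "large p, small n" illustration
    rng = np.random.default_rng(0)
    p, n = 200, 30
    X = rng.normal(size=(p, n))
    y = X.T @ rng.normal(size=p) / np.sqrt(p) + 0.1 * rng.normal(size=n)
    theta, beta = svd_factor_regression(X, y, k=16)
    print(beta.shape)   # (200,): a coefficient for every predictor

With p = 200 predictors and n = 30 samples the fit is well defined only because of the shrinkage, and, as the biscuit-dough example on slide 6 notes, the recovered β is not sparse.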

  5. Example: Biscuit Dough Data • NIR spectroscopy reflectance values are predictors • Response is fat content of dough samples • 39 training, 39 testing: data are pooled and testing data responses treated as missing values to be imputed • Top 16 factors used, based on size of singular values

  6. Example: Biscuit Dough Data (2) • Left: fitted and predicted values vs. true values • Right: the least-norm inverse estimate of β across wavelengths • The ~1700 nm range is the absorbance region for fat • As can be seen, the solution is not sparse

  7. Latent Factor Regression • Loosen the exact SVD factorization to a latent factor model with idiosyncratic noise (sketched below) • Under proper constraints on B, this finds common structure in X and isolates idiosyncrasies to the noise • Now, variation in X has less effect on y • The implied prior on each x_i follows from the factor model • When the idiosyncratic variance Φ → 0, this reverts to empirical factor regression
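A sketch of the latent factor regression form the slide alludes to, in standard notation (the symbols B, λ_i, θ, and Φ are assumptions; Φ denotes the diagonal idiosyncratic variance the slide calls Phi):

    \begin{align*}
      x_i &= B\,\lambda_i + \varepsilon_i, \quad \lambda_i \sim N(0, I_k), \ \varepsilon_i \sim N(0, \Phi)
          && \text{(latent factor model; $\Phi$ diagonal)} \\
      y_i &= \theta'\lambda_i + \nu_i
          && \text{(response regressed on the latent factors)} \\
      x_i &\sim N(0,\ B\,B' + \Phi)
          && \text{(implied marginal on each $x_i$)}
    \end{align*}

As Φ → 0 the predictors become an exact linear function of the factors, recovering the empirical (SVD) factor setting.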

  8. Sparse Latent Factor Regression • WRT gene expression profiling, “multiple biological factors underlie patterns of gene expression variation, so latent factor approaches are natural – we imagine that latent factors reflect individual biological functions… This is a motivating context for sparse models.” • Each column of B represents the genes involved in a particular biological factor. • Each row of B represents a particular gene’s involvement across biological factors.
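Sparsity in B is usually induced with point-mass mixture (“spike-and-slab”) priors on the individual loadings; a sketch of that standard form, with hyperparameters π_{ij} and τ_j that are assumptions here rather than values from the talk:

    \begin{align*}
      b_{ij} &\sim (1 - \pi_{ij})\,\delta_0 + \pi_{ij}\,N(0, \tau_j)
             && \text{($\delta_0$: point mass at zero)}
    \end{align*}

Here π_{ij} is the probability that gene i loads on factor j, so a posteriori most entries of B are exactly zero: each column keeps only the genes involved in that factor.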

  9. Example: Gene Expression Data • p = 6128 genes measured using Affymetrix DNA microarrays • n = 49 breast cancer tumor samples • k = 25 factors • Factor 3 separates the tumors: red, estrogen receptor (ER) positive; blue, ER negative

  10. Example: Gene Expression Data • Comparison with results obtained using empirical SVD factors

  11. Conclusion • Sparse factor regression modeling is a promising framework for dimensionality reduction of predictors. • Only those factors that are relevant (e.g. factor 3) are of interest. Therefore, only those genes with non-zero values in that column of B are meaningful.
