Principal Component Regression Analysis

Presentation Transcript


  1. Principal Component Regression Analysis
  • Pseudo Inverse
  • “Heisenberg Uncertainty” for Data Mining
  • Explicit Principal Components
  • Implicit Principal Components
  • NIPALS Algorithm for Eigenvalues and Eigenvectors
  • Scripts
    - PCA transformation of data
    - Pharma-plots
    - PCA training and testing
    - Bootstrap PCA
    - NIPALS and other PCA algorithms
  • Examples
  • Feature selection

  2. Classical Regression Analysis
  • Least-Squares Optimization leads to the pseudo inverse (Penrose inverse):
    X^+ = (X^T X)^{-1} X^T, where X is the n x m data matrix
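As a concrete illustration (not from the original slides), here is a minimal numpy sketch of the least-squares weights via the pseudo-inverse; the data are made up and the variable names are illustrative:

```python
import numpy as np

# Illustrative data: X is n x m (n patterns, m features), y is n x 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = rng.normal(size=(100, 1))

# Penrose pseudo-inverse: X+ = (X^T X)^{-1} X^T, giving the least-squares weights.
w = np.linalg.inv(X.T @ X) @ X.T @ y

# Equivalent, numerically safer route via numpy's built-in pseudo-inverse.
w_pinv = np.linalg.pinv(X) @ y
assert np.allclose(w, w_pinv)
```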

  3. The Machine Learning Paradox
  • If data can be learned from, they must have redundancy
  • If there is redundancy, (X^T X)^{-1} is ill-conditioned
    - similar data patterns
    - closely correlated descriptive features
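A small made-up illustration of this paradox (a numpy sketch, not from the slides): the more closely correlated the descriptive features, the worse the conditioning of X^T X.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
base = rng.normal(size=(n, 1))

# Nearly duplicated (closely correlated) descriptive features.
X_redundant = np.hstack([base + 1e-3 * rng.normal(size=(n, 1)) for _ in range(5)])
# Independent features for comparison.
X_independent = rng.normal(size=(n, 5))

print(np.linalg.cond(X_redundant.T @ X_redundant))     # huge: ill-conditioned
print(np.linalg.cond(X_independent.T @ X_independent)) # modest: well-conditioned
```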

  4. Beyond Regression
  X (n x m) ≈ T (n x h) B^T (h x m),  with scores T (n x h) = X (n x m) B (m x h),  h = # principal components
  • Paul Werbos motivated going beyond regression in 1972
  • In addition, there are related statistical “duals” (PCA, PLS, SVM)
  • Principal component analysis: the trick is to eliminate poor conditioning by using only the h PCs with the largest eigenvalues
  • Now the matrix to invert is small and well-conditioned
  • Generally include ~2-6 PCAs
  • A better PCA regression is PLS (Please Listen to Svante Wold)
  • A better PLS is nonlinear PLS (PNLS)

  5. Explicit PCA Regression
  X ≈ T B^T,  T = X B,  h = # principal components
  • We had the classical pseudo-inverse regression for X
  • Assume we derive PCA features (the scores T) for X according to the decomposition above
  • We now regress the response on the h scores instead of the m original features
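A minimal sketch of explicit PCA regression, assuming zero-centered data and using a direct eigendecomposition of X^T X in place of NIPALS; the helper name explicit_pca_regression is illustrative:

```python
import numpy as np

def explicit_pca_regression(X, y, h):
    """Regress y on the first h principal-component scores of X.

    Sketch: B holds the top-h eigenvectors of X^T X (X assumed zero-centered),
    T = X B are the scores, and the small, well-conditioned h x h matrix T^T T
    is inverted instead of the ill-conditioned m x m matrix X^T X.
    """
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # NIPALS would build B and T iteratively
    order = np.argsort(eigvals)[::-1][:h]        # largest eigenvalues first
    B = eigvecs[:, order]                        # m x h loadings
    T = X @ B                                    # n x h scores
    a = np.linalg.inv(T.T @ T) @ T.T @ y         # regression weights on the scores
    w = B @ a                                    # equivalent weights on the original features
    return B, a, w
```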

  6. Explicit PCA Regression on a training/test set
  • For the training set: derive the loadings B and scores T, and fit the regression weights on T
  • For the test set: project onto the training loadings B and apply the trained weights
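A sketch of how the training-set quantities carry over to the test set (the test data are centered with the training means and projected onto the training loadings B); it reuses the illustrative explicit_pca_regression helper from the previous sketch:

```python
import numpy as np

# Assumes explicit_pca_regression() from the previous sketch.
def pca_regression_train_test(X_train, y_train, X_test, h):
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean(axis=0)
    B, a, _ = explicit_pca_regression(X_train - x_mean, y_train - y_mean, h)
    T_test = (X_test - x_mean) @ B     # test-set scores in the training basis
    return T_test @ a + y_mean         # predicted responses for the test set
```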

  7. Implicit PCA Regression
  X ≈ T B^T,  T = X B,  h = # principal components
  How to apply?
  • Calculate T and B with the NIPALS algorithm
  • Determine b, and apply it to the data matrix
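In code, the implicit route amounts to folding the score-space weights back into a single feature-space weight vector b that is applied directly to the data matrix; a sketch, assuming T and B come from NIPALS and the data are zero-centered:

```python
import numpy as np

# Assumes T (n x h scores) and B (m x h loadings), e.g. from NIPALS,
# plus the centered data matrix X and the response y.
def implicit_pca_regression(X, y, T, B):
    b = B @ np.linalg.inv(T.T @ T) @ T.T @ y   # single weight vector in feature space
    return X @ b                               # apply b directly to the data matrix
```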

  8. Algorithm
  X ≈ T B^T,  T = X B,  h = # principal components
  • The B matrix is a matrix of eigenvectors of the correlation matrix C
  • If the features are zero-centered we have C ∝ X^T X
  • We only consider the h eigenvectors corresponding to the largest eigenvalues
  • The eigenvalues are the variances of the corresponding scores
  • Eigenvectors are normalized to 1 and are solutions of C b = λ b
  • Use the NIPALS algorithm to build up B and T
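A quick numeric check of these statements on made-up data; the normalization C = X^T X / (n - 1) after Mahalanobis scaling is an assumption of the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # Mahalanobis scaling: zero mean, unit variance

C = X.T @ X / (X.shape[0] - 1)                     # correlation matrix of the scaled features
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
lam, B = eigvals[order], eigvecs[:, order]

T = X @ B                                          # scores
assert np.allclose(np.linalg.norm(B, axis=0), 1.0) # eigenvectors normalized to 1
assert np.allclose(T.var(axis=0, ddof=1), lam)     # eigenvalues are the score variances
assert np.allclose(C @ B, B * lam)                 # eigenvector equation C b = lambda b
```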

  9. NIPALS Algorithm: Part 2
  X ≈ T B^T,  T = X B,  h = # principal components
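The algorithm steps themselves appeared only as a slide graphic, so the following is a sketch of the standard NIPALS recursion for the first h components (pick a score vector t, estimate the loading b = X^T t / (t^T t), normalize b, recompute t = X b, iterate to convergence, then deflate X and repeat); names are illustrative:

```python
import numpy as np

def nipals_pca(X, h, tol=1e-10, max_iter=500):
    """NIPALS: build loadings B (m x h) and scores T (n x h) one component at a time.

    A sketch of the standard NIPALS recursion; X is assumed zero-centered.
    """
    X = X.copy()
    n, m = X.shape
    T = np.zeros((n, h))
    B = np.zeros((m, h))
    for k in range(h):
        t = X[:, [np.argmax(X.var(axis=0))]]   # start from the highest-variance column
        for _ in range(max_iter):
            b = X.T @ t / (t.T @ t)            # loading estimate
            b /= np.linalg.norm(b)             # normalize eigenvector to length 1
            t_new = X @ b                      # score estimate
            if np.linalg.norm(t_new - t) < tol:
                t = t_new
                break
            t = t_new
        T[:, [k]], B[:, [k]] = t, b
        X = X - t @ b.T                        # deflate: remove this component from X
    return T, B
```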

  10. PRACTICAL TIPS FOR PCA
  • The NIPALS algorithm assumes the features are zero-centered
  • It is standard practice to do a Mahalanobis scaling of the data
  • PCA regression does not consider the response data
  • The t’s are called the scores
  • Use 3-10 PCAs (I usually use 4 PCAs)
  • It is common practice to drop 4-sigma outlier features (if there are many features)
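A sketch of the Mahalanobis (auto-) scaling step mentioned above, with the training statistics reused for any later test data; leaving zero-variance columns unscaled is an assumption of the sketch:

```python
import numpy as np

def mahalanobis_scale(X_train, X_test=None):
    """Zero-center each feature and scale it to unit standard deviation.

    Training means and standard deviations are reused for the test set;
    constant (zero-variance) columns are centered but left unscaled.
    """
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    std[std == 0] = 1.0
    scale = lambda X: (X - mean) / std
    return (scale(X_train), scale(X_test)) if X_test is not None else scale(X_train)
```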

  11. PCA with Analyze
  • Several options: option #17 for training and #18 for testing
    (the weight vectors after training are in the file bbmatrixx.txt)
  • The file num_eg.txt contains a number equal to the # of PCAs
  • Option -17 is the NIPALS algorithm and is generally faster than 17
  • Analyze has options for calculating T’s, B’s and λ’s
    - option #36 transforms a data matrix to its PCAs
    - option #36 also saves the eigenvalues and eigenvectors of X^T X
  • Analyze also has an option for bootstrap PCA (-33)

  12. StripMiner Scripts
  • Last lecture: iris_pca.bat (make PCAs and visualize)
  • iris.bat (split the data into training and validation sets and predict)
  • iris_boot.bat (bootstrap prediction)

  13. Bootstrap Prediction (iris_boot.bat)
  • Make several different models from the training set
  • Predict the test set with the average model
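A sketch of the bootstrap idea (not the iris_boot.bat script itself): fit several PCA-regression models on bootstrap resamples of the training set and average their test-set predictions. It reuses the illustrative pca_regression_train_test helper from the earlier sketch; n_models and h = 4 are arbitrary choices:

```python
import numpy as np

# Assumes pca_regression_train_test() from the earlier sketch.
def bootstrap_pca_predict(X_train, y_train, X_test, h=4, n_models=25, seed=0):
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X_train), size=len(X_train))  # resample with replacement
        preds.append(pca_regression_train_test(X_train[idx], y_train[idx], X_test, h))
    return np.mean(preds, axis=0)   # average the bootstrap models' predictions
```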

  14. Neural Network Interpretation of PCA

  15. PCA in DATA SPACE
  [Slide figure: a network with inputs x1 ... xM, a first layer of summation nodes that gives a similarity score with each training data point, hidden weights, and a single output node.]
  • The similarity score with each data point is weighted (i.e., effectively incorporating Mahalanobis scaling in data space)
  • One set of weights corresponds to the h eigenvectors corresponding to the largest eigenvalues of X^T X
  • One set of weights corresponds to the scores or PCAs for the entire training set
  • The output weights correspond to the dependent variable for the entire training data
  • The result is a kind of weighted nearest-neighbor prediction score
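A numeric sanity check of this data-space reading, based on the identity B = X^T T Λ^{-1} (not shown on the slide): the PCA-regression prediction for a new point can be written entirely in terms of similarity scores with the training points, the training scores T, the eigenvalues, and the training responses y, and it matches the ordinary feature-space prediction. The sketch reuses the illustrative explicit_pca_regression helper and made-up data:

```python
import numpy as np

# Assumes explicit_pca_regression() from the earlier sketch; X is zero-centered.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8)); X -= X.mean(axis=0)
y = rng.normal(size=(50, 1))
h = 4

B, a, w = explicit_pca_regression(X, y, h)
T = X @ B
lam = np.diag(T.T @ T)                    # eigenvalues of X^T X for the kept components

x_new = rng.normal(size=(8, 1))
y_feature_space = (x_new.T @ w).item()    # usual prediction with feature-space weights

# Data-space form: similarity with every training point, then scores, then y.
s = X @ x_new                             # layer 1: similarity score with each data point
t = (T / lam).T @ s                       # layer 2: weights are the training scores
y_data_space = (y.T @ (T / lam) @ t).item()   # output: weights are the training responses

assert np.isclose(y_feature_space, y_data_space)
```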
