Techniques for studying correlation and covariance structure

Presentation Transcript


  1. Techniques for studying correlation and covariance structure: Principal Components Analysis (PCA) and Factor Analysis

  2. Principal Component Analysis

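  3. Let $\mathbf{x}$ have a p-variate Normal distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\Sigma$. Then $\Sigma = \sum_{j=1}^{p} \lambda_j \mathbf{a}_j \mathbf{a}_j'$, where $\mathbf{a}_1, \dots, \mathbf{a}_p$ are eigenvectors of $\Sigma$ of length 1 and $\lambda_1 \ge \dots \ge \lambda_p \ge 0$ are eigenvalues of $\Sigma$.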

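  4. The Principal Components are defined by $C_j = \mathbf{a}_j'\mathbf{x}$ for $j = 1, \dots, p$, or $\mathbf{C} = A'\mathbf{x}$ with $A = [\mathbf{a}_1, \dots, \mathbf{a}_p]$, where the $\mathbf{a}_j$ are the unit-length eigenvectors and the $\lambda_j$ the eigenvalues of the covariance matrix $\Sigma$ of $\mathbf{x}$. They have covariance matrix $A'\Sigma A = \mathrm{diag}(\lambda_1, \dots, \lambda_p)$, so $C_1, \dots, C_p$ are independent with $\mathrm{Var}(C_j) = \lambda_j$.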
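
The definition above can be checked numerically. The following is a minimal sketch, assuming NumPy and an illustrative covariance matrix `sigma` and mean `mu` that are not taken from the slides: it computes the eigenvalues and unit eigenvectors, rebuilds the covariance matrix from them, and confirms on simulated data that the principal components are uncorrelated with variances close to the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population parameters (illustrative values, not from the slides).
sigma = np.array([[4.0, 2.0, 0.6],
                  [2.0, 3.0, 0.4],
                  [0.6, 0.4, 1.0]])
mu = np.array([1.0, -2.0, 0.5])

# eigh returns eigenvalues in ascending order; reorder so lambda_1 >= ... >= lambda_p.
eigvals, eigvecs = np.linalg.eigh(sigma)
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
A = eigvecs[:, order]            # columns are the unit-length eigenvectors a_1, ..., a_p

# Spectral decomposition: Sigma = sum_j lambda_j a_j a_j'
sigma_rebuilt = (A * lam) @ A.T

# Simulate x ~ N(mu, Sigma) and form the principal components C = A'x.
x = rng.multivariate_normal(mu, sigma, size=50_000)
C = x @ A                        # row i holds (C_1, ..., C_p) for observation i
cov_C = np.cov(C, rowvar=False)  # should be close to diag(lambda_1, ..., lambda_p)

print("eigenvalues:", np.round(lam, 3))
print("sample covariance of the components:")
print(np.round(cov_C, 2))
```

The off-diagonal entries of `cov_C` are only approximately zero because they are estimated from a finite sample.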

  5. Often, for large values of j, $\mathrm{Var}(C_j) = \lambda_j$ is small and contributes little to the total variance. In this case the number of variables can be reduced to a small number of principal components. In regression analysis it is sometimes useful to transform the independent variables into their principal components.

  6. Scree Plot [Figure: proportion of variance plotted against the principal components]

  7. Scree Plot [Figure: cumulative proportion of variance plotted against the principal components]
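
The quantities shown in the two scree plots can be computed directly from the eigenvalues. This is a small sketch, assuming NumPy, with made-up eigenvalues and an arbitrary 90% cutoff chosen purely for illustration:

```python
import numpy as np

# Hypothetical eigenvalues, already sorted in decreasing order.
lam = np.array([5.2, 1.9, 0.6, 0.2, 0.1])

prop = lam / lam.sum()   # proportion of total variance per component
cum = np.cumsum(prop)    # cumulative proportion of variance

# Smallest number of components whose cumulative proportion reaches the cutoff.
cutoff = 0.90            # illustrative threshold, not a rule from the slides
k = int(np.searchsorted(cum, cutoff) + 1)

for j, (p_j, c_j) in enumerate(zip(prop, cum), start=1):
    print(f"PC{j}: proportion = {p_j:.4f}, cumulative = {c_j:.4f}")
print("components retained:", k)
```

Plotting `prop` or `cum` against the component index gives the two scree plots.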

  8. Example: wildlife (moose) population density was measured over time (once a year) in three areas.

  9. [Picture showing Area 3, Area 2 and Area 1]

  10. The Sample Statistics: the mean vector, the covariance matrix, and the correlation matrix
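
The three sample statistics can be computed as follows. This is a sketch on simulated data, assuming NumPy; the moose densities themselves are not reproduced in the transcript, so the array `X` below is a hypothetical stand-in with one row per year and one column per area.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: n = 20 yearly observations in 3 areas (not the moose data).
X = rng.normal(loc=[10.0, 12.0, 8.0], scale=[2.0, 3.0, 1.5], size=(20, 3))
n = X.shape[0]

xbar = X.mean(axis=0)              # sample mean vector
S = np.cov(X, rowvar=False)        # sample covariance matrix (divisor n - 1)

# The correlation matrix rescales S by the sample standard deviations.
d = np.sqrt(np.diag(S))
R = S / np.outer(d, d)

print("mean vector:", np.round(xbar, 3))
print("covariance matrix:\n", np.round(S, 3))
print("correlation matrix:\n", np.round(R, 3))
```

A principal component analysis of standardized variables uses `R` in place of `S`.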

  11. Principal Component Analysis: the eigenvalues of S, the eigenvectors of S, and the principal components

  12. The Example

  13. Scree Plots

  14. More Examples

  15. Recall: Computation of the eigenvalues and eigenvectors of S

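  16. Continuing, we see that $S^n = \sum_{j=1}^{p} \lambda_j^n \mathbf{a}_j \mathbf{a}_j'$, where $\lambda_1 \ge \dots \ge \lambda_p$ are the eigenvalues of S and $\mathbf{a}_1, \dots, \mathbf{a}_p$ the corresponding unit eigenvectors. For large values of n, $S^n \approx \lambda_1^n \mathbf{a}_1 \mathbf{a}_1'$ (provided $\lambda_1$ is strictly the largest eigenvalue).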

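  17. The algorithm for computing the eigenvector $\mathbf{a}_1$ belonging to the largest eigenvalue $\lambda_1$ of S:
     • Compute $S^n$ for large n, rescaling so that the elements do not become too large in value, i.e. rescale so that the largest element is 1.
     • Compute $\mathbf{a}_1$ using the fact that, for large n, $S^n$ is approximately proportional to $\mathbf{a}_1 \mathbf{a}_1'$, so each nonzero column is proportional to $\mathbf{a}_1$.
     • Compute $\lambda_1$ using $S\mathbf{a}_1 = \lambda_1 \mathbf{a}_1$.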

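  18. Repeat using the matrix $S_2 = S - \lambda_1 \mathbf{a}_1 \mathbf{a}_1'$ to obtain $\lambda_2$ and $\mathbf{a}_2$.
     • Continue with i = 2, …, p - 1 using the matrix $S_{i+1} = S_i - \lambda_i \mathbf{a}_i \mathbf{a}_i'$.
     Example: using Excel (Eigen)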
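
The slides work this example in Excel. As an alternative illustration, here is a minimal Python sketch of the same idea, assuming NumPy and a hypothetical symmetric positive definite matrix with distinct eigenvalues. The function names `dominant_eigenpair` and `eigen_by_deflation` are invented for this sketch. It raises the matrix to a high power by repeated squaring, rescales at each step so the largest element is 1, reads the eigenvector from a column, and then removes that component before repeating.

```python
import numpy as np

def dominant_eigenpair(M, n_squarings=30):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix M.

    Repeated squaring gives M^(2^n_squarings), rescaled at every step so
    that the largest element stays equal to 1.
    """
    P = np.array(M, dtype=float)
    for _ in range(n_squarings):
        P = P @ P
        P = P / np.abs(P).max()          # rescale so the largest element is 1
    # For a high power, P is proportional to a_1 a_1', so a column with a
    # nonzero diagonal entry is proportional to a_1.
    col = P[:, np.argmax(np.abs(np.diag(P)))]
    a = col / np.linalg.norm(col)
    lam = a @ M @ a                      # from M a = lambda a with a'a = 1
    return lam, a

def eigen_by_deflation(M):
    """All eigenpairs, found one at a time by subtracting lambda_i a_i a_i'."""
    M = np.array(M, dtype=float)
    lams, vecs = [], []
    for _ in range(M.shape[0]):
        lam, a = dominant_eigenpair(M)
        lams.append(lam)
        vecs.append(a)
        M = M - lam * np.outer(a, a)     # remove the component just found
    return np.array(lams), np.column_stack(vecs)

# Hypothetical covariance matrix (illustrative values only).
S = np.array([[4.0, 2.0, 0.6],
              [2.0, 3.0, 0.4],
              [0.6, 0.4, 1.0]])
lams, vecs = eigen_by_deflation(S)
print("eigenvalues:", np.round(lams, 4))
print("eigenvectors (columns):\n", np.round(vecs, 4))
```

The sketch assumes strictly positive, distinct eigenvalues. With tied or zero eigenvalues the rescaling step and the column selection would need extra care. The sign of each eigenvector is arbitrary.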

  19. Factor Analysis: an alternative technique for studying correlation and covariance structure

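  20. Let $\mathbf{x} = (x_1, \dots, x_p)'$ have a p-variate Normal distribution with mean vector $\boldsymbol{\mu} = (\mu_1, \dots, \mu_p)'$.
     The Factor Analysis Model: Let $F_1, F_2, \dots, F_k$ denote independent standard normal observations (the Factors). Let $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_p$ denote independent normal random variables with mean 0 and $\mathrm{var}(\varepsilon_i) = \psi_i$. Suppose that there exist constants $\lambda_{ij}$ (the loadings) such that:
     $x_1 = \mu_1 + \lambda_{11}F_1 + \lambda_{12}F_2 + \dots + \lambda_{1k}F_k + \varepsilon_1$
     $x_2 = \mu_2 + \lambda_{21}F_1 + \lambda_{22}F_2 + \dots + \lambda_{2k}F_k + \varepsilon_2$
     …
     $x_p = \mu_p + \lambda_{p1}F_1 + \lambda_{p2}F_2 + \dots + \lambda_{pk}F_k + \varepsilon_p$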

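  21. Using matrix notation: $\mathbf{x} = \boldsymbol{\mu} + L\mathbf{F} + \boldsymbol{\varepsilon}$, where $L = (\lambda_{ij})$ is the $p \times k$ matrix of loadings, $\mathbf{F} = (F_1, \dots, F_k)'$ and $\boldsymbol{\varepsilon} = (\varepsilon_1, \dots, \varepsilon_p)'$, with $\mathrm{cov}(\mathbf{F}) = I_k$ and $\mathrm{cov}(\boldsymbol{\varepsilon}) = \Psi = \mathrm{diag}(\psi_1, \dots, \psi_p)$.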

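  22. Note: with L the matrix of loadings $\lambda_{ij}$ and $\Psi = \mathrm{diag}(\psi_1, \dots, \psi_p)$, $\Sigma = \mathrm{cov}(\mathbf{x}) = LL' + \Psi$, hence $\mathrm{var}(x_i) = \sum_{j=1}^{k} \lambda_{ij}^2 + \psi_i$. Here $\sum_{j=1}^{k} \lambda_{ij}^2$ is the component of variance of $x_i$ that is due to the common factors $F_1, F_2, \dots, F_k$, and $\psi_i$ is the component of variance of $x_i$ that is specific only to that observation.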

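  23. Determine $\mathrm{cov}(x_i, F_j)$. Recall $x_i = \mu_i + \lambda_{i1}F_1 + \lambda_{i2}F_2 + \dots + \lambda_{ik}F_k + \varepsilon_i$.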

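  24. Also $\mathrm{cov}(x_i, F_j) = \sum_{m=1}^{k} \lambda_{im}\,\mathrm{cov}(F_m, F_j) + \mathrm{cov}(\varepsilon_i, F_j)$, where $\mathrm{cov}(F_m, F_j) = 1$ if $m = j$ and 0 otherwise, and $\mathrm{cov}(\varepsilon_i, F_j) = 0$ when the $\varepsilon_i$ are independent of the factors. Thus $\mathrm{cov}(x_i, F_j) = \lambda_{ij}$. Now, if also $\mathrm{var}(x_i) = 1$, then $\lambda_{ij}$ is the correlation between $x_i$ and $F_j$.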
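
The covariance results for the factor model can be illustrated by simulation. The following is a sketch, assuming NumPy, errors that are independent of the factors, and hypothetical loadings `L` and specific variances `psi` chosen so that every variable has variance 1, in which case the loadings are also correlations.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical loadings (p = 4 variables, k = 2 factors) and specific variances.
# psi is chosen so that sum_j lambda_ij^2 + psi_i = 1 for every variable.
L = np.array([[0.9, 0.0],
              [0.8, 0.3],
              [0.2, 0.7],
              [0.0, 0.6]])
psi = np.array([0.19, 0.27, 0.47, 0.64])
mu = np.array([5.0, 0.0, -1.0, 2.0])

n = 200_000
F = rng.standard_normal((n, 2))                   # independent standard normal factors
eps = rng.standard_normal((n, 4)) * np.sqrt(psi)  # independent errors, var(eps_i) = psi_i
x = mu + F @ L.T + eps                            # x = mu + L F + eps, one row per observation

sigma_model = L @ L.T + np.diag(psi)              # covariance implied by the model
sigma_sample = np.cov(x, rowvar=False)
communality = (L ** 2).sum(axis=1)                # variance due to the common factors

# Sample covariance between each x_i and each F_j; should be close to lambda_ij.
cross = (x - x.mean(axis=0)).T @ (F - F.mean(axis=0)) / (n - 1)

print("variance due to common factors:", communality)
print("max |sample cov - model cov|:", np.abs(sigma_sample - sigma_model).max())
print("sample cov(x_i, F_j):\n", np.round(cross, 3))
```

The agreement is approximate because the sample quantities carry simulation error.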

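  25. Rotating Factors. Recall the Factor Analysis model $\mathbf{x} = \boldsymbol{\mu} + L\mathbf{F} + \boldsymbol{\varepsilon}$. This gives rise to the vector $\mathbf{x}$ having covariance matrix $\Sigma = LL' + \Psi$. Let P be any orthogonal matrix ($PP' = P'P = I$), then $\mathbf{x} = \boldsymbol{\mu} + LPP'\mathbf{F} + \boldsymbol{\varepsilon} = \boldsymbol{\mu} + L^{*}\mathbf{F}^{*} + \boldsymbol{\varepsilon}$ with $L^{*} = LP$ and $\mathbf{F}^{*} = P'\mathbf{F}$, and $\Sigma = LPP'L' + \Psi = L^{*}L^{*\prime} + \Psi$.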

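  26. Hence if $\mathbf{x} = \boldsymbol{\mu} + L\mathbf{F} + \boldsymbol{\varepsilon}$ with $\Sigma = LL' + \Psi$ is a Factor Analysis model, then so also is $\mathbf{x} = \boldsymbol{\mu} + L^{*}\mathbf{F}^{*} + \boldsymbol{\varepsilon}$ with $\Sigma = L^{*}L^{*\prime} + \Psi$, where $L^{*} = LP$, $\mathbf{F}^{*} = P'\mathbf{F}$ and P is any orthogonal matrix.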
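
A quick numerical check of this invariance, as a sketch assuming NumPy, hypothetical loadings and specific variances, and an arbitrary 30-degree rotation for P:

```python
import numpy as np

# Hypothetical loadings and specific variances (illustrative values only).
L = np.array([[0.9, 0.0],
              [0.8, 0.3],
              [0.2, 0.7],
              [0.0, 0.6]])
psi = np.array([0.19, 0.27, 0.47, 0.64])

# Any orthogonal P works; a plane rotation by 30 degrees is used here.
theta = np.deg2rad(30.0)
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

L_star = L @ P                                   # rotated loadings L* = L P

sigma = L @ L.T + np.diag(psi)
sigma_star = L_star @ L_star.T + np.diag(psi)

print("P orthogonal:", np.allclose(P @ P.T, np.eye(2)))
print("same covariance matrix:", np.allclose(sigma, sigma_star))
print("same common-factor variances:",
      np.allclose((L ** 2).sum(axis=1), (L_star ** 2).sum(axis=1)))
```

The individual loadings change under rotation, but the implied covariance matrix and each variable's common-factor variance do not, which is why the data alone cannot distinguish between rotated solutions.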

  27. The process of exploring other models through orthogonal transformations of the factors is called rotating the factors. There are many techniques for rotating the factors:
     • VARIMAX
     • Quartimax
     • Equimax
     VARIMAX rotation attempts to have each individual variable load highly on a subset of the factors.
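
As an illustration of what a rotation routine does, here is a sketch of one commonly used SVD-based iteration for the raw varimax criterion (the sum, over factors, of the variance of the squared loadings). It assumes NumPy. The function names and the example loadings are invented for this sketch, and it is not necessarily the procedure used to produce the results in the slides.

```python
import numpy as np

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return ((L ** 2).var(axis=0)).sum()

def varimax(L, max_iter=100, tol=1e-8):
    """Return (rotated loadings, orthogonal rotation matrix R).

    Each step replaces R by the orthogonal factor U V' from the SVD of the
    criterion's gradient, and stops once the sum of singular values no
    longer grows by more than a relative tolerance.
    """
    p, k = L.shape
    R = np.eye(k)
    crit_old = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        col_ss = (Lr ** 2).sum(axis=0)           # column sums of squares
        G = L.T @ (Lr ** 3 - Lr * col_ss / p)    # gradient of the criterion
        U, s, Vt = np.linalg.svd(G)
        R = U @ Vt                               # product of orthogonal matrices
        crit = s.sum()
        if crit_old != 0 and crit < crit_old * (1 + tol):
            break
        crit_old = crit
    return L @ R, R

# Hypothetical example: simple-structure loadings blurred by a 30-degree rotation.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                   [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
L = simple @ rot

L_rot, R = varimax(L)
print("criterion before:", varimax_criterion(L))
print("criterion after: ", varimax_criterion(L_rot))
print("rotated loadings:\n", np.round(L_rot, 3))
```

Because `R` is orthogonal, `L_rot @ L_rot.T` equals `L @ L.T`, so the rotated loadings describe the same covariance structure. Columns may come back permuted or with flipped signs.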

  28. Example: Olympic decathlon scores. Data was collected for n = 160 starts (139 athletes) for the ten decathlon events (100-m run, Long Jump, Shot Put, High Jump, 400-m run, 110-m hurdles, Discus, Pole Vault, Javelin, 1500-m run). The sample correlation matrix is given on the next slide.

  29. Correlation Matrix

  30. Identification of the factors
