
Linear Regression Analysis with a focus on Influence Diagnostics using proc reg, prepared by Voytek Grus for SAS user group, Halifax, February 23, 2007



  1. Linear Regression Analysis with a focus on Influence Diagnostics using proc reg, prepared by Voytek Grus for SAS user group, Halifax, February 23, 2007

  2. Introduction: What is Regression Analysis? • A broad collection of statistical techniques used to explore relationships between measurable variables. • Its primary purpose is to describe the relationship between variables (the model) and to predict the response or study the model's components (coefficients). • A central idea of regression analysis (RA) is that it describes a statistical (stochastic) process, not a deterministic equation. • A subgroup of Generalized Linear Models and/or Multivariate Analysis.
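
The stochastic view on this slide can be written out as the usual multiple linear regression model. This is a sketch in standard textbook notation, not taken from the slides: the symbols (p predictors, coefficients β, error term ε) and the normal, independent, constant-variance error assumption are the classical ones and are assumptions here.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i,
\qquad \varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2),
\qquad i = 1, \dots, n
```

Dropping the error term would leave a deterministic equation. With it, the response is a random variable whose mean is described by the coefficients.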

  3. Introduction: Types of Regression Analysis • Data types and statistical techniques • Analysis of observational versus experimental data (proc rsreg) • Discrete response variable: logistic regression (proc logistic, transreg) • Time series versus cross-sectional data (procs autoreg, pdlreg, arima) • Survival analysis: lifetime or failure time (proc lifereg) • Regression on random predictors • Simultaneous econometric equations (procs model, syslin) • Structural Equation Modeling (proc calis) • Estimation techniques • Linear vs non-linear (proc nlin, nlinmix) • Least squares vs non-least squares such as MLE (proc robustreg) • Least squares vs partial least squares (proc pls) • Multivariate regression (multiple response regression)

  4. SAS offers many diverse tools to do regression analysis • A good way to start is to read about RA in SAS Help. • Chapter 2, “Introduction to Regression Procedures,” gives a good overview of RA and the SAS procedures available for various analyses. • SAS procedures, SAS Enterprise Guide, matrix programming language

  5. Regression Analysis: Process • State the purpose of the analysis: prediction, variable screening, model specification, parameter estimation (signs and significance), influence diagnostics. • Identify the type of regression analysis to be conducted and find appropriate tools. • Assess the quality of your data. • Fit the regression model. • Examine compliance with statistical assumptions, remedy violations where necessary, and assess quality of fit. • Draw conclusions.
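
The middle steps of this process (assess the data, then fit the model) might look like the sketch below. It is illustrative only: the slides name no dataset, so the SASHELP.CLASS sample table that ships with SAS and the choice of Weight as the response are assumptions.

```sas
/* Assess data quality before fitting: counts, missing values, ranges */
proc means data=sashelp.class n nmiss min max mean std;
   var Weight Height Age;
run;

/* Pairwise correlations give an early hint of collinear predictors */
proc corr data=sashelp.class;
   var Weight Height Age;
run;

/* Fit the regression model; checking assumptions and fit comes next */
proc reg data=sashelp.class;
   model Weight = Height Age;
run;
quit;
```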

  6. Diagnostics: testing for violation of assumptions • Analysis of residuals • Normality assumption (QQ- and PP-plots, added variable plots, partial residual plots, histograms, F tests for lack of fit, Durbin-Watson) • Heteroscedasticity (ACOV and SPEC options) • Outlier detection (How large is too large?) • Influence diagnostics (Cook’s distance, PRESS) • Model specification (leverage plots, Mallows’ Cp) • Non-linearity (scatter plots, partial residual plots) • Over- and under-specification • Multicollinearity tests (tol, vif, collin) • Autocorrelation (Durbin-Watson) • Random predictors (X’s measured with errors)
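
Several of the diagnostics on this slide can be requested in a single proc reg step. The sketch below is not from the presentation: the SASHELP.CLASS sample data, the output dataset and variable names (diag, cooksd, leverage and so on), and the 4/n screening cutoff for Cook's distance are all assumptions chosen for illustration. The cutoff is only a common rule of thumb.

```sas
proc reg data=sashelp.class;
   /* r              : residual analysis, including Cook's D           */
   /* influence      : hat diagonal, RSTUDENT, DFFITS, DFBETAS         */
   /* vif tol collin : multicollinearity diagnostics                   */
   /* spec           : specification test (heteroscedasticity)         */
   /* dw             : Durbin-Watson; meaningful only for ordered data */
   model Weight = Height Age / r influence vif tol collin spec dw;

   /* Save per-observation diagnostics for further screening */
   output out=diag
          p=predicted r=residual rstudent=rstudent
          h=leverage cookd=cooksd dffits=dffits press=press_resid;
run;
quit;

/* Normality check on the saved residuals: tests plus a Q-Q plot */
proc univariate data=diag normal;
   var residual;
   qqplot residual / normal(mu=est sigma=est);
run;

/* Keep observations above the assumed 4/n cutoff for Cook's D */
data flagged;
   set diag nobs=n;
   if cooksd > 4 / n;
run;

proc print data=flagged;
   var Name Weight Height Age residual rstudent leverage cooksd;
run;
```

Saving the statistics with the output statement, instead of only reading the printed tables, makes it possible to apply whatever answer one settles on for "how large is too large?" in a later data step.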

  7. Remedies to violation of assumptions • Variable selection process (stepwise, maxr, etc. in proc reg) • Variable transformation • Dummy variables • Box-Tidwell procedure • Not all functions are linearizable; in those cases non-linear regression must be used. • Polynomial regression (proc rsreg) • Weighted least squares (weight statement in proc reg) • Non-least squares regression • Failure of normality: Huber M-estimator (proc robustreg) • Principal components regression (proc pls, princomp) • Ridge regression (proc reg) • Partial least squares: random predictors • Proc pls • Non-linear regression • Proc nlmixed, proc nlin, proc model
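
A few of these remedies are sketched below in SAS. None of this comes from the slides: the SASHELP.CLASS sample data, the significance levels, the weight variable w, and the ridge grid are hypothetical values picked only to show the syntax.

```sas
/* Variable selection: stepwise, with illustrative entry/stay levels */
proc reg data=sashelp.class;
   model Weight = Height Age / selection=stepwise slentry=0.15 slstay=0.15;
run;
quit;

/* Weighted least squares: w is a hypothetical weight that assumes   */
/* the error variance grows with the square of Height                */
data class_w;
   set sashelp.class;
   w = 1 / Height**2;
run;

proc reg data=class_w;
   weight w;
   model Weight = Height Age;
run;
quit;

/* Ridge regression: estimates for each ridge constant go to OUTEST= */
proc reg data=sashelp.class outest=ridge_est outvif
         ridge=0 to 0.1 by 0.02;
   model Weight = Height Age;
run;
quit;

proc print data=ridge_est;
run;

/* Robust alternative to least squares: Huber M-estimation */
proc robustreg data=sashelp.class method=m(wf=huber);
   model Weight = Height Age;
run;
```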

  8. Functionality of Proc Reg in Linear Regression Analysis • Data modeling: by-group processing, where statement, multiple model statements • Interactive analysis: reweight, paint, plot statements, etc. • Diagnostic tools: plots, tests (outliers, normality, etc.) • Hypothesis testing: F and t tests, partitioning of variability • Automated variable selection procedures: stepwise regression, forward selection, backward elimination, maxr • Model validation: Mallows’ Cp graphs • Prediction: prediction intervals, PRESS residuals, etc.
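
The features listed on this slide can be combined roughly as follows. This is a sketch under assumptions, not the presenter's code: the SASHELP.CLASS sample data, the model labels Simple and Full, the Age filter, and the cutoff of 2 on the studentized deleted residual are illustrative choices.

```sas
/* BY-group processing needs input sorted by the BY variable */
proc sort data=sashelp.class out=class_sorted;
   by Sex;
run;

/* BY groups, a WHERE filter, and two labeled MODEL statements.     */
/* p, cli and clm request predictions with prediction and           */
/* confidence intervals.                                             */
proc reg data=class_sorted;
   by Sex;
   where Age >= 12;
   Simple: model Weight = Height;
   Full:   model Weight = Height Age / p cli clm;
run;
quit;

/* Interactive analysis (not available together with BY):           */
/* fit, exclude observations with large RSTUDENT, then refit.       */
proc reg data=sashelp.class;
   model Weight = Height Age / r;
run;
   reweight rstudent. > 2 or rstudent. < -2;   /* weight set to 0 */
   print;                                      /* refit and reprint */
run;
quit;

/* Model comparison by Mallows' Cp across candidate subsets */
proc reg data=sashelp.class;
   model Weight = Height Age / selection=cp;
run;
quit;
```

Because proc reg stays active until quit, the reweight and print statements operate on the model already fitted, which is what makes the analysis interactive.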

  9. Literature • Classical and Modern Regression with Applications, Raymond H. Myers (1986) • Applied Linear Regression, Sanford Weisberg (1985) • SAS Help examples

  10. Questions?
