Life after linear regression: A survey of Penn State applied statistics graduate courses
The courses • Stat 500: Applied Statistics • Stat 501: Regression Methods • Stat 502: Analysis of Variance & Design of Expts • Stat 503: Design of Experiments • Stat 504: Analysis of Discrete Data • Stat 505: Applied Multivariate Statistical Analysis • Stat 506: Sampling Theory and Methods • Stat 509: Biostatistical Methods • Stat 510: Applied Time Series Analysis
Stat 500: Applied Statistics • Topics covered: • Descriptive statistics • Hypothesis testing and power • Estimation and confidence intervals • Regression • One- and two-way ANOVA • Chi-square tests • Prerequisites • 2 credits of algebra
Stat 501: Regression Methods • Topics covered: • Analysis of research data through simple and multiple regression and correlation • Polynomial models • Indicator variables • Stepwise and piecewise regression • Logistic regression • Prerequisites • 6 credits of statistics or Stat 500; matrix algebra
Stat 502: Analysis of Variance and Design of Experiments • Analysis of data when: • the response y is continuous • the predictors (called factors or treatments) are all qualitative • the error assumptions are the same as for regression • Do the means differ among the groups defined by the factor combinations?
Stat 502: Analysis of Variance and Design of Experiments • Topics covered: • Analysis of variance and design concepts • Factorial, nested and unbalanced data • Analysis of covariance • Blocked designs • Latin-square, split-plot, repeated measures designs • Multiple comparisons • Prerequisites • Stat 501 (or undergraduate version Stat 462)
A Stat 502 Example: Intertidal Seaweed Grazers • To study the influence of ocean grazers on regeneration rates of seaweed in the intertidal zone, a researcher scraped square rock plots free of seaweed and observed the seaweed regeneration when certain types of seaweed-grazing animals were denied access. • Research questions: • Which grazer consumes the most seaweed? • Do different grazers influence each other's impact? • Are grazing effects similar in all microhabitats?
A Stat 502 Example: Intertidal Seaweed Grazers • The grazers were limpets (L), small fishes (f), and large fishes (F): • LfF: all three grazers were allowed access • fF: limpets were excluded using caustic paint • Lf: large fish were excluded using coarse net • f: limpets and large fish were excluded • L: small and large fish were excluded using fine net • C: the control group, all excluded
A Stat 502 Example: Intertidal Seaweed Grazers • The intertidal zone is a highly variable environment. The researcher applied treatments in 8 blocks of 12 plots each: • #1: Just below high tide, exposed to heavy surf • #2: Just below high tide, protected from surf • #3: Midtide, exposed • #4: Midtide, protected • #5: Just above low tide level, exposed • #6: Just above low tide level, protected • #7: On near-vertical rock wall, midtide, protected • #8: On near-vertical rock wall, above low tide, protected
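The blocked layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two replicate plots per treatment per block is inferred from 12 plots and 6 treatments, and the function name `randomize_blocks` and the fixed seed are hypothetical, not part of the original study.

```python
import random

# Treatment labels from the slides. Two replicate plots per treatment per
# block is an inference from "8 blocks of 12 plots" (12 plots / 6 treatments).
TREATMENTS = ["LfF", "fF", "Lf", "f", "L", "C"]
N_BLOCKS = 8
REPLICATES = 2

def randomize_blocks(seed=0):
    """Return {block number: treatments in randomized plot order}."""
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    layout = {}
    for block in range(1, N_BLOCKS + 1):
        plots = TREATMENTS * REPLICATES  # 12 plots per block
        rng.shuffle(plots)               # randomize within the block only
        layout[block] = plots
    return layout

layout = randomize_blocks()
for block, plots in layout.items():
    print(block, plots)
```

Randomizing within each block keeps every treatment represented in every microhabitat, so block-to-block differences do not get mixed up with treatment differences.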
A Stat 502 Example: Percent of regenerated seaweed on intertidal plots with some grazers excluded
Stat 503: Design of Experiments • The key word is “experiments” • When you can control the values of your predictors (factors), you should ensure you can answer your research question by: • Collecting the appropriate measurements • Setting the values of your factors appropriately • Reducing extraneous variation by “blocking” • Having an appropriate sample size
Stat 503: Design of Experiments • Topics covered: • Design principles • Optimality • Confounding in split-plot designs • Repeated measures designs, fractional factorial designs, response surface designs • Balanced/partially balanced incomplete block designs • Prerequisites: • Stat 501 (or undergraduate Stat 462) • Stat 502
A Stat 503 Example: The BARGE Study • Current standard treatment for patients with mild to moderate asthma is scheduled daily use of inhaled albuterol. • It is now hypothesized that such regular use has a negative effect on lung function in patients with the B16Arg/Arg genotype, but not in those with the B16Gly/Gly genotype.
A Stat 503 Example: The BARGE Study • The BARGE Study compares the regular use of inhaled albuterol (A) to placebo (P) in patients with the B16Arg/Arg genotype (R) and in patients with the B16Gly/Gly genotype (G). • The primary hypothesis concerns inference about whether (μRA - μRP) - (μGA - μGP) is 0, where μRA denotes the mean response of R patients on albuterol, and so on.
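The primary hypothesis is a statement about an interaction contrast: the albuterol-versus-placebo effect in one genotype minus the same effect in the other. A minimal Python sketch of the arithmetic follows. The function name `interaction_contrast` is hypothetical and the cell means are invented placeholders, not BARGE results.

```python
def interaction_contrast(mu_ra, mu_rp, mu_ga, mu_gp):
    """(mu_RA - mu_RP) - (mu_GA - mu_GP): difference of the two treatment effects."""
    effect_in_r = mu_ra - mu_rp  # albuterol vs. placebo, Arg/Arg patients
    effect_in_g = mu_ga - mu_gp  # albuterol vs. placebo, Gly/Gly patients
    return effect_in_r - effect_in_g

# Placeholder cell means, invented for illustration only (not BARGE results).
# Same treatment effect in both genotypes: the contrast is zero.
print(interaction_contrast(mu_ra=10.0, mu_rp=12.0, mu_ga=11.0, mu_gp=13.0))  # 0.0
# Effect present only in the R genotype: the contrast is nonzero.
print(interaction_contrast(mu_ra=8.0, mu_rp=12.0, mu_ga=13.0, mu_gp=13.0))   # -4.0
```

A contrast of 0 means the treatment effect does not depend on genotype. In practice the contrast would be estimated from sample means and judged against its standard error, which depends on the study design.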
Stat 504: Analysis of Discrete Data • Analysis of data when: • the response y is binary or discrete • the predictors are qualitative or quantitative • Summarized data are frequency counts • How do the predictors affect the response?
Stat 504: Analysis of Discrete Data • Topics covered: • Models for frequency arrays • Goodness-of-fit tests • Two-, three- and higher-way tables • Latent models • Logistic and Poisson regression models • Prerequisites • Stat 502 (or undergraduate Stat 460 or major Stat 512) • Matrix algebra
A Stat 504 Example: Survival in the Donner Party • In 1846, the Donner and Reed families traveled from Illinois to California by covered wagon. • The group became stranded in the eastern Sierra Nevada mountains when hit by heavy snow. • 40 of the 87 members died from famine and exposure; 45 of the members were adults over age 15. • Are females better able to withstand harsh conditions than are males?
A Stat 504 Example: Survival in the Donner Party

Link Function: Logit

Response Information
Variable  Value     Count
STATUS    SURVIVED     20  (Event)
          DIED         25
          Total        45

Logistic Regression Table
                                               Odds     95% CI
Predictor      Coef   SE Coef      Z      P   Ratio  Lower  Upper
Constant      1.633     1.110   1.47  0.141
AGE        -0.07820   0.03729  -2.10  0.036    0.92   0.86   0.99
Gender       1.5973    0.7555   2.11  0.034    4.94   1.12  21.72
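The Z, odds ratio, and confidence interval columns in the output above follow from the coefficient and its standard error. The Python sketch below shows the usual Wald-type calculation: Z = Coef / SE, odds ratio = exp(Coef), and interval = exp(Coef ± 1.96 × SE). The function name `odds_ratio_summary` and the rounded multiplier 1.96 are assumptions of this sketch; the software's exact internals are not stated in the slides.

```python
import math

Z_975 = 1.96  # approximate 97.5th percentile of the standard normal

def odds_ratio_summary(coef, se):
    """Return (z statistic, odds ratio, CI lower, CI upper) for one coefficient."""
    z = coef / se
    odds_ratio = math.exp(coef)
    lower = math.exp(coef - Z_975 * se)
    upper = math.exp(coef + Z_975 * se)
    return z, odds_ratio, lower, upper

# Coefficients and standard errors copied from the output above.
estimates = {"AGE": (-0.07820, 0.03729), "Gender": (1.5973, 0.7555)}
for name, (coef, se) in estimates.items():
    z, ratio, lower, upper = odds_ratio_summary(coef, se)
    print(f"{name}: Z={z:.2f} OR={ratio:.2f} CI=({lower:.2f}, {upper:.2f})")
```

To two decimals this matches the printed table. Read this way, each additional year of age multiplies the odds of survival by about 0.92, and the odds for one gender are about 4.94 times those for the other. The slides do not say which gender is the reference level.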
Stat 505: Applied Multivariate Statistical Analysis • Analysis of data when you have several correlated, continuous responses is called multivariate data analysis. • A repeated measure is a special kind of multivariate response obtained by measuring the same variable on each subject several times, possibly under different conditions.
Stat 505: Applied Multivariate Statistical Analysis • Topics covered: • Multivariate data: matrix review, graphical displays, probability theory, multivariate normal distribution, partial correlations • Inferences about multivariate means: Hotelling’s T² tests, multivariate analysis of variance, repeated measures experiments and growth curves, discriminant analysis • Data reduction: Principal components, factor analysis, canonical correlation analysis, cluster analysis • Structural equation modeling • Prerequisites: • 6 credits in statistics • Matrix algebra
A Stat 505 Example: Pottery Data • Pottery samples were collected from four sites in the British Isles: Llanedyrn, Caldicot, Isle Thorns, and Ashley Rails. • Each piece was analyzed for its aluminum, iron, magnesium, calcium, and sodium content. • Do the pottery samples from the four sites differ with respect to their composition?
Stat 506: Sampling Theory and Methods • Topics covered: • Basic methods: simple random sampling, selecting sample sizes, unequal probability sampling, ratio and regression estimation, stratified sampling, cluster and systematic sampling, multistage designs, double sampling • Special topics: sampling hidden human populations, environmental sampling, sampling to study cause-and-effect relationships, resampling of data, measurement errors and nonresponse in surveys, adaptive sampling, network and snowball sampling • Prerequisites: • Calculus • 3 credits in statistics
A Stat 506 Example: A Water Pollution Survey • Study region of interest has 320 lakes. • Take a random sample of the lakes by: • Drawing a rectangle of length l and width w around the study region. • Generating pairs of (0,1) random numbers. Multiply the first number by l and the second by w to get random location coordinates within the rectangle. • If the location is a lake, then that lake is selected. • Continue until the required number of lakes is selected.
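The selection procedure above can be sketched in Python. Several details are assumptions of this sketch rather than part of the survey: lakes are modeled as circles given by (x, y, radius), the five-lake map is invented, a lake hit more than once is counted only once, and the function name `sample_lakes` is hypothetical.

```python
import random

def sample_lakes(lakes, length, width, n_required, seed=0):
    """Draw random points in a length-by-width rectangle until n_required lakes are hit.

    `lakes` maps a lake id to (x, y, radius); circular lakes are a
    simplifying assumption for this sketch, not part of the survey.
    """
    if n_required > len(lakes):
        raise ValueError("cannot select more lakes than exist")
    rng = random.Random(seed)
    selected = []
    while len(selected) < n_required:
        # A pair of (0,1) random numbers scaled to the rectangle.
        x = rng.random() * length
        y = rng.random() * width
        for lake_id, (cx, cy, r) in lakes.items():
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2
            if inside and lake_id not in selected:  # assumption: keep distinct lakes
                selected.append(lake_id)
                break
    return selected

# Invented five-lake map in a 10 x 10 rectangle, for illustration only.
lakes = {"A": (2.0, 2.0, 1.0), "B": (7.0, 3.0, 1.5), "C": (5.0, 8.0, 0.5),
         "D": (8.5, 8.5, 1.0), "E": (2.0, 7.0, 1.2)}
print(sample_lakes(lakes, 10.0, 10.0, 3))
```

Because a random point is more likely to land in a large lake than a small one, this scheme selects lakes with probability related to their area, which makes it an instance of unequal probability sampling rather than simple random sampling.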
Stat 509: Biostatistical Methods • Topics covered: • An introduction to the design and statistical analysis of randomized and observational studies in biomedical research • Prerequisites: • Stat 500
Stat 510: Applied Time Series Analysis • Topics covered: • Identification of models for empirical data collected over time • Use of models in forecasting • Prerequisites: • Stat 501 (or undergraduate Stat 462 or major Stat 511)
A Stat 510 Example: Measuring Global Warming • Temperature (in degrees Celsius) averaged for the northern hemisphere over a full year. • Temperature series collected from 1880 to 1987. • All measurements expressed as differences from their 108-year mean. • Research questions: • Is the mean temperature increasing over the 108 years? • What is the rate of increase in global temperature over the past century?
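Two steps in this example are simple arithmetic: expressing each yearly value as a difference from the series mean, and estimating a rate of increase as a least-squares slope. The Python sketch below shows both. The temperature values are a synthetic straight line, not the real data, and the helper names `anomalies` and `ols_slope` are hypothetical.

```python
def anomalies(values):
    """Express each value as a difference from the series mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

years = list(range(1880, 1988))  # 1880..1987, 108 annual values
# Synthetic placeholder series (a straight line), NOT the real temperatures.
temps = [14.0 + 0.005 * (yr - 1880) for yr in years]
dev = anomalies(temps)
print(len(years), ols_slope(years, dev))
```

Ordinary least squares treats the errors as independent, which yearly temperatures may not satisfy. Modeling that dependence over time is the kind of question a time series course addresses.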