Experimental Statistics - week 4

lara-wilson

Presentation Transcript


  1. Experimental Statistics - week 4 Chapter 8: 1-factor ANOVA models using SAS

  2. EXAM SCHEDULE:
     Exam I – Take-home exam (handed out Thursday, March 3, due 8:00 AM Tuesday, March 8)
     Exam II – Take-home exam (handed out Thursday, April 14, due 8:00 AM Tuesday, April 19)
     Final Exam – optional (scheduled for 8:00 AM – 11:00 AM Friday, May 6)
     GRADE COMPUTATION:
     Exam Grades (75%)
     Daily Assignments (25%)

  3. ANOVA Table Output - hostility data - calculations done in class

     Source             SS        df   MS        F      p-value
     Between samples    767.17     2   383.58    16.7   <.001
     Within samples     205.74     9    22.86
     Totals             972.91
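
The entries in the hostility table can be checked with the usual one-factor ANOVA arithmetic (mean square = sum of squares divided by its degrees of freedom, F = ratio of the two mean squares). The worked check below uses only the numbers on the slide; the subscripts B and W (between, within) are notation assumed here for illustration.

```latex
\begin{align*}
SS_{\text{Total}} &= SS_B + SS_W = 767.17 + 205.74 = 972.91 \\
MS_B &= \frac{SS_B}{df_B} = \frac{767.17}{2} \approx 383.58 \\
MS_W &= \frac{SS_W}{df_W} = \frac{205.74}{9} = 22.86 \\
F &= \frac{MS_B}{MS_W} = \frac{383.58}{22.86} \approx 16.78
\end{align*}
```

The ratio is about 16.78, which the table reports truncated as 16.7.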

  4. SPSS ANOVA Table for Hostility Data

  5. ANOVA Models Note: Example: Population has mean μ = 5. Consider the random sample

  6. For 1-factor ANOVA

  7. General Form of Model:
     Alternative form of the 1-Factor ANOVA Model (pages 394-395)
     - random errors follow a Normal (N) distribution, are independently distributed (ID), and have zero mean and constant variance, i.e. variability does not change from group to group
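
The model equations themselves did not survive in this transcript. As a hedged sketch, two common textbook ways of writing a one-factor ANOVA model are shown below. The symbols (y for the response, i for the group, j for the observation within a group, alpha for the group effect) are standard notation assumed here, not taken from the slide.

```latex
\begin{align*}
y_{ij} &= \mu_i + \varepsilon_{ij}
  && \text{(group-means form)} \\
y_{ij} &= \mu + \alpha_i + \varepsilon_{ij}
  && \text{(overall mean plus group effect)} \\
\varepsilon_{ij} &\sim \text{NID}(0, \sigma^2)
  && \text{(normal, independent, zero mean, constant variance)}
\end{align*}
```

The two forms describe the same model when each group mean is written as the overall mean plus that group's effect.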

  8. Analysis of Variance Table Recall: In our model:

  9. Introduction to SAS Programming Language

  10. Recall CAR DATA
     For this analysis, 5 gasoline types (A - E) were to be tested. Twenty cars were selected for testing and were assigned randomly to the groups (i.e. the gasoline types). Thus, in the analysis, each gasoline type was tested on 4 cars. A performance-based octane reading was obtained for each car, and the question is whether the gasolines differ with respect to this octane reading.

       A   91.7   91.2   90.9   90.6
       B   91.7   91.9   90.9   90.9
       C   92.4   91.2   91.6   91.0
       D   91.8   92.2   92.0   91.4
       E   93.1   92.9   92.4   92.4

  11. The CAR data set as SAS needs to see it:

       A 91.7
       A 91.2
       A 90.9
       A 90.6
       B 91.7
       B 91.9
       B 90.9
       B 90.9
       C 92.4
       C 91.2
       C 91.6
       C 91.0
       D 91.8
       D 92.2
       D 92.0
       D 91.4
       E 93.1
       E 92.9
       E 92.4
       E 92.4

  12. SAS file for CAR data
     Case 1: Data within SAS file:

       DATA one;
       INPUT gas $ octane;
       DATALINES;
       A 91.7
       A 91.2
       . . .
       E 92.4
       E 92.4
       ;
       PROC GLM;    * PROC ANOVA could be used instead;
       CLASS gas;
       MODEL octane=gas;
       TITLE 'Gasoline Example - Completely Randomized Design';
       MEANS gas / duncan;
       RUN;
       PROC MEANS mean var;
       RUN;
       PROC MEANS mean var;
       CLASS gas;
       RUN;

  13. Brief Discussion of Components of the SAS File: DATA Step
     DATA STATEMENT - the first DATA statement names the data set whose variables are defined in the INPUT statement -- in the above, we create data set 'one'
     INPUT STATEMENT - 2 forms
       1. Freefield - can be used when data values are separated by 1 or more blanks
            INPUT NAME $ AGE SEX $ SCORE;    ($ indicates character variable)
       2. Formatted - data occur in fixed columns
            INPUT NAME $ 1-20 AGE 22-24 SEX $ 26 SCORE 28-30;
     DATALINES STATEMENT - used to indicate that the next records in the file contain the actual data; the semicolon after the data indicates the end of the data itself
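
The two INPUT forms described above can be sketched as follows. The data set names (free, fixed) and the two records are hypothetical, made up for illustration. In the second DATA step the values must sit in exactly the columns named in the INPUT statement, so the data lines are kept flush left.

```sas
* Freefield input - values separated by one or more blanks;
DATA free;
  INPUT name $ age sex $ score;
  DATALINES;
Ann 21 F 88
Bob 23 M 75
;
RUN;

* Formatted input - name in columns 1-20, age in 22-24, sex in 26, score in 28-30;
DATA fixed;
  INPUT name $ 1-20 age 22-24 sex $ 26 score 28-30;
  DATALINES;
Ann Smith             21 F  88
Bob Jones             23 M  75
;
RUN;

* Print both data sets to confirm the values were read as intended;
PROC PRINT DATA=free;
RUN;
PROC PRINT DATA=fixed;
RUN;
```

The formatted form can read a name that contains a blank (such as a first and last name), which the freefield form would split into two values.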

  14. SPECIFYING THE ANALYSIS -- PROC STATEMENTS
     GENERAL FORM
       PROC xxxxx;    implies procedure is to be run on most recently created data set
       PROC xxxxx DATA = data set name;
     Note: I did not have to specify DATA=one in the above example
     Example PROCs:
       PROC REG - regression analysis
       PROC ANOVA - analysis of variance
       PROC GLM - general linear model
       PROC MEANS - basic statistics, t-test for H0: μ = 0
       PROC PLOT - plotting
       PROC TTEST - t-tests
       PROC UNIVARIATE - descriptive stats, box-plots, etc.
       PROC BOXPLOT - boxplots
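
A minimal sketch of the difference between the two general forms, assuming two small data sets built from a few of the CAR readings (the second data set name, two, is hypothetical):

```sas
* Data set one holds readings for gasolines A and B;
DATA one;
  INPUT gas $ octane;
  DATALINES;
A 91.7
A 91.2
B 91.7
B 91.9
;
RUN;

* Data set two holds readings for gasoline E;
DATA two;
  INPUT gas $ octane;
  DATALINES;
E 93.1
E 92.9
;
RUN;

* No DATA= option, so the procedure runs on the most recently created data set, two;
PROC MEANS mean var;
RUN;

* DATA= option names the data set explicitly, so this run uses data set one;
PROC MEANS DATA=one mean var;
RUN;
```

Naming the data set with DATA= is the safer habit once a program creates more than one data set.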

  15. PROC GLM
     • Proc GLM data = fn ;
     • Class … ;
       • List all the factors.
     • Model … / options;
       • e.g., model octane = gas;
     • Means … / options;
     • Run;

  16. SAS Syntax
     • Every command MUST end with a semicolon
     • Commands can continue over two or more lines
     • Variable names are 1-8 characters (letters and numerals, beginning with a letter or underscore), with no blanks or special characters; SAS Version 7 and later allow up to 32 characters
     • Note: values for character variables can exceed 8 characters
     • Comments
       • Begin with *, end with ;
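
The syntax rules above can be sketched in a few lines. The data come from the CAR example; the variable name _run1 is hypothetical and is there only to show a name that begins with an underscore.

```sas
* A comment begins with an asterisk and ends with a semicolon;
DATA one;
  * The INPUT statement below continues over two lines and ends at the semicolon;
  INPUT gas $
        octane;
  * A variable name may begin with a letter or an underscore;
  _run1 = 1;
  DATALINES;
A 91.7
B 91.9
;
RUN;

PROC PRINT
  DATA=one;
RUN;
```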

  17. Titles and Labels
     • TITLE '…' ;
       • Up to 10 title lines: TITLE 'include your title here';
       • Can be placed in Data Steps or Procs
     • LABEL name = '…' ;
       • Can be in a DATA STEP or PROC PRINT
       • Include ALL labels, then a single ;
     Note: For class assignments, place descriptive titles and labels on the output. Print the data to the output file.
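
A minimal sketch of titles and labels on printed output, using a few of the CAR readings. The title and label wording is made up for illustration. The LABEL option on PROC PRINT asks it to show labels in place of variable names.

```sas
DATA one;
  INPUT gas $ octane;
  * All labels are listed first, followed by a single semicolon;
  LABEL gas    = 'Gasoline type'
        octane = 'Octane reading';
  DATALINES;
A 91.7
A 91.2
B 91.7
B 91.9
;
RUN;

* TITLE sets the first title line and TITLE2 sets the second;
TITLE  'Gasoline Example';
TITLE2 'Listing of the data';
PROC PRINT DATA=one LABEL;
RUN;
```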

  18. Case 2: Data in an External File
     General form:
       FILENAME f1 'complete directory/file specification';
     Example:
       FILENAME f1 'a:car.data';
       DATA one;
       INFILE f1;
       INPUT gas $ octane;
       PROC GLM;    * PROC ANOVA could be used instead;
       CLASS gas;
       MODEL octane=gas;
       TITLE 'Gasoline Example - Completely Randomized Design';
       RUN;
       PROC MEANS mean var;
       RUN;
       PROC MEANS mean var;
       CLASS gas;
       RUN;

  19. The SAS Output for CAR data:

     Gasoline Example - Completely Randomized Design
     General Linear Models Procedure
     Dependent Variable: OCTANE

                                 Sum of          Mean
     Source             DF      Squares        Square    F Value    Pr > F
     Model               4   6.10800000    1.52700000       6.80    0.0025
     Error              15   3.37000000    0.22466667
     Corrected Total    19   9.47800000

     R-Square        C.V.     Root MSE    OCTANE Mean
     0.644440    0.516836    0.4739902      91.710000

     Source    DF     Type I SS    Mean Square    F Value    Pr > F
     GAS        4    6.10800000     1.52700000       6.80    0.0025

     Source    DF   Type III SS    Mean Square    F Value    Pr > F
     GAS        4    6.10800000     1.52700000       6.80    0.0025
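
The summary statistics in this output can be reproduced from the ANOVA table entries. The worked check below uses only the printed values; the formulas are the standard definitions of these quantities, and the subscripted names are notation assumed here.

```latex
\begin{align*}
F &= \frac{MS_{\text{Model}}}{MS_{\text{Error}}}
   = \frac{1.52700000}{0.22466667} \approx 6.80 \\
R^2 &= \frac{SS_{\text{Model}}}{SS_{\text{Total}}}
   = \frac{6.108}{9.478} \approx 0.6444 \\
\text{Root MSE} &= \sqrt{MS_{\text{Error}}}
   = \sqrt{0.22466667} \approx 0.4740 \\
\text{C.V.} &= 100 \times \frac{\text{Root MSE}}{\text{OCTANE Mean}}
   = 100 \times \frac{0.4740}{91.71} \approx 0.5168
\end{align*}
```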

  20. Text Format for ANOVA Table Output - car data

     Source             SS       df   MS       F      p-value
     Between samples    6.108     4   1.527    6.80   0.0025
     Within samples     3.370    15   0.225
     Totals             9.478    19

  21. PC SAS on Campus
     Library
     BIC
     Student Center
     SAS Learning Edition $125
     http://support.sas.com/rnd/le/index.html

  22. “Lab” Assignment
     Using CAR Data, run the following in this order with one set of code:
     1. Calculate the average, standard deviation, minimum, and maximum for the 20 octane readings. CS pp. 25 - 32
     2. Graph a histogram of OCTANE. CS pp. 37
     3. Calculate descriptive statistics in (1) above for OCTANE for each of the 5 gasolines. CS pp. 32-34
     A and B. CS pp. 138-141
     5. Plot side-by-side box plots for OCTANE for the 5 levels of the variable GAS
     6. Compute a 1-factor ANOVA for the CAR data using only the first 3 GAS types. CS pp.150-155
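
One possible way to lay out items 1, 2, 3, 5, and 6 in a single program is sketched below. This is not the assigned solution. The data set name car, the titles, and the choice of procedures (the HISTOGRAM statement in PROC UNIVARIATE, PROC BOXPLOT, a WHERE statement to keep the first three gasolines) are assumptions, and the CS text may use different procedures.

```sas
DATA car;
  INPUT gas $ octane;
  DATALINES;
A 91.7
A 91.2
A 90.9
A 90.6
B 91.7
B 91.9
B 90.9
B 90.9
C 92.4
C 91.2
C 91.6
C 91.0
D 91.8
D 92.2
D 92.0
D 91.4
E 93.1
E 92.9
E 92.4
E 92.4
;
RUN;

* Item 1 - overall mean, standard deviation, minimum, and maximum;
TITLE 'CAR data - descriptive statistics for all 20 readings';
PROC MEANS DATA=car mean std min max;
  VAR octane;
RUN;

* Item 2 - histogram of OCTANE, with the printed tables suppressed;
TITLE 'CAR data - histogram of OCTANE';
PROC UNIVARIATE DATA=car NOPRINT;
  VAR octane;
  HISTOGRAM octane;
RUN;

* Item 3 - the same statistics for each gasoline;
TITLE 'CAR data - descriptive statistics by gasoline';
PROC MEANS DATA=car mean std min max;
  CLASS gas;
  VAR octane;
RUN;

* Item 5 - side-by-side box plots, sorting first because PROC BOXPLOT expects grouped data;
PROC SORT DATA=car;
  BY gas;
RUN;
TITLE 'CAR data - box plots of OCTANE by gasoline';
PROC BOXPLOT DATA=car;
  PLOT octane*gas;
RUN;

* Item 6 - 1-factor ANOVA restricted to gasolines A, B, and C;
TITLE 'CAR data - 1-factor ANOVA for gasolines A, B, and C';
PROC GLM DATA=car;
  WHERE gas IN ('A', 'B', 'C');
  CLASS gas;
  MODEL octane = gas;
RUN;
QUIT;
```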
