STAT131/171 W4L2 Modelling Variation: Introduction to modelling and GOF

by Anne Porter, alp@uow.edu.au


Presentation Transcript


  1. STAT131/171 W4L2 Modelling Variation: Introduction to modelling and GOF, by Anne Porter, alp@uow.edu.au

  2. Activity: Let’s play beat the butcher
     • Morning radio, 6am-7am, weekdays
     • Contestant telephones in to play
     • Contestant has to say stop before the gong rings to win the meat
     • Radio personality reads the meat items: 2 slices of scotch fillet,…,3kg mince, until the gong is reached

  3. The list
     Let’s play: all stand, I’ll read, and you sit when you have enough meat. The last ones standing before the gong win.
     1) Three kilos scotch fillet
     2) 1 chicken
     3) 3 kilos of sausages
     4) 12 chicken kebabs
     5) 12 lamb kebabs
     6) 3 livers
     7) 1 kg bacon
     8) lamb chops
     9) 2kg salmon rissoles

  4. How might you increase your chances of winning? What information would be useful before you play again?
     • What is the maximum and minimum number of items ever read out?
     • What is the voice pattern over the gonged items?
     • What is the average number of items read out before the gong?
     • What is the frequency of gongs over time for each item?

  5. Frequency distribution of the number of items before the gong
     What is a more informative way of presenting the data so that we can choose the best point to stop?

  6. Relative frequency table
     What is a better way of presenting this information so it is easier to use?

  7. Cumulative frequency
     What will be the median number of items before the gong? It is the (n+1)/2 th value. With n = 100 games this is the 50.5th value, which is 8.
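
The relative frequency, cumulative frequency and median steps can be sketched in Python. This is only an illustration: the slide's actual table is not in the transcript, so the counts in `made_up_games` are invented, and the helper names `summarise`, `value_at_rank` and `median` are hypothetical.

```python
from itertools import accumulate


def summarise(freq):
    """freq maps 'items before the gong' -> number of games with that value."""
    xs = sorted(freq)
    n = sum(freq.values())
    # Relative frequency: share of games at each value.
    rel = {x: freq[x] / n for x in xs}
    # Cumulative frequency: number of games at or below each value.
    cum = dict(zip(xs, accumulate(freq[x] for x in xs)))
    return n, rel, cum


def value_at_rank(cum, rank):
    """Smallest x whose cumulative frequency reaches the 1-based rank."""
    for x, c in cum.items():
        if c >= rank:
            return x
    raise ValueError("rank exceeds number of observations")


def median(freq):
    """The (n+1)/2 th value; for even n, the average of the two middle values."""
    n, _, cum = summarise(freq)
    lower = value_at_rank(cum, (n + 1) // 2)
    upper = value_at_rank(cum, (n + 2) // 2)
    return (lower + upper) / 2


# Invented counts for illustration only (NOT the lecture's data).
made_up_games = {2: 1, 3: 2, 4: 4, 5: 3}
n, rel, cum = summarise(made_up_games)
print(n, rel[4], cum[4], median(made_up_games))
```

For n = 100 the two middle ranks are 50 and 51, which matches the "50.5th value" on the slide.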

  8. Frequency distribution of the number of items before the gong
     What is the average number of items read before the gong?

  9. What do we do to calculate the mean number of items before the gong?

  10. What do we do to calculate the mean number of items before the gong?
     Multiply each number of items by its frequency, add the products to get the total number of items before the gong, and divide by the number of games played.

  11. Calculate the mean

  12. Calculate the mean
     Mean = 784/100 = 7.84 items before the gong
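
Written as a formula, the calculation is a frequency-weighted mean. The symbols x (number of items) and f_x (number of games with that value) are notation assumed here; the totals 784 and 100 are the ones given on the slide.

```latex
\[
  \bar{x} \;=\; \frac{\sum_{x} x \, f_x}{\sum_{x} f_x}
          \;=\; \frac{784}{100}
          \;=\; 7.84
\]
```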

  13. Will your stopping strategy be the same for this set of data? Why not?

  14. Will your stopping strategy be the same for this set of data? Why not? For these values of x we have a much smaller spread

  15. In the long run what should be the probability of stopping at each number if stopping at random?

  16. P(X=x) and number expected for each item for the random stopping model
     Does it appear that the data fit the random stopping model? Why so?

  17. P(X=x) and number expected for each item for the random stopping model
     Does it appear that the data fit the random stopping model? Why so?
     The number expected differs from the number observed.
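
The slide's table is not in the transcript, so the following is a sketch under assumptions: that "stopping at random" means each of the g cells is equally likely, that g = 10 (the cell count used later for the degrees of freedom), and that n = 100 games were played. E_x is notation assumed here for the expected count in cell x.

```latex
\[
  P(X = x) = \frac{1}{g} = \frac{1}{10},
  \qquad
  E_x = n \, P(X = x) = 100 \times \frac{1}{10} = 10
\]
```

Under those assumptions, every cell would be expected to hold about 10 of the 100 games, so observed counts far from 10 suggest a poor fit.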

  18. Bar Chart: Compare observed & expected frequencies

  19. Measuring the difference between O and E
     How do we measure (compare, calculate) the difference between observed and expected?

  20. P(X=x) and number expected for each item for the random stopping model
     How might we calculate the difference between observed and expected? If the data fit, will this be big or small? Small.

  21. Calculating χ²

  22. Calculating χ²

  23. Calculating χ²

  24. Model Fit Using χ²
     • Calculate χ² = Σ (O - E)² / E
     • And see if it is too large for the data to be considered to fit the model

  25. Model Fit Informal: Is χ² too big?
     • If χ² > d + 2√(2d)
     • Where d = g - p - 1
     • g is the number of cells
     • p is the number of parameters estimated from the data
     Then there is evidence the data do not fit the model.
     For our example g = 10 cells, therefore d = 10 - 0 - 1 = 9 and d + 2√(2d) = 17.49.
     Decision: As χ² = 65.6 > 17.49, there is evidence that the data do not fit the random stopping model.

  26. Model Fit Formal
     Percentage points of the χ² distribution:

       df   α = 0.995   0.99     0.05     0.025    0.01     0.005
       1                         3.841    5.024    6.635    7.879
       9      1.735     2.088    16.919   19.023   21.666   23.589

     • Decision: If calculated χ² > critical value of χ² (tables), then there is evidence of lack of fit
     • α = 0.05 (typical, and the value we will use)
     • df = number of cells - number of estimated parameters - 1
     • df = 10 - 0 - 1 = 9

  27. Model Fit Formal
     Percentage points of the χ² distribution:

       df   α = 0.995   0.99     0.05     0.025    0.01     0.005
       1                         3.841    5.024    6.635    7.879
       9      1.735     2.088    16.919   19.023   21.666   23.589

     • Decision: As calculated χ² = 65.6 > critical value of 16.919 found in the tables, there is evidence of lack of fit between the data and the random stopping model.
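
The informal and formal checks can be put side by side in a short Python sketch. It assumes the statistic is the usual Pearson form, the sum of (O - E)²/E, and that all cells are equally likely under the random stopping model. The observed counts in `made_up_observed` are invented, since the lecture's table is not in the transcript, and the function name `chi_square_gof` is hypothetical. The critical value 16.919 is copied from the table above (α = 0.05, df = 9).

```python
import math

CRITICAL_005_DF9 = 16.919  # from the slide's table: alpha = 0.05, df = 9


def chi_square_gof(observed, probs, n_estimated=0):
    """Return (statistic, d, informal cutoff) for a goodness-of-fit check.

    observed    -- observed count in each cell
    probs       -- model probability of each cell (should sum to 1)
    n_estimated -- number of parameters estimated from the data (p)
    """
    n = sum(observed)
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    d = len(observed) - n_estimated - 1      # d = g - p - 1
    informal_cutoff = d + 2 * math.sqrt(2 * d)
    return stat, d, informal_cutoff


# Invented counts for 100 games over 10 cells (NOT the lecture's data).
made_up_observed = [4, 6, 8, 10, 12, 14, 16, 12, 10, 8]
uniform = [1 / 10] * 10  # random stopping: every cell equally likely

stat, d, cutoff = chi_square_gof(made_up_observed, uniform)
print(f"chi-square = {stat:.2f}, d = {d}, informal cutoff = {cutoff:.2f}")
print("informal: lack of fit?", stat > cutoff)
print("formal:   lack of fit?", stat > CRITICAL_005_DF9)
```

With d = 9 the informal cutoff works out to about 17.49, the figure quoted on the informal slide. The lecture's value of 65.6 exceeds both that cutoff and the tabled 16.919, whereas the invented counts here do not.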

  28. Lack of fit
     Looking at the table, we can see that most lack of fit occurs for items 2, 3, 8 and 9: lots of meat before the gong.

  29. Sampling Distributions
     • We will explore how these types of sampling distributions are generated in our lecture on sampling distributions
     • We will also explore how we choose a value of α
     • We will look at using the data to estimate parameters later

  30. Model fit approaches
     • Use a bar chart to compare observed and expected frequencies
     • Compare observed and expected frequencies
     • Calculate χ² and use it
       • Informally
       • Formally: this assumes that the expected count in each cell is at least 5. If not, combine cells. Other literature uses other rules, and there is a debate over this. (Check the Utts & Heckard (2004) definition.)

  31. Mean (expected value, E(X)) for the random stopping model

  32. Expected value for the random stopping model is? E(X)=6.5
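
As a hedged worked equation: the expected value of a discrete model is the probability-weighted sum of its values. If the random stopping model puts equal probability on consecutive whole numbers a, a+1, …, b (the endpoints a and b are notation assumed here, as the slide's table is not in the transcript), the sum reduces to the midpoint of the range.

```latex
\[
  E(X) \;=\; \sum_{x} x \, P(X = x)
       \;=\; \frac{1}{b - a + 1} \sum_{x = a}^{b} x
       \;=\; \frac{a + b}{2}
\]
```

The slide's value E(X) = 6.5 then corresponds to any equally likely range with a + b = 13.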

  33. Spread of the Population Model
     We will leave calculation of these till a little later, on a simpler example.

  34. What have we been doing?
     • We have been looking at the centre, spread, outliers and shape of samples of data
     • With a view to improving decision making
     • Why are we concerned with looking at models?

  35. Describing characteristics of Data
     We collect data on samples:
     • Time in seconds until two species of flies released together mate
     • The number of lost articles found in a large municipal office
     • The average carbohydrate content per 100 g serve in a sample of different species
     • The number of items of meat read before the gong

  36. Improving our decisions
     Looking at:
     • The shape of the distribution
     • Centre
     • Spread
     • Whether or not the data fit some model
     • May even look at outliers, points not fitting the model

  37. Describing Batches of Data
     • Comparing midterm marks from the different versions of the test
     • Are the papers completed in a similar manner?

  38. What we are really looking at is NOT
     • The mating behaviour of these particular flies
     • Past lost articles
     • Or last year’s exam papers
     • Or the last 100 games of beat the butcher
     We are interested in them because they may suggest a model for the characteristics of the data in general. This involves probability models. We shall continue to explore probability models in future lectures.
