
Day 1


jada

Presentation Transcript


  1. Day 1 • designing randomized surveys • experimental design

  2. Sampling Design

  3. How do we gather data? • Surveys: opinion polls, interviews • Observational studies: retrospective (past), prospective (future) • Experiments

  4. Sampling frame • a list of every individual in the population

  5. Simple Random Sample (SRS) • consists of n individuals from the population chosen in such a way that every individual has an equal chance of being selected • every set of n individuals has an equal chance of being selected. Suppose we were to take an SRS of 100 MHS students: put each student’s name in a hat, then randomly select 100 names from the hat. Not only does each student have the same chance of being selected, but every possible group of 100 students has the same chance of being selected! Therefore, it has to be possible for all 100 students to be seniors in order for it to be an SRS!
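
The names-in-a-hat procedure can be sketched in Python. This is only an illustration: the student IDs are made up, the frame size of 2000 is an assumption, and the seed is fixed purely so the example is repeatable.

```python
import random

# Hypothetical sampling frame: 2000 made-up student IDs standing in for the names in the hat.
rng = random.Random(1)  # seeded only so the sketch is repeatable
sampling_frame = [f"student_{i}" for i in range(1, 2001)]

# random.sample draws without replacement, so every student and every
# possible group of 100 students is equally likely to be chosen.
srs = rng.sample(sampling_frame, 100)
print(srs[:5])
```

Because the draw is made from the whole frame at once, nothing prevents an all-senior sample; it is just very unlikely.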

  6. Stratified random sample • population is divided into homogeneous groups called strata • SRSs are pulled from each stratum. Homogeneous groups are groups that are alike based upon some characteristic of the group members. Suppose we were to take a stratified random sample of 100 MHS students. Since students are already divided by grade level, grade level can be our strata. Then randomly select 50 seniors and randomly select 50 juniors.
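
A minimal sketch of the stratified draw, assuming two strata (juniors and seniors) with 500 made-up IDs each; the stratum sizes and names are hypothetical, only the "50 from each stratum" rule comes from the slide.

```python
import random

rng = random.Random(2)  # seeded only so the sketch is repeatable

# Hypothetical strata: the population is already split by grade level.
strata = {
    "junior": [f"junior_{i}" for i in range(1, 501)],
    "senior": [f"senior_{i}" for i in range(1, 501)],
}

# Take a separate SRS of 50 from each stratum.
per_stratum = 50
stratified_sample = {
    grade: rng.sample(members, per_stratum)
    for grade, members in strata.items()
}
print({grade: len(chosen) for grade, chosen in stratified_sample.items()})
```

Unlike an SRS of 100, this design guarantees exactly 50 students from each grade level.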

  7. Systematic random sample • select sample by following a systematic approach • randomly select where to begin. Suppose we want to do a systematic random sample of MHS students. Number a list of students. (There are approximately 2000 students; if we want a sample of 100, 2000/100 = 20.) Select a number between 1 and 20 at random. That student will be the first student chosen; then choose every 20th student from there.
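
The arithmetic above (2000/100 = 20, random start, then every 20th student) can be sketched as follows. Students are represented only by their position on the numbered list, which is an assumption made for illustration.

```python
import random

rng = random.Random(3)  # seeded only so the sketch is repeatable

population_size = 2000   # approximate number of students on the numbered list
sample_size = 100
k = population_size // sample_size   # sampling interval: 2000 / 100 = 20

# Pick the starting position at random from 1..20 (randint includes both ends),
# then take every 20th position after it.
start = rng.randint(1, k)
systematic_sample = list(range(start, population_size + 1, k))
print(start, systematic_sample[:5], len(systematic_sample))
```

Only the starting point is random; once it is chosen, the rest of the sample is determined.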

  8. Cluster sample • based upon location • randomly pick a location & sample all there. Suppose we want to do a cluster sample of SWH students. One way to do this would be to randomly select 10 classrooms during 2nd period and sample all students in those rooms!
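
A rough sketch of the classroom example. The number of classrooms (80) and students per room (25) are invented for illustration; only "randomly select 10 classrooms and sample everyone in them" comes from the slide.

```python
import random

rng = random.Random(4)  # seeded only so the sketch is repeatable

# Hypothetical 2nd-period classrooms: 80 rooms with 25 made-up students each.
classrooms = {
    f"room_{r}": [f"room_{r}_student_{s}" for s in range(1, 26)]
    for r in range(1, 81)
}

# Randomly choose 10 clusters (rooms), then take EVERY student in those rooms.
chosen_rooms = rng.sample(sorted(classrooms), 10)
cluster_sample = [student for room in chosen_rooms for student in classrooms[room]]
print(chosen_rooms[:3], len(cluster_sample))
```

Note that only a list of rooms is needed here, not a list of every student.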

  9. SRS • Advantages: unbiased; easy • Disadvantages: large variance; may not be representative; must have sampling frame (list of population)

  10. Stratified • Advantages: more precise unbiased estimator than SRS; less variability; cost reduced if strata already exist • Disadvantages: difficult to do if you must divide the population into strata yourself; formulas for SD & confidence intervals are more complicated; need sampling frame

  11. Systematic random sample • Advantages: unbiased; ensures that the sample is distributed across the population; more efficient, cheaper, etc. • Disadvantages: large variance; can be confounded by trend or cycle; formulas are complicated

  12. Cluster samples • Advantages: unbiased; cost is reduced; sampling frame may not be available (not needed) • Disadvantages: clusters may not be representative of population; formulas are complicated

  13. Identify the sampling design 1) The Educational Testing Service (ETS) needed a sample of colleges. ETS first divided all colleges into groups of similar types (small public, small private, etc.). Then they randomly selected 3 colleges from each group. Answer: stratified random sample

  14. Identify the sampling design 2) A county commissioner wants to survey people in her district to determine their opinions on a particular law up for adoption. She decides to randomly select blocks in her district and then survey all who live on those blocks. Answer: cluster sampling

  15. Identify the sampling design 3) A local restaurant manager wants to survey customers about the service they receive. Each night the manager randomly chooses a number between 1 & 10. He then gives a survey to that customer, and to every 10th customer after them, to fill out before they leave. Answer: systematic random sampling

  16. Random digit table • each entry is equally likely to be any of the 10 digits • digits are independent of each other • numbers can be read across, vertically, or diagonally. The following is part of the random digit table found on page 847 of your textbook: Row 1 4 5 1 8 5 0 3 3 7 1 2 4 2 5 5 8 0 4 5 7 0 3 8 9 9 3 4 3 5 0 6 3

  17. Suppose your population consisted of these 20 people: 1) Aidan 2) Bob 3) Chico 4) Doug 5) Edward 6) Fred 7) Gloria 8) Hannah 9) Israel 10) Jung 11) Kathy 12) Lori 13) Matthew 14) Nan 15) Opus 16) Paul 17) Shawnie 18) Tracy 19) Uncle Sam 20) Vernon. Use the following random digits to select a sample of five from these people. We will need to use double digit random numbers, ignoring any number greater than 20. Start with Row 1 and read across. Row 1 4 5 1 8 0 5 1 3 7 1 2 0 1 5 5 8 0 1 5 7 0 3 8 9 9 3 4 3 5 0 6 3 Stop when five people are selected. Selected: 1) Aidan, 13) Matthew, 18) Tracy, 15) Opus, 5) Edward. So my sample would consist of: Aidan, Edward, Matthew, Opus, and Tracy
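
The reading rule (take two digits at a time, skip anything that is not a label from 01 to 20, stop at five people) can be sketched in Python. The digit string below is a made-up illustration, not the textbook row, and the helper name `select_from_digits` is hypothetical; skipping repeated labels is an added assumption so that nobody is chosen twice.

```python
# Labels 1-20 for the population on the slide.
people = {
    1: "Aidan", 2: "Bob", 3: "Chico", 4: "Doug", 5: "Edward",
    6: "Fred", 7: "Gloria", 8: "Hannah", 9: "Israel", 10: "Jung",
    11: "Kathy", 12: "Lori", 13: "Matthew", 14: "Nan", 15: "Opus",
    16: "Paul", 17: "Shawnie", 18: "Tracy", 19: "Uncle Sam", 20: "Vernon",
}

# Made-up row of random digits, used only for illustration.
digits = "4518051371011558"

def select_from_digits(digits, population, sample_size):
    """Read two-digit labels left to right, skipping unusable ones."""
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        # Ignore labels outside the population (00, 21-99) and repeats.
        if label in population and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return [population[label] for label in chosen]

sample = select_from_digits(digits, people, 5)
print(sample)
```

With this illustrative row the pairs 45 and 71 are skipped, and reading stops as soon as the fifth person is found.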

  18. Bias • ERROR • favors certain outcomes. Anything that causes the data to be wrong! It might be attributed to the researchers, the respondent, or the sampling method!

  19. Sources of Bias • things that can cause bias in your sample • cannot do anything with bad data

  20. Voluntary response • People choose to respond • Usually only people with very strong opinions respond. An example would be the surveys in magazines that ask readers to mail in the survey. Other examples are call-in shows, American Idol, etc. Remember, the respondents select themselves to participate in the survey! The way to identify voluntary response is self-selection!

  21. Convenience sampling • Ask people who are easy to ask • Produces biased results. The data obtained by a convenience sample will be biased; however, this method is often used for surveys & results reported in newspapers and magazines! An example would be stopping friendly-looking people in the mall to survey. Another example is the surveys left on tables at restaurants: a convenient method!

  22. Undercoverage • some groups of the population are left out of the sampling process. Suppose you take a sample by randomly selecting names from the phone book; some groups will not have the opportunity of being selected! • People with unlisted phone numbers – usually high-income families • People without phone numbers – usually low-income families • People with ONLY cell phones – usually young adults

  23. Nonresponse • occurs when an individual chosen for the sample can’t be contacted or refuses to cooperate • telephone surveys: 70% nonresponse. Because of huge telemarketing efforts in the past few years, telephone surveys have a MAJOR problem with nonresponse! People are chosen by the researchers, BUT refuse to participate. They are NOT self-selected! This is often confused with voluntary response!

  24. Response bias • occurs when the behavior of the respondent or interviewer causes bias in the sample • wrong answers. Suppose we wanted to survey high school students on drug abuse and we used a uniformed police officer to interview each student in our sample. Would we get honest answers? Response bias occurs when for some reason (interviewer’s or respondent’s fault) you get incorrect answers.

  25. Chapter 2 Experimental Design

  26. Definitions: 1) Observational study - observe outcomes without imposing any treatment 2) Experiment - actively impose some treatment in order to observe the response

  27. 3) Experimental unit – the single individual (person, animal, plant, etc.) to which the different treatments are assigned 4) Factor – the explanatory variable 5) Level – a specific value for the factor

  28. 6) Response variable – what you measure 7) Treatment – a specific experimental condition applied to the units

  29. I plan to test my new rabbit food. What are my experimental units? Rabbits. What is my factor? Type of food. What is the response variable? How well they grow.

  30. Hippity Hop • I’ll use my pet rabbit, Lucky! Since Lucky’s coat is shinier & he has more energy, Hippity Hop is a better rabbit food!

  31. 8) Control group – a group that is used to compare the factor against; can be a placebo or the “old” or current item 9) Placebo – a “dummy” treatment that can have no physical effect

  32. Old Food vs. Hippity Hop • Now I’ll use Lucky & my friend’s rabbit, Flash. Lucky gets Hippity Hop food & Flash gets the old rabbit food. WOW! Lucky is bigger & shinier so Hippity Hop is better!

  33. Old Food vs. Hippity Hop • The first five rabbits that I catch will get Hippity Hop food and the remaining five will get the old food. The Hippity Hop rabbits have scored higher so it’s the better food!

  34. Old Food vs. Hippity Hop • Number the rabbits from 1 to 10. Place the numbers in a hat. The first five numbers pulled from the hat will be the rabbits that get Hippity Hop food. The remaining rabbits get the old food. Numbers pulled: 5, 8, 7, 3, 9. I evaluated the rabbits & found that the rabbits eating Hippity Hop did better than the rabbits eating the old food!
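
Drawing numbers from a hat is random assignment, and it can be sketched as below. The variable names are hypothetical and the seed is fixed only so the illustration is repeatable; a different seed gives a different, equally valid assignment.

```python
import random

rng = random.Random(5)  # seeded only so the sketch is repeatable

# Rabbits are numbered 1-10, as on the slide.
rabbits = list(range(1, 11))

# The first five numbers "pulled from the hat" get Hippity Hop;
# the remaining five get the old food.
hippity_hop_group = sorted(rng.sample(rabbits, 5))
old_food_group = sorted(set(rabbits) - set(hippity_hop_group))

print("Hippity Hop:", hippity_hop_group)
print("Old food:   ", old_food_group)
```

Chance, not the experimenter or the rabbits' catchability, decides which rabbit gets which food.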

  35. 10) blinding - method used so that units do not know which treatment they are getting 11) double blind - neither the units nor the evaluator knows which treatment a subject received

  36. Hippity Hop Rabbit Food • “Hippity Hop Rabbit Food makes fur soft and shiny, & increases energy for ALL types of rabbits!” Can I make this claim?

  37. Principles of Experimental Design • Control of effects of extraneous variables on the response – by comparing treatment groups to a control group (placebo or “old”) • Replication of the experiment on many subjects to quantify the natural variation in the experiment • Randomization – the use of chance to assign subjects to treatments

  38. The ONLY way to show cause & effect is with a well-designed, well-controlled experiment! The ONLY way to show cause & effect is with a well-designed, well-controlled experiment!! The ONLY way to show cause & effect is with a well-designed, well-controlled experiment!!!

  39. Experiment Designs • Completely randomized – all experimental units are allocated at random among all treatments (random assignment)

  40. Completely randomized design • Randomly assign experimental units to treatments (Treatment A, Treatment B, Treatment C, Treatment D)
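
A minimal sketch of a completely randomized design with four treatments. The 20 unit names are invented, and equal group sizes are an assumption; the slide only says that units are assigned to treatments at random.

```python
import random

rng = random.Random(6)  # seeded only so the sketch is repeatable

# Hypothetical experimental units and the four treatments.
units = [f"unit_{i}" for i in range(1, 21)]
treatments = ["A", "B", "C", "D"]

# Shuffle all units together, then deal them out in equal-sized groups.
shuffled = units[:]
rng.shuffle(shuffled)
group_size = len(shuffled) // len(treatments)
assignment = {
    treatment: shuffled[i * group_size:(i + 1) * group_size]
    for i, treatment in enumerate(treatments)
}
for treatment, group in assignment.items():
    print(treatment, group)
```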

  41. Randomized block – units are blocked into groups, and then the units within each group are randomly assigned to treatments

  42. Randomized block design • Put into homogeneous groups • Randomly assign experimental units to treatments (Treatment A or Treatment B) within each group
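
A sketch of a randomized block design under assumed blocks. The block names ("young", "adult") and the eight units per block are hypothetical; the point is that the randomization is carried out separately inside each homogeneous block.

```python
import random

rng = random.Random(7)  # seeded only so the sketch is repeatable

# Hypothetical homogeneous blocks of experimental units.
blocks = {
    "young": [f"young_{i}" for i in range(1, 9)],
    "adult": [f"adult_{i}" for i in range(1, 9)],
}

# Within EACH block, shuffle the units and split them between treatments A and B.
block_assignment = {}
for block_name, members in blocks.items():
    shuffled = members[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    block_assignment[block_name] = {"A": shuffled[:half], "B": shuffled[half:]}

for block_name, groups in block_assignment.items():
    print(block_name, groups)
```

Each treatment ends up with the same number of units from every block, so the blocking variable cannot be confused with the treatment.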

  43. Matched pairs - a special type of block design • match up experimental units according to similar characteristics & randomly assign one to one treatment; the other automatically gets the 2nd treatment • or have each unit do both treatments in random order • the assignment of treatments is dependent

  44. Matched pairs design • Pair experimental units according to specific characteristics. • Next, randomly assign one unit from a pair to Treatment A. The other unit gets Treatment B. This is one way to do a matched pairs design; another way is to have the individual unit do both treatments (as in a taste test).
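
The first version of matched pairs (pair the units, then flip a coin within each pair) might look like this. The pairs listed are invented, and using `rng.random() < 0.5` as the coin flip is just one reasonable choice.

```python
import random

rng = random.Random(8)  # seeded only so the sketch is repeatable

# Hypothetical units, already paired by similar characteristics.
pairs = [
    ("unit_1", "unit_2"),
    ("unit_3", "unit_4"),
    ("unit_5", "unit_6"),
    ("unit_7", "unit_8"),
]

# For each pair, a coin flip decides who gets Treatment A;
# the partner automatically gets Treatment B.
matched_assignment = []
for first, second in pairs:
    if rng.random() < 0.5:
        matched_assignment.append({"A": first, "B": second})
    else:
        matched_assignment.append({"A": second, "B": first})

for row in matched_assignment:
    print(row)
```

This is why the assignment is called dependent: once one member of a pair is assigned, the other member's treatment is fixed.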

  45. 12) Confounding variable – the effect of the confounding variable on the response cannot be separated from the effects of the explanatory variable (factor)

  46. One group is assigned to treatment A & the other group to treatment B: treatment & group are confounded. Confounding does NOT occur in a completely randomized design!

  47. Randomization reduces bias by spreading any uncontrolled confounding variables evenly throughout the treatment groups. Is there another way to reduce variability? Blocking also helps reduce variability. Variability is controlled by sample size. Larger samples produce statistics with less variability.
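
The claim that larger samples produce statistics with less variability can be checked with a small simulation. Everything here is an assumption made for illustration: the population is 10,000 made-up values with mean 50 and SD 10, and the helper `spread_of_sample_means` is hypothetical.

```python
import random
import statistics

rng = random.Random(9)  # seeded only so the sketch is repeatable

# Hypothetical population of 10,000 measurements.
population = [rng.gauss(50, 10) for _ in range(10_000)]

def spread_of_sample_means(sample_size, repetitions=500):
    """Take many SRSs of the given size and return the SD of their means."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repetitions)
    ]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(10)
spread_large = spread_of_sample_means(100)
print(f"n = 10:  SD of sample means = {spread_small:.2f}")
print(f"n = 100: SD of sample means = {spread_large:.2f}")
```

The sample means from samples of 100 cluster much more tightly around the population mean than those from samples of 10.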

  48. Day 2: Exploring Data • Review • Displaying data • Bivariate data • Simulations & Random Number Tables

  49. Types of variables

  50. Identify the following: • gender: categorical • age: numerical • hair color: categorical • smoker: categorical • systolic blood pressure: numerical • number of girls in class: numerical
