Administrivia

Administrivia • Lab 2 distributed this Thurs

Introduction to Experiments

Research Design Correlational Experimental • Variation comes from nature • Researcher draws conclusions from hands-off observation • Researcher manipulates variables directly • In a pure experiment, conditions are randomly assigned

Experiments vs Non-Experiments • Experiment is a study in which variation is introduced by the researcher • The type of experiment (such as a true, randomized experiment) depends on several factors– the elements of design dictate the type of experiment you have • Correlational studies include responses from natural groups (e.g., most surveys are non-experiments) • Just because you manipulate something, it does not make it an experiment. http://imgs.xkcd.com/comics/the_data_so_far.png

Why Experiments? • Because we want to test specific, directional hypotheses. • Because we want to isolate cause from effect. • Because we want tight control of conditions to rule out spurious relationships. • Experiments can be useful for exploration, but only when we want to isolate and explore specific mechanisms. • Because we need carefully-constructed designs that can be replicated to verify findings across environments or situations.

Why Experiments? Consider Pro’s and Con’s of Smoking Research • We have decades of observational research on the link between smoking and health outcomes such as cancer. But, despite overwhelming correlational evidence, it is actually not possible to establish a causal relationship through observation alone.

Smoking Research Example • What would a “true experiment” of smoking and health look like? • Randomly assign people from the population to be smokers or non-smokers (principle of random assignment) • Long-term study over lifetime (since we would need to have enough time for smoking to have negative effects). • We might have to withhold treatments to help with negative health effects in order to get good data on negative health outcomes. • In sum, this would be a terribly unethical thing to do– which is why most researchers accept the overwhelming observational research even though no true experiment on the effects of smoking has ever been conducted by scientists. • But… that doesn’t mean that researchers have not pushed the boundaries of ethics over the years in an attempt to isolate certain causes and effects…

First, a few famous Experiments…for good and bad reasons Tuskegee Syphilis Study: 1932-1972 clinical study of natural progression of untreated syphilis in rural African American men who thought they were getting free health care from US Government. Enrolled about 400 men who had syphilis but didn’t know it, and 200 without the disease. No one was told that they had syphilis, and those with the disease were not treated in order to observe the progression. Important because: researchers withheld known treatment to observe disease running its course. Led to the establishment of the Office for Human Research Protections (OHRP) and IRB’s.

Milgram Obedience Study • Famous study about role of obedience to authority figures. Conducted in early 1960’s, at a time when Nazi war criminals were on trial. • The larger question that Milgram was interested in was whether people like the Nazi’s were just “following orders”? Could any average person be pushed to harm others simply by the power of an authority asking him or her to behave a certain way?

Milgram Obedience Study Basic design: Experimenter asked the teacher to give questions to a student (also called a learner). The learner was a confederate (someone in on the study). The teacher would give electric shocks when the student did not give correct answers to questions. The experimenter would simply tell the teacher that “the study must continue”, no matter how high the electric shocks went. Conditions included putting the “student” and “teacher” in same room or not, as well as having the experimenter in the same room as the teacher or not. Important because: raised serious issues of ethics in lab studies, but also showed how tough concepts (such as obedience) could be studied in a controlled setting.

Stanford Prison Experiment Similar to Milgram’s study, the goal was to learn about the psychological effects of being a prison guard or a prisoner. University randomly assigned students to be “prisoners” or “guards”. Researchers watched to see how they behaved over time. Important because: it changed how studies are reviewed by IRB (Institutional Review Board). The psychological trauma to some of the participants raised an important issue about detrimental effects after a study is over.

Video Links to 3 Examples Above • Tuskegee Syphilis Study • http://www.c-spanvideo.org/program/Syp • Milgram Obedience Study • http://www.youtube.com/watch?v=fCVlI-_4GZQ • Stanford Prison Experiment • http://www.youtube.com/watch?v=sZwfNs1pqG0

In Sum… • Many, many types of experiments, some with small samples (such as lab experiments), others with thousands or even millions of participants (such as online field experiments). • Data scientists working with very large data often conduct experiments with users– and unlike academic research, these experiments are not necessarily disclosed to participants (e.g., Wikimedia foundation (Wikipedia), Google search data, Facebook, Insurance companies, rewards and purchasing programs at companies like Wal-Mart, Target, grocery stores.

In Sum (continued)… • Data scientists are especially interested in experiments because they let us test specific causal mechanisms in a way that may not be possible in other forms of data collection (such as scraping data from the web, or compiling data from multiple sources). • But, even if we could construct a good “true” experiment, we also have to balance issues of ethics. We have seen some classic examples of ethical problems, but that doesn’t mean the issue has gone away. Today, data scientists routinely come up against issues of privacy and data collection, security of sensitive information, and trust between those who collect and analyze information and those who provide it.

True Randomized Experimental Design • (1) Independent Variables must be manipulated (usually by experimenter, sometimes by context) • (2) Participants must be assigned randomly to various conditions or groups Courtesy http://psychology.ucdavis.edu/SommerB/

Example: Asch Conformity Experiments: “True Experiment” Design • Study: Would individuals conform to peer pressure even if they knew their peers were wrong? Participants sat in room with confederates and had to say which line (A, B, C) matched the line in the other box. • Independent variable manipulated by experimenter: • Whether there was any peer pressure to give incorrect answer or not. • Random Assignment to Control group or Treatment Group: • In treatment group, confederates answered incorrectly and the subject answered last • In control condition, no pressure to give incorrect answer (no confederates were present)

Meanings of the term “Controls” in Research • At least three different meanings: • Control in an Experiment • Controlled Studies • Controls used in a study or analysis (e.g., controlling for gender while examining relationship between education and income)

Active versus Attribute Independent Variables in Experiments • Active independent variable(s): • The I.V. is “given” to the participants, usually for some specified time period. It is often manipulated and controlled by the investigator. (e.g., presence or lack of peer pressure in the Asch conformity experiment) • Attribute independent variable(s): • A predictive, defining characteristic of individuals. Cannot be manipulated. (e.g., Gender of participants)

Two Different Types of Randomization: Randomization in Sample and Assignment • Random Sample • System for choosing participants from a population • Generally, larger sampling population leads to better generalizability. • Random Assignment • Method for assigning participants randomly to experiment conditions

Pitfalls in Assignment to Conditions • Salk Vaccine Trail • Huge vaccine study for polio. Wealthy families more likely to consent to study, but incidence of polio was correlated with quality of living environment. Again, allowing self-selection into study rather than true random assignment can affect the interpretation of results.

Pre-test, experimental manipulation and post-testing • Pre-test: allows us to check group equivalence before the intervention X is introduced. • Experimental manipulation/Treatment: An independent variable (X) that the experimenter manipulates. • Post-test: allows us to check group equivalence after intervention X has been introduced. Hypothesis  Random Assignment  Measure D.V.  Treatment  Measure D.V. (pre-test) (post-test)

Common types of true experiments O X O Pretest-Posttest Control R O O X O Post-only Control R O O X O (1) O O (2) R Solomon 4-group X O (3) O (4)

Three Major Types of Experiments • True experiments protect against both time and group threats to internal validity by randomly assigning subjects to treatment and control groups. The treatment (independent variable) is active. • If we cannot randomly assign subjects to treatment and control groups, then it is a quasi-experiment. The independent variable is active. • If we cannot randomly assign subjects to groups because the groups contain the attribute of interest, and if we give all groups the same treatment, then it is an associational non-experiment. The independent variable is not active. • We have already talked a lot about true experiments by example, so lets focus on the other two major types of experiments…

Example: Pen Study • Question: Do individuals in the US and Japan make different choices about ‘unique’ versus ‘less unique’ items when given a choice?

Pen Study • Independent Variables • Cultural difference: Japanese students compared to US students • Pen layout (4 pens of one color, 2 pens of another color) • Everyone received the same treatment • Assignment • Subjects were not randomly assigned because they already fell into one of the two societies. • Dependent Variable: • Would they choose the ‘common’ pen or the ‘unique’ one?

Example: Website Credibility Study • Question: Do people infer different amounts of credibility in websites with different URLs?

Web Credibility Study Example • Independent Variable • Experiment Condition (2 conditions): • Website has professional URL. • Website has dubious URL. (www.gooogle.com/webuygoldandsellprescriptions) • Assignment • We cannot control who goes to which website– we can only wait for enough individuals to go to each site and respond to the questionnaire about website credibility. • Dependent Variable • Credibility rating (a series of questionnaire items)

Internal Validity • Design Concerns • Internal Validity • External Validity • Ecological Validity Photo credit: http://www.indiana.edu/~p1013447/dictionary/int_val.htm

Internal Validity ? ? Internal Validity deals with questions about whether changes in the dependent variable were caused by the treatment. ? ? Cause Effect

Threats to Internal Validity • History • additional I.V. that occurs between pre-test and post-test • Maturation • Subjects simply get older and change during experiment • Testing • Subjects “get used” to being tested • Regression to the Mean • Issue with studies of “high and low extremes” on some key variable

Contamination and Internal Validity • Demand Characteristics • Anything in the experiment that could guide subjects to expected outcome

General Demand Characteristics • Three Key Demand Characteristics: • Eval Apprehension: Subjects know that they are being evaluated and this changes their behavior; creates anxiety and can interfere with task performance. • Social Desirablilty: Subjects act how they think the experimenter wants them to act, or what they think would be proper behavior • Placebo effect: just being treated or being given *anything* can lead to measurable improvement over nothing at all (e.g., sugar pill cures headache, etc).

Solutions to Demand Characteristics • Solutions: • Double-blind experiments • Experiments in natural setting (i.e., subjects do not know they are in an experiment) • Hidden measurements • Cover stories • Provide a reasonably believable story that explains what the researcher is doing, while concealing the actual intention of the study.

Expectancy Effects • Experimenter Expectancy • Researcher behavior that guides subjects to expected outcome (self-fulfilling prophecy)

Reducing expectancy effects • Blind • Researcher is unaware of the experiment condition that he/she is administering. • Naïve experimenter • Those conducting study are not aware of theory or hypotheses in the experiment • Standardization • Experimenter follows a script, and only the script • “Canned” Experimenter • Audio/Video/Print material gives instructions

And More Potential Problems with Experiments! • Selection Bias • Issue with non-random selection of subjects (e.g., using sample by convenience and mistaking this as random…eg., picking a few people “at random” from the same street corner) • Mortality/Attrition • Departure of subjects in the experiment • Diffusion, Sharing of Treatments • Control group unexpectedly obtains treatment • Other ‘social’ threats? • “Compensatory rivalry” and “resentful demoralization”: the control group may realize their status and be upset that they do not get the real treatment.

In Sum… • The validity of an experiment is subject to a large variety of threats. • You should be aware that experiments absolutely depend on stamping out these sources of bias. • This is the fun and the challenge of experimental designs

Three key threats to external validity (generalizability) in experiments External Validity: How far does the given study/experiment generalize to similar groups, individuals, etc? • Setting: Physical and social context of the experiment • Population: Is there something specific about the sample that interacts with the treatment? • History: Is there something about the time that interacts with the treatment?

Ecological Validity Ecological Validity: Approximation of ‘real-life’ situations This is such a tough one– for any lab experiment it is really very hard to get high ecological validity, even with computer-based studies.

The Validity Tradeoff: Truth and Myth Internal Validity External Validity Balance is important between the types of validity, but internal validity is usually (if not always) the more important factor.

Experiments in the Field • Some experiments can be conducted in a real-world setting while maintaining random assignment and manipulation of treatments

Field Experiment Example • Milliman (1986) Study of music tempo and restaurant customer behavior

Natural Experiments • Variation is provided by natural or unplanned event • Treatment is exogenous (no self-selection) • Natural disasters • Vietnam war draft • School admissions cutoffs • Olympics reduces air pollution • Distance from school to ISP central office • Provides an identification strategy: a way to identify a causal relationship in the data

Pro’s and Con’s of Experiments • Pro’s • Gives researcher tight control over independent factors • Allows researcher to test key relationships with as few confounding factors as possible • Allows for direct causal testing • Con’s • Often very small N in lab studies; enough for statistical purposes but not ideal for generalizability. • Sometimes give up large amounts of external validity in favor of internal validity (and vice versa) • Require a large amount of planning, training, and time– sometimes to test relationship between only 2 factors!

Administrivia

Administrivia

Presentation Transcript

Administrivia

Administrivia

Administrivia

Administrivia

Administrivia

administrivia

Administrivia

Administrivia

Administrivia

Administrivia

Administrivia

Administrivia

Administrivia

Administrivia

Administrivia

Administrivia

Administrivia!

Administrivia

Administrivia

Administrivia