Introduction to Causal Inference in the Social Sciences

Yu Xie, The University of Michigan

Causal Questions • Example A: Is UM’s affirmative action policy educationally beneficial to its students? • Example B: Did the war in Iraq help or harm world peace in the long run? • A causal question is a simple question involving the relationship between two theoretical concepts: a cause and an effect. • Cause => Effect? • Or, X => Y?

Centrality of Causality in Social Science • The primary aim of all sciences (from Aristotle to modern genetics). • Understanding of causal relationships leads to accurate predictions of the future. • It provides the scientific basis for policy intervention. • It advances our theoretical knowledge of the world.

Evaluation Research • In high demand by policy makers. • Definition: Evaluation research, or program evaluation, refers to the kind of applied social research that attempts to evaluate the effectiveness of social programs. • Key to all evaluation research is causal inference: i.e., evaluating effectiveness of programs.

Simple Comparisons • One simple way is to compare units of analysis affected by the program to those unaffected by the program. • Say in a community, N1 children attended Head Start, and N2 did not. 27 years later, measure the educational attainment of the two groups, y1 (outcome among those who attended Head Start) and y2 (outcome among those who did not attend Head Start).

Simple Comparisons (Continued) • We compute the difference in means: 13 - 14 = -1. • Should we conclude from this that Head Start has a negative effect on educational attainment? • The Westinghouse report of the late 1960s. • The appropriate research question is not to compare observed y1 and observed y2.

What Might be Wrong with Simple Comparison? • The observed, bivariate relationship between Head Start participation and educational outcomes may be negative. [Path diagram: SES → Education (+), SES → Head Start (-), Head Start → Education (+).]
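The confounding pattern in this slide can be reproduced with a small numeric sketch. The cell counts and outcome values below are hypothetical, chosen only so that the raw group means match the 13 versus 14 comparison; they are not Head Start data.

```python
# Hypothetical cells: (ses, attended_head_start, years_of_education, count).
# Low-SES children are more likely to attend, and SES raises education.
cells = [
    ("low", True, 12, 75),
    ("low", False, 11, 25),
    ("high", True, 16, 25),
    ("high", False, 15, 75),
]

def group_mean(rows):
    """Count-weighted mean of years of education."""
    total = sum(y * n for _, _, y, n in rows)
    count = sum(n for _, _, _, n in rows)
    return total / count

def naive_difference(rows):
    """Mean outcome of attendees minus mean outcome of non-attendees."""
    treated = [r for r in rows if r[1]]
    control = [r for r in rows if not r[1]]
    return group_mean(treated) - group_mean(control)

def within_ses_differences(rows):
    """The same comparison, made separately inside each SES group."""
    return {
        ses: naive_difference([r for r in rows if r[0] == ses])
        for ses in sorted({r[0] for r in rows})
    }

naive = naive_difference(cells)
by_ses = within_ses_differences(cells)
print(naive)   # -1.0
print(by_ses)  # {'high': 1.0, 'low': 1.0}
```

In this constructed example the pooled comparison is negative even though attendance helps by one year within each SES group, which is the sign reversal the diagram describes.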

UC-Berkeley Graduate Admission Data • Two-way table of sex and admission outcome is as follows:

If we break down the data by the largest majors

What is Going On? [Path diagram relating Sex, Major, and Admission.]

What Should We Conclude from the UC-Berkeley Data? • There is a strong segregation of major by sex. • Admission rates vary greatly by major: low for feminine majors and high for masculine majors. • If anything, women seem to have an advantage in major A.
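The aggregation problem described here can be sketched with made-up numbers. The counts below are hypothetical and are NOT the UC-Berkeley figures; they only mimic the two features named above (sex segregation by major and very different admission rates by major).

```python
# Hypothetical applications: (major, sex, applicants, admitted).
applications = [
    ("A", "men", 800, 480),    # major A: mostly men, high admission rate
    ("A", "women", 100, 70),
    ("B", "men", 200, 20),     # major B: mostly women, low admission rate
    ("B", "women", 900, 108),
]

def admission_rate(rows):
    """Admitted divided by applicants, pooled over the given rows."""
    applicants = sum(r[2] for r in rows)
    admitted = sum(r[3] for r in rows)
    return admitted / applicants

def rate_by_sex(rows):
    """Admission rate for each sex within the given rows."""
    return {
        sex: admission_rate([r for r in rows if r[1] == sex])
        for sex in ("men", "women")
    }

overall = rate_by_sex(applications)
by_major = {
    major: rate_by_sex([r for r in applications if r[0] == major])
    for major in ("A", "B")
}
print(overall)
print(by_major)
```

With these illustrative counts women are admitted at a higher rate than men within each major, yet at a lower rate overall, because most women apply to the major with the low admission rate.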

More Examples • Does cohabitation decrease or increase the likelihood of divorce? • Is it better to have more siblings or fewer siblings for educational attainment? • What is the earnings return to college education?

Causal Effect as a Counter-Factual Question • For causal inference, one should ask the counter-factual question: for those who received “treatment”, what would have happened to them if they hadn't been treated? • Or, y1t - y1c (t denoting treatment; c denoting control) • Note that y1t is observed, but y1c is not.

Causal Effect as a Counter-Factual Question (Continued) • For those who did not receive treatment, what would have happened to them if they had been treated? • Or, y2t - y2c (t denoting treatment; c denoting control) • Note that y2c is observed, but y2t is not. • The problem is one of missing data.

Assumption for Simple Comparison • If subjects who are treated are, on average, “comparable” to subjects who are untreated (which can be achieved by randomization), we can assume away the problem by averaging: • E(y1c) = E(y2c), E(y1t) = E(y2t) • In that case, • E(y1t - y1c) = E(y2t - y2c) = E(y1t - y2c) • I.e., simple comparison is valid.
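A minimal simulation sketch of this assumption follows. Everything in it is an illustrative assumption: the variable `ability`, the constant treatment effect of 2, and the self-selection rule are invented for the example. It contrasts random assignment, where the treated and untreated are comparable on average, with self-selection, where they are not.

```python
import random

def simulate(n=20000, effect=2.0, seed=0):
    """Generate observed outcomes under random assignment and self-selection."""
    rng = random.Random(seed)
    randomized, self_selected = [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)   # pre-treatment characteristic
        y_c = 10.0 + ability            # potential outcome without treatment
        y_t = y_c + effect              # potential outcome with treatment
        coin = rng.random() < 0.5       # assignment unrelated to ability
        randomized.append((coin, y_t if coin else y_c))
        chose = ability > 0             # high-ability subjects opt in
        self_selected.append((chose, y_t if chose else y_c))
    return randomized, self_selected

def difference_in_means(data):
    """Simple comparison: mean of the treated minus mean of the untreated."""
    treated = [y for d, y in data if d]
    control = [y for d, y in data if not d]
    return sum(treated) / len(treated) - sum(control) / len(control)

randomized, self_selected = simulate()
diff_randomized = difference_in_means(randomized)
diff_self_selected = difference_in_means(self_selected)
print(diff_randomized, diff_self_selected)
```

Under random assignment the simple comparison should land near the assumed effect of 2, while under self-selection it is inflated because E(y1c) no longer equals E(y2c).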

Observable Selectivity Bias • If subjects who receive treatment and those who do not are different only in observed characteristics, this type of selectivity is called observable selectivity. • This problem can be handled by statistical controls in multivariate analysis to make the two groups comparable (or, differences between the two groups are “ignorable” conditional on covariates). • Often called “omitted variable bias.” • This is the basis for multivariate analysis.

Conditions for Omitted Variable Bias • (1) Correlation Condition: The omitted variable is correlated with the independent variable of primary interest; • (2) Relevance Condition: The omitted variable affects the dependent variable. • If either of the two conditions is not met, an omitted variable does not introduce a bias. • E.g., wedding expenses have been found to have a positive effect on marital stability. Could this be due to omitted variable bias?
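The two conditions can be checked in a small simulation sketch. The data-generating process below is an assumption made for illustration: the true effect of x on y is 1, `rho` controls how strongly the omitted variable z enters x (correlation condition), and `gamma` is the effect of z on y (relevance condition).

```python
import random

def slope(x, y):
    """OLS slope of y on x with an intercept: cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def short_regression_slope(rho, gamma, n=20000, seed=1):
    """Slope of y on x when z is left out of the regression."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [rho * zi + rng.gauss(0.0, 1.0) for zi in z]
    y = [1.0 * xi + gamma * zi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]
    return slope(x, y)

both_hold = short_regression_slope(rho=0.5, gamma=2.0)       # biased
no_correlation = short_regression_slope(rho=0.0, gamma=2.0)  # close to 1
no_relevance = short_regression_slope(rho=0.5, gamma=0.0)    # close to 1
print(both_hold, no_correlation, no_relevance)
```

When both conditions hold, the population slope in this setup is 1 + gamma * cov(x, z) / var(x) = 1 + 2 * 0.5 / 1.25 = 1.8; when either condition fails, the slope stays near the true value of 1.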

Unobservable Selectivity • The more difficult problem is to deal with selectivity in unmeasured characteristics. • Two situations: • (1) “Heterogeneity” Bias: the two groups are systematically different due to predetermined unobservables. E.g., ability in human capital models. • (2) “Endogeneity” Bias: the effects of program participation are different between the two groups. E.g., self-selection. • Difficult to handle. Statistical models require strong and implausible assumptions.

Review • Population P is divided into two subpopulations: P1 if Di = 1, P0 if Di = 0. • Use the following notation: • q = proportion of P0 in P • E(Y1T) = E(YT|D=1), E(Y1C) = E(YC|D=1) • E(Y0T) = E(YT|D=0), E(Y0C) = E(YC|D=0) • By the total expectation rule: • E(YT - YC) = E(Y1T - Y1C)(1-q) + E(Y0T - Y0C)q = E(Y1T - Y0C) - E(Y1C - Y0C) - (d1-d0)q, where d1 = E(Y1T - Y1C), d0 = E(Y0T - Y0C). • Or: • E(Y1T - Y0C) = E(YT - YC) + E(Y1C - Y0C) + (d1-d0)q.
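The decomposition on this slide can be verified numerically. The subpopulation means below are hypothetical values picked for the check; the variable names simply mirror the slide's notation.

```python
import math

# Hypothetical inputs (illustrative only).
q = 0.6                      # proportion of P0 (untreated) in P
E_Y1C, E_Y0C = 12.0, 10.0    # untreated-state means in P1 and P0
d1, d0 = 3.0, 1.0            # treatment effects in P1 and P0
E_Y1T = E_Y1C + d1           # treated-state mean in P1
E_Y0T = E_Y0C + d0           # treated-state mean in P0

# Average treatment effect by the total expectation rule.
ate = (E_Y1T - E_Y1C) * (1 - q) + (E_Y0T - E_Y0C) * q

# The simple comparison of observed group means.
naive = E_Y1T - E_Y0C

# The two bias terms in the last line of the slide.
baseline_gap = E_Y1C - E_Y0C   # pre-treatment difference between groups
effect_gap = (d1 - d0) * q     # difference in treatment effects, weighted by q

print(ate, naive, baseline_gap, effect_gap)
assert math.isclose(naive, ate + baseline_gap + effect_gap)
```

With these values the simple comparison (5) overstates the average treatment effect (1.8) by the baseline gap (2) plus the effect gap (1.2).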

Experimental Approach • Experimental design eliminates both types of problems. • Example: High/Scope Perry Preschool study conducted in Ypsilanti. • Manski and Garfinkel (1992): experimental designs suffer from shortcomings that are often overlooked. • Manski and Garfinkel refer to experimental approach as “reduced-form.”

Shortcomings of Experimental Approach • We cannot always extrapolate results from an experimental setting to a natural setting. • Thus, Manski and Garfinkel openly criticize experimental designs: "In fact, reduced-form experimental evaluation actually requires that a highly specific and suspect structural assumption hold: Individuals and organizations must respond in the same way to the experimental version of a program as they would to the actual version." (p. 17) • I.e., lacking “external validity.”

Structural Approach • Manski and Garfinkel propose the "structural" approach as an alternative. • Definition: structural approach refers to statistical methods that model causal processes based on observational data. • Head Start example: control on SES, parental involvement, etc. • Requires strong social science theories.

Structural vs. Reduced-Form Equations • 1. Structural Equations: Structural equations are theoretically derived equations that often have endogenous variables as independent variables. • 2. Reduced-Form Equations: Reduced-form equations are equations in which all independent variables are exogenous variables. I.e., in reduced-form equations, we purposely ignore intermediate variables.

Comparison of the Two Approaches Advantages of Structural Approach: • Since it is conducted in a natural setting, its findings are directly relevant to the whole population. In contrast, results from an experimental design need to be extrapolated. • It is less costly. In contrast, experimental research is very expensive. • It builds upon and contributes to theory. In contrast, the reduced-form approach only yields simple answers to simple questions.

Advantages of Reduced-form Approach • Biases due to unobservables can be eliminated through randomization. • It requires fewer assumptions. • It does not require complicated statistical models that the public and government officials have difficulty understanding.

Research Design Approaches • Quasi-Experiment • Utilizing spatial variation • Utilizing temporal variation • Clustering Design • Fixed effects model • Instrumental-Variable Estimation • Special type of structural approach

Example: Quasi-Experiment Design Utilizing Spatial Variation • Certain policies are introduced in State A but not in State B. • States A and B are otherwise comparable. • Observe how outcome Y differs between State A and State B. • Pace of economic reforms in China differs greatly by region • Associate regional variation in returns to education to regional variation in depth of economic reforms.

Example: Quasi-Experiment Design Utilizing Temporal Variation • Declining significance of race? • Examine temporal changes in SES differences by race • Hope to see a narrowing of racial gaps, particularly after the civil rights movement. • Effect of a new instructional method:

Propensity Score • P(T) = probability of treatment; it serves as a balancing score. • It can be a function of other observed variables, a vector z, and summarizes differences on all covariates. • We can estimate P(T) through a logit model: • logit(P) = b'z. • Under the assumption of no other relevant factors, group T and group C are comparable within levels of the estimated propensity score. • Different uses of the propensity score: stratification, matching, regression covariate.
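The stratification use of the propensity score might be sketched as follows. This is an illustrative simulation, not the author's procedure: the single covariate z, the true effect of 2, the hand-written Newton-Raphson logit fit, and the choice of five strata are all assumptions made for the example.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def mean(values):
    return sum(values) / len(values)

def fit_logit(z, t, iterations=25):
    """Fit logit(P) = a + b*z by Newton-Raphson (one covariate)."""
    a = b = 0.0
    for _ in range(iterations):
        p = [sigmoid(a + b * zi) for zi in z]
        g0 = sum(ti - pi for ti, pi in zip(t, p))
        g1 = sum((ti - pi) * zi for ti, pi, zi in zip(t, p, z))
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * zi for wi, zi in zip(w, z))
        h11 = sum(wi * zi * zi for wi, zi in zip(w, z))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def stratified_effect(score, t, y, strata=5):
    """Size-weighted average of within-stratum treated-minus-control means."""
    order = sorted(range(len(score)), key=lambda i: score[i])
    size = len(order) // strata
    total, used = 0.0, 0
    for s in range(strata):
        idx = order[s * size:] if s == strata - 1 else order[s * size:(s + 1) * size]
        treated = [y[i] for i in idx if t[i] == 1]
        control = [y[i] for i in idx if t[i] == 0]
        if treated and control:   # skip strata lacking either group
            total += (mean(treated) - mean(control)) * len(idx)
            used += len(idx)
    return total / used

# Simulated data: z drives both treatment and outcome; true effect is 2.
rng = random.Random(2)
n = 5000
z = [rng.gauss(0.0, 1.0) for _ in range(n)]
t = [1 if rng.random() < sigmoid(zi) else 0 for zi in z]
y = [2.0 * ti + 3.0 * zi + rng.gauss(0.0, 1.0) for ti, zi in zip(t, z)]

a_hat, b_hat = fit_logit(z, t)
score = [sigmoid(a_hat + b_hat * zi) for zi in z]
naive = mean([yi for yi, ti in zip(y, t) if ti == 1]) - mean(
    [yi for yi, ti in zip(y, t) if ti == 0])
adjusted = stratified_effect(score, t, y)
print(naive, adjusted)
```

Stratifying on the estimated score should remove most, though typically not all, of the bias in the simple comparison, since units within a stratum still differ somewhat in z.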

Instrumental-Variable Approach • Condition: IV Z does not affect Y except through X, meaning: • Z is correlated with Y but does not affect Y directly (called “exclusion restriction”). • Z is also correlated with X but not perfectly. • It’s very hard to find a good Z. [Path diagram of Z, X, Y, and the unobserved U.]
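A minimal sketch of the idea, under an assumed data-generating process: an unobserved U affects both X and Y, while Z affects Y only through X. The coefficients (0.8, 1.5, 2.0) are invented for illustration, and the estimator shown is the simple ratio cov(Z, Y) / cov(Z, X) for one instrument and one regressor.

```python
import random

def cov(a, b):
    """Sample covariance (divided by n)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

rng = random.Random(3)
n = 20000
z = [rng.gauss(0.0, 1.0) for _ in range(n)]   # instrument
u = [rng.gauss(0.0, 1.0) for _ in range(n)]   # unobserved confounder
x = [0.8 * zi + ui + rng.gauss(0.0, 1.0) for zi, ui in zip(z, u)]
# True effect of x on y is 1.5; z has no direct path to y.
y = [1.5 * xi + 2.0 * ui + rng.gauss(0.0, 1.0) for xi, ui in zip(x, u)]

ols = cov(x, y) / cov(x, x)   # contaminated by u
iv = cov(z, y) / cov(z, x)    # uses only the variation in x induced by z
print(ols, iv)
```

In this setup the OLS slope is pulled well above 1.5 by the unobserved U, while the IV ratio should be close to 1.5 because Z is unrelated to U.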

Example: Fixed Effects Model • Sibling models • Family SES, environment are shared • Yi1 = b0 + b1Xi1 + ai + ei1 • Yi2 = b0 + b1Xi2 + ai + ei2 • Take the difference between the two equations: • Yi2 - Yi1 = b1(Xi2 - Xi1) + (ei2 - ei1) • Resulting in a more robust equation • Properties of the fixed effects approach: • All fixed characteristics are controlled • It wastes a lot of information • Unobserved heterogeneity is controlled at the group level (fixed effects)
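The sibling-difference equation can be illustrated with simulated data. The setup is an assumption for the example: b0 = 1, b1 = 0.5, and a shared family term ai that both raises X and raises Y directly, so that a pooled regression is biased.

```python
import random

def slope(x, y):
    """OLS slope of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def slope_no_intercept(x, y):
    """OLS slope through the origin, matching the differenced equation."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

rng = random.Random(4)
families = 10000
x1, x2, y1, y2 = [], [], [], []
for _ in range(families):
    a_i = rng.gauss(0.0, 1.0)            # unobserved family effect
    xa = a_i + rng.gauss(0.0, 1.0)       # sibling 1
    xb = a_i + rng.gauss(0.0, 1.0)       # sibling 2
    x1.append(xa)
    x2.append(xb)
    y1.append(1.0 + 0.5 * xa + 2.0 * a_i + rng.gauss(0.0, 1.0))
    y2.append(1.0 + 0.5 * xb + 2.0 * a_i + rng.gauss(0.0, 1.0))

# Pooled regression ignores the family effect.
pooled = slope(x1 + x2, y1 + y2)

# Differencing within families removes a_i (and b0).
dx = [b - a for a, b in zip(x1, x2)]
dy = [b - a for a, b in zip(y1, y2)]
within = slope_no_intercept(dx, dy)
print(pooled, within)
```

The within-family estimate should be near 0.5, while the pooled slope absorbs the family effect. The cost noted on the slide is visible too: only the variation between siblings is used, and everything shared within a family is discarded.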

Heckman’s Selection Model Latent Rule

Different Quantities of Interest • Necessary because of the “variability principle.” • Treatment effects differ by population elements and thus could vary across subgroups.

Different Quantities of Interest (Continued) • ATE = average treatment effect: • E(Yt-Yc) • ATT = average treatment effect of the treated (Heckman): • E(Yt-Yc|D=1) • LATE = local average treatment effect (in an experiment): • E(Yt-Yc|compliance=1)
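A toy table makes the three definitions concrete. The eight units below are hypothetical, and both potential outcomes are listed for every unit only because the data are invented; in real data one of the two is always missing.

```python
# Hypothetical units: (y_t, y_c, d, complier)
#   y_t, y_c : potential outcomes with and without treatment
#   d        : 1 if the unit actually received treatment
#   complier : True if the unit takes treatment only when assigned to it
units = [
    (15, 10, 1, True),
    (13, 10, 1, True),
    (14, 11, 1, False),
    (12, 11, 1, False),
    (11, 10, 0, True),
    (10, 9, 0, True),
    (12, 11, 0, False),
    (9, 8, 0, False),
]

def average_effect(rows):
    """Mean of y_t - y_c over the given units."""
    return sum(y_t - y_c for y_t, y_c, _, _ in rows) / len(rows)

ate = average_effect(units)                                # E(Yt - Yc)
att = average_effect([u for u in units if u[2] == 1])      # E(Yt - Yc | D=1)
late = average_effect([u for u in units if u[3]])          # E(Yt - Yc | compliance=1)
print(ate, att, late)   # 2.0 3.0 2.5
```

Because the unit-level effects differ, the three averages differ as well, which is why the quantity of interest has to be stated before estimation.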

Example of Xie and Wu (2005) • The research question: what is the causal effect of entry into market sector on earnings? • Two causal questions:

Figure 1. Flow Chart of Labor Market Transitions in China, 1978 – 1996. [Flow chart not reproduced. Groups shown at 1978, 1987, and 1996: state sector stayers, new entrants to the state sector, experienced workers, and market sector early birds (129) and later entrants (253).]

Figure 2. Market Treatment Effect on Earnings by Propensity Strata: Later Entrants vs. Stayers

Conclusions of Xie and Wu (2005) • No generic market effect on earnings. • First, only late transition into the market sector is associated with higher earnings. • Even among later entrants, the benefit of working in the market sector sharply decreases with the propensity of having made the transition. • These results illustrate endogeneity: individuals select their “treatment” based on the anticipated outcome, which is not homogeneous across workers.
