Arianna Legovini, Head, Development Impact Evaluation Initiative (DIME), World Bank. Impact Evaluation for Real-Time Decision-Making

Do we know… • What information and services will improve market conditions for farmers? –India soybeans • What payment system will secure the financial sustainability of irrigation schemes? –Ethiopia irrigation • What is the best way to select local projects? –Indonesia direct voting versus representatives’ decisions • Will local workforce participation improve construction and maintenance of local investments? –Afghanistan road construction

Trial and error • These are difficult questions… • We turn to our best judgment for guidance and pick a subsidy level, a voting scheme, a package of services… • Is there any other subsidy, scheme or package that will do better?

The decision process is complex • A few big decisions are taken during design but many more decisions are taken during roll out & implementation

Developing a decision tree for an irrigation scheme…

How to select between plausible alternatives? • Establish which decisions will be taken upfront and which will be tested during roll-out • Scientifically test critical nodes: measure the impact of one option relative to another or to no intervention • Pick better and discard worse during implementation • Cannot learn everything at once • Select carefully what you want to test by involving all relevant partners

Walk along the decision tree for your irrigation scheme to get more results

What is Impact Evaluation? • Impact evaluation measures the effect of an intervention on outcomes of interest relative to a counterfactual (what would have happened in the absence of the intervention) • It identifies the causal effect of an intervention on an outcome separately from the effect of other time-varying conditions

Impact evaluation: application of the scientific method to understand and measure human behavior • Hypothesis: if we subsidize fertilizer, then farmers will use more fertilizer and increase production • Testing: provide a small discount with a deadline after harvest, or a large subsidy before planting; compare fertilizer use and productivity • Observations: fertilizer use increases more with the small discount with a deadline; production increases and then declines with fertilizer quantities • Conclusion: timing the subsidy when farmers have financial resources is most effective

What is counterfactual analysis? Counterfactual analysis isolates the causal effect of an intervention on an outcome • Effect of subsidy on fertilizer use • Effect of information on market prices • Compare same individual with & without subsidy, information etc. at the same point in time to measure the effect • This is impossible • Impact evaluation uses large numbers (farmers, communities) to estimate the effect
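
The point about comparing the same individual with and without the intervention can be written compactly. The notation below is an assumption added for illustration (the slides use none): Y_i(1) is the outcome of farmer i with the intervention and Y_i(0) the outcome of the same farmer without it. The group comparison in the second line is only a sound estimate when the treated and comparison groups are comparable on average, for example under random assignment.

```latex
% Individual effect: needs both outcomes for the same farmer at the same time,
% but only one of them is ever realized, so it cannot be observed.
\[
\tau_i = Y_i(1) - Y_i(0)
\]
% With large numbers of comparable farmers, the difference in group averages
% estimates the average effect instead.
\[
\hat{\tau} = \bar{Y}_{\text{treated}} - \bar{Y}_{\text{comparison}}
\quad \text{as an estimate of} \quad
\mathrm{E}\left[ Y_i(1) - Y_i(0) \right]
\]
```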

What is a good counterfactual? • Treated and counterfactual groups have identical observed and unobserved characteristics • The only reason for the difference in outcomes is the intervention

How to define a counterfactual? • Design the impact evaluation before the intervention is rolled out • Define eligibility • Assign the intervention to some eligible populations and not to others, either on a random basis or on the basis of clear and measurable criteria • Obtain a treatment group and a control group • Measure and compare outcomes in those groups over time
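
The random-assignment step can be sketched in a few lines of Python. This is a minimal illustration, not a procedure taken from the slides: the community names, the 50% treated share, the seed value and the function name `assign_treatment` are all hypothetical.

```python
import random


def assign_treatment(eligible_ids, share_treated=0.5, seed=2024):
    """Randomly split eligible units into a treatment and a control group."""
    rng = random.Random(seed)      # fixed seed so the draw can be reproduced and audited
    shuffled = list(eligible_ids)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_treated = round(len(shuffled) * share_treated)
    treatment = sorted(shuffled[:n_treated])
    control = sorted(shuffled[n_treated:])
    return treatment, control


# Hypothetical list of 10 eligible communities (illustrative names only)
communities = [f"community_{i:02d}" for i in range(1, 11)]
treatment, control = assign_treatment(communities)
print("Treatment:", treatment)
print("Control:  ", control)
```

Fixing the seed is a design choice: it lets partners rerun the draw and confirm that nobody hand-picked the treated communities.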

Nudging Farmers to Use Fertilizer: Evidence from Kenya (Duflo, Kremer, Robinson, 2009) • Farmers randomly selected into groups: –Free delivery of planting or top dressing fertilizer, offered just after harvest, with no subsidy: 14.3 percentage point increase in fertilizer use relative to controls –Free delivery and a 50% subsidy, offered later during top dressing (1-2 months after planting): 13.2 percentage point increase in fertilizer use relative to controls –Control group with none of the above

Nudging Farmers to Use Fertilizer: Policy conclusions • Small, well-timed discounts can induce some farmers to purchase productive inputs • Time dimensions and farmer “impatience” may be important for technology adoption • Large, costly subsidies might not be the appropriate policy response

How is this done? • Select one group to receive treatment (subsidy, information…) • Find a comparison group to serve as counterfactual • Use these counterfactual criteria: –Treated and comparison groups have identical initial average characteristics (observed and unobserved) –The only difference is the treatment –Therefore the only reason for the difference in outcomes is the treatment

Methods (tomorrow) • Experimental or random assignment –Equal chance of being in the treatment or comparison group –By design, treatment and comparison groups have the same characteristics (observed and unobserved), on average –Simple analysis (means comparison) and unbiased impact estimates • Non-experimental (regression discontinuity, IV and encouragement designs, difference-in-differences) –Require more assumptions or might only estimate local treatment effects –May suffer from unobserved variable bias –Use more than one method to check robustness of results
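
A small simulation can illustrate why random assignment supports a simple means comparison while a self-selected comparison may be biased by unobserved characteristics. Everything below is a hypothetical sketch: the assumed effect of 10, the unobserved "skill" variable, the outcome equation and the sample size are invented for illustration and do not come from any study cited in these slides.

```python
import random
from statistics import mean

TRUE_EFFECT = 10.0  # assumed (hypothetical) gain in the outcome from the intervention


def difference_in_means(farmers):
    """Means comparison: average treated outcome minus average comparison outcome."""
    treated = [f["outcome"] for f in farmers if f["treated"]]
    comparison = [f["outcome"] for f in farmers if not f["treated"]]
    return mean(treated) - mean(comparison)


def simulate(assignment, n=20000, seed=1):
    """Simulate n farmers and return the difference-in-means estimate."""
    rng = random.Random(seed)
    farmers = []
    for _ in range(n):
        skill = rng.gauss(0, 1)  # unobserved characteristic that also raises outcomes
        if assignment == "random":
            treated = rng.random() < 0.5  # equal chance of treatment or comparison
        else:
            treated = skill > 0           # self-selection: more skilled farmers opt in
        outcome = 100 + 5 * skill + rng.gauss(0, 5)
        if treated:
            outcome += TRUE_EFFECT
        farmers.append({"treated": treated, "outcome": outcome})
    return difference_in_means(farmers)


random_estimate = simulate("random")
self_selected_estimate = simulate("self-selected")
print(f"Assumed true effect:       {TRUE_EFFECT:.1f}")
print(f"Random assignment:         {random_estimate:.1f}")
print(f"Self-selected comparison:  {self_selected_estimate:.1f}")
```

Under these assumptions the random-assignment estimate should land close to the assumed effect, while the self-selected estimate should be noticeably larger because it also picks up the skill gap between the two groups.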

How is monitoring different from impact evaluation? • Monitoring is trend analysis –Change over time –Compare results before and after in the “treated” group • Impact evaluation –Change over time and relative to comparison –Compare results before and after in the “treated” group and relative to the “untreated” group [Figure: outcome Y plotted before (t0) and after (t1) the intervention, with points A, B and B’ and the labels “Change” and “Impact”]
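
The contrast between the two calculations can be shown with a worked example. The numbers are hypothetical and chosen only to make the arithmetic visible. Reading the comparison group's change as what would have happened to the treated group without the intervention is itself an assumption (often called parallel trends).

```python
# Hypothetical average outcomes (for example, yield per hectare); not from any study
treated_before, treated_after = 100.0, 130.0
comparison_before, comparison_after = 100.0, 115.0


def before_after_change(before, after):
    """Monitoring view: change over time in the treated group only."""
    return after - before


def difference_in_differences(t_before, t_after, c_before, c_after):
    """Impact evaluation view: change in the treated group minus change in the comparison group."""
    return (t_after - t_before) - (c_after - c_before)


monitoring_change = before_after_change(treated_before, treated_after)
impact_estimate = difference_in_differences(
    treated_before, treated_after, comparison_before, comparison_after
)
print("Change in treated group (monitoring):", monitoring_change)      # 30.0
print("Change relative to comparison (impact estimate):", impact_estimate)  # 15.0
```

In this example half of the before-after change would have occurred anyway, so reporting the full 30 as the effect of the intervention would overstate it.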

Monitoring & Impact Evaluation • Monitoring to track implementation efficiency (input-output) • Impact evaluation to measure effectiveness (output-outcome) [Diagram: $$$, INPUTS, OUTPUTS, BEHAVIOR, OUTCOMES; “monitor efficiency” spans inputs to outputs and “evaluate effectiveness” spans outputs to outcomes]

Question types and methods • Monitoring and process evaluation (descriptive analysis) –Is the program being implemented efficiently? –Is the program targeting the right population? –Are outcomes moving in the right direction? • Impact evaluation (causal analysis) –What was the effect of the program on outcomes? –How would outcomes change under alternative program designs? –Is the program cost-effective?

When would you use M&E and when IE? • Are grants to communities being delivered as planned? (M&E) • Does participation reduce elite capture? (IE) • What are the trends in agricultural productivity? (M&E) • Does agricultural extension increase technology adoption? (IE)

Separate performance from quality of intervention: babies & bath water. Uganda Community-Based Nutrition • Failed project –Project ran into financial difficulties –Negative reaction from Parliament –Intervention stopped …but… • Strong impact evaluation results –Children in the treatment group scored half a standard deviation better than children in the control group • Recently, the Presidency asked to take a second look at the evaluation: saving the baby?

Why Evaluate? • Improve quality of programs –Separate institutional performance from quality of intervention –Test alternatives and inform design in real time –Increase program effectiveness –Answer the “so what” questions • Build government institutions for evidence-based policy-making –Plan for implementation of options, not solutions –Find out what alternatives work best –Adopt a better way of doing business and taking decisions

Institutional framework • PM/Presidency: communicate to constituencies (campaign promises; accountability; effects of government program) • Treasury/Finance: allocate budget (budget; cost-effectiveness of different programs) • Line ministries: deliver programs and negotiate budget (service delivery; cost-effectiveness of alternatives and effect of sector programs)

Shifting Program Paradigm • From: –Program is a set of activities designed to deliver expected results –Program will either deliver or not • To: –Program is a menu of alternatives with a learning strategy to find out which work best –Change programs over time to deliver more results

Shifting Evaluation Paradigm • From retrospective, external, independent evaluation –Top down –Determine whether program worked or not • To prospective, internal, operationally driven and externally validated impact evaluation –Set program learning agenda bottom up –Consider plausible implementation alternatives –Test scientifically and adopt the best –Just-in-time advice to improve effectiveness of program over time

Retrospective (designed & evaluated ex-post) vs. Prospective (designed ex-ante and evaluated ex-post) • Retrospective impact evaluation: –When data are collected after the event, you don’t know how participants and nonparticipants compared before the program started –You have to try to disentangle, after the event, why the project was implemented where and when it was • Prospective evaluation: –Design the evaluation to answer the question you need to answer –Collect the data you will need

Is this a one shot analytical product? • This is a new model to change the way decisions are taken • It is about building a relationship between operations and research • Adds results-based decision tools to complement existing sector skills • The relationship delivers not one but a series of analytical products • Must provide useful (actionable) information at each step of the impact evaluation

Ethical considerations • It is not ethical to deny people an intervention that is available and that we know works –HIV medicine proven to prolong life • It is ethical to test interventions before scale-up if we don’t know whether they work or whether they have unforeseen consequences –Food aid may impair local markets and create perverse incentives • Most times we use opportunities created by roll-out and budget constraints to evaluate, so as to minimize ethical concerns

Thank you • Financial support is gratefully acknowledged
