Позитивные изменения. Том 2, № 2 (2022). Positive changes. Volume 2, Issue 2 (2022)

Year written: 2023
The treatment and comparison groups should be the same in at least three respects:

1. The baseline properties of the groups should be identical. For example, the mean age of the treatment group should be the same as that of the comparison one.

2. The program factor should not affect the comparison group directly or indirectly.

3. The results in the comparison group should change in the same way as the results in the treatment group if both groups were (or were not) enrolled in the program. That is, groups should respond to the program in the same way. For example, if the income in the treatment group increased by RUB 5,000 due to the training program, then the income in the comparison group would also increase by RUB 5,000 if they received the training.

When the above three conditions are met, then only the existence of the program under study will account for any differences in the outcome (Y) between the two groups.

Instead of considering the impact solely for one person, it is more realistic to consider the average impact for a group of people (Figure 1).
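The average impact described above can be sketched as a simple difference in group means. A minimal Python sketch, using hypothetical income figures patterned on the RUB 5,000 training example:

```python
# Sketch: average program impact as the difference in mean outcomes
# between the treatment and comparison groups. All figures are hypothetical.

def average_impact(treatment_outcomes, comparison_outcomes):
    """Mean outcome under treatment minus mean outcome without it."""
    mean_t = sum(treatment_outcomes) / len(treatment_outcomes)
    mean_c = sum(comparison_outcomes) / len(comparison_outcomes)
    return mean_t - mean_c

# Hypothetical post-program incomes (RUB):
treated = [25_000, 27_000, 26_000, 30_000]
comparison = [21_000, 22_000, 20_000, 25_000]

print(average_impact(treated, comparison))  # 5000.0
```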

It is important to consider what happens if we decide to proceed with the evaluation without finding a comparison group. We may run the risk of making inaccurate judgments about program outcomes, in particular with regard to counterfactual evaluations.

Such a risk exists when using the following approaches:

• Before-and-after comparisons (also known as reflexive comparisons): comparing the outcomes of the same group prior to and subsequent to the introduction of a program.

• With-and-without comparisons: comparing the outcomes in the group that chose to enroll with the results of the group that chose not to enroll.

A before-and-after comparison attempts to establish the impact of the program by tracking changes in outcomes for program participants over time. In essence, this comparison assumes that if the program had never existed, the outcome (Y) for program participants would have been exactly the same as their situation before the program. Unfortunately, in the vast majority of cases that assumption simply does not hold.

Consider, for example, the evaluation of a microfinance program for rural farmers. The program provides farmers with microloans to help them buy fertilizer to increase rice production. You observe that in the year before the start of the program, farmers harvested an average of 1,000 kilograms (kg) of rice per hectare (Point B in Figure 2).

The microfinance scheme is launched, and a year later rice yields have increased to 1,100 kg per hectare (Point A in Figure 2). If you try to evaluate impact using a before-and-after comparison, you have to use the pre-intervention outcome as a counterfactual. Applying the basic impact evaluation formula, you would conclude that the scheme had increased rice yields by 100 kg per hectare (A-B).

However, imagine that rainfall was normal during the year before the scheme was launched, but a drought occurred in the year the program started. Because of the drought, the average yield without the microloan scheme would have been lower than B: say, at level D. In that case, the true impact of the program would have been A-D, which is larger than the 100 kg suggested by the before-and-after comparison.

Rainfall is just one of many external factors that could have influenced the outcome of interest (rice yield) over time. Similarly, many of the outcomes that development programs aim to improve, such as income, productivity, health, or education, are affected by multiple factors over time. For this reason, the pre-intervention outcome is almost never a good estimate of the counterfactual.
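The rice-yield example can be worked through numerically. Points A and B come from the text; the drought counterfactual D has no stated value, so the figure below is a hypothetical assumption chosen only for illustration:

```python
# Sketch of the before-and-after bias in the rice-yield example.
# A and B are from the text; D is a hypothetical assumed value.

A = 1_100  # yield (kg/ha) one year after the microloan scheme started
B = 1_000  # yield (kg/ha) the year before the scheme started
D = 950    # hypothetical yield had the scheme not existed (drought year)

naive_impact = A - B  # before-and-after comparison
true_impact = A - D   # impact against the true counterfactual

print(naive_impact, true_impact)  # 100 150
```

The naive estimate understates the true impact whenever an external shock (here, the drought) pushes the counterfactual below the pre-intervention level.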

Comparing those who chose to enroll with those who chose not to enroll ("with-and-without") is another risky approach to impact evaluation. A comparison group that selected itself out of the program provides another "counterfeit" counterfactual estimate. Self-selection occurs when participation in the program depends on the preferences or decisions of each prospective participant, and those preferences are themselves a factor on which outcomes may depend. Under such conditions, those who enrolled cannot be considered comparable to those who did not.

In their attempts to interpret the results mathematically, the HISP pilot evaluation consultants made both of these mistakes in estimating the counterfactual. Recognizing the risk of bias, the program organizers decided to look for methods that would yield a more accurate evaluation.

RANDOMIZED ASSIGNMENT METHOD

This method is similar to running a lottery that decides who is enrolled in the program at a given time and who is not. The method is also known as randomized controlled trials (RCTs). Not only does it give the project team fair and transparent rules for assigning limited resources to equally eligible population clusters, but it also provides a reliable method for evaluating program impact.

"Randomness" applies to a large population cluster having a homogeneous set of qualities. In order to decide who will be given access to the program and who will not, we can also generate a basis for a reliable counterfactual evaluation.

In a randomized allocation, each eligible unit (e.g., individual, household, business, school, hospital, or community) has the same probability of being selected for the program. When there is excess demand for the program, randomized assignment is considered transparent and fair for all participants in the process.
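A randomized allocation of this kind can be sketched in a few lines. The unit names and group sizes below are hypothetical:

```python
import random

# Minimal sketch of randomized assignment: every eligible unit has the
# same probability of being offered the program. Unit IDs are hypothetical.

def randomize(units, n_treatment, seed=None):
    """Randomly split units into treatment and comparison groups."""
    rng = random.Random(seed)
    shuffled = units[:]
    rng.shuffle(shuffled)  # every ordering is equally likely
    return shuffled[:n_treatment], shuffled[n_treatment:]

eligible = [f"household_{i}" for i in range(1, 201)]
treatment, comparison = randomize(eligible, n_treatment=100, seed=42)
print(len(treatment), len(comparison))  # 100 100
```

A fixed seed is used here only so the split is reproducible; in a real evaluation the draw itself is the point.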

Insert 1 provides examples of the use of randomized assignment in practice.

Insert 1: RANDOMIZED CONTROLLED TRIALS AS A VALUABLE OPERATIONAL TOOL

Randomized assignment can be a useful rule for assigning program benefits, even outside the context of an impact evaluation. The following two cases from Africa illustrate how.

In Côte d'Ivoire, following a period of crisis, the government introduced a temporary employment program that was initially targeted at former combatants and later expanded to youth more generally. The program provided youth with short-term employment opportunities, mostly to clean or rehabilitate roads through the National Roads Agency. Young people in participating municipalities were invited to register. Given the attractiveness of the benefits, many more candidates applied than there were places available. In order to come up with a transparent and fair way of allocating the benefits among applicants, program implementers put in place a public lottery process. Once registration had closed and the number of applicants (say, N) in a location was known, a public lottery was organized. All candidates were invited to a public location, and small pieces of paper with numbers from 1 to N were put in a box. Applicants were then called one by one to draw a number from the box in front of all other candidates. Once a number was drawn, it was read aloud. After all applicants were called, someone would check the remaining numbers in the box one by one to ensure they matched the applicants who did not turn up for the draw. For a given number of available spots, the applicants who had drawn the lowest numbers were selected for the program. The draw was organized separately for men and women. The public lottery process was well accepted by participants, and it helped give the program an image of fairness and transparency in a post-conflict environment marked by social tensions. After several years of operations, researchers used this allocation rule, already integrated in the program, to conduct an impact evaluation.
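The draw described above (N applicants, distinct slips numbered 1 to N, lowest numbers win) can be sketched as follows; the applicant names and number of spots are hypothetical:

```python
import random

# Sketch of a public lottery of the kind described in the Côte d'Ivoire
# case: each applicant draws a distinct slip numbered 1..N, and the
# applicants holding the lowest numbers win a program spot.

def public_lottery(applicants, n_spots, seed=None):
    rng = random.Random(seed)
    numbers = list(range(1, len(applicants) + 1))
    rng.shuffle(numbers)                    # the box of paper slips
    draws = dict(zip(applicants, numbers))  # each applicant draws one slip
    ranked = sorted(applicants, key=draws.get)
    return ranked[:n_spots]

applicants = [f"applicant_{i}" for i in range(1, 51)]
selected = public_lottery(applicants, n_spots=20, seed=7)
print(len(selected))  # 20
```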

In Niger, the government started to roll out a national safety net project in 2011 with support from the World Bank. Niger is one of the poorest countries in the world, and the population of poor households eligible for the program greatly exceeded the available benefits during the first years of operation. Program implementers relied on geographical targeting to identify the departments and communes where the cash transfer program would be implemented first. This was feasible, as data was available to identify the relative poverty or vulnerability status of the various departments and communes. However, within communes, only a limited number of people could be enrolled in the program based on objective criteria. For the first phase of the project, program implementers decided to use public lotteries to select beneficiary villages within targeted communes. This decision was made in part because the data available to prioritize villages objectively was limited, and in part because an impact evaluation was being embedded in the project. For the public lotteries, all the village chiefs were invited to the municipal center, and the names of their villages were written on pieces of paper and put in a box. A child would then randomly draw beneficiary villages from the box until the quotas were filled. The procedure was undertaken separately for sedentary and nomadic villages to ensure representation of each group. After villages were selected, a separate household-level targeting mechanism was implemented to identify the poorest households, which were later enrolled as beneficiaries. The transparency and fairness of the public lottery was greatly appreciated by the village authorities, as well as by program implementers – so much so that the public lottery process continued to be used in the second and third cycles of the project to select over 1,000 villages throughout the country. Even though a public lottery was no longer necessary for an impact evaluation at that point, its value as a transparent, fair, and widely accepted operational tool for allocating benefits among equally deserving populations justified its continued use in the eyes of program implementers and local authorities.

    Sources: Bertrand, Marianne, Bruno Crépon, Alicia Marguerie, and Patrick Premand. Impacts à Court et Moyen Terme sur les Jeunes des Travaux à Haute Intensité de Main d'oeuvre (THIMO): Résultats de l'évaluation d'impact de la composante THIMO du Projet Emploi Jeunes et Développement des compétences (PEJEDEC) en Côte d'Ivoire. Washington, DC: Banque Mondiale et Abidjan, BCPEmploi. 2016.
    Premand, Patrick, Oumar Barry, and Marc Smitz. "Transferts monétaires, valeur ajoutée de mesures d'accompagnement comportemental, et développement de la petite enfance au Niger. Rapport descriptif de l'évaluation d'impact à court terme du Projet Filets Sociaux." Washington, DC: Banque Mondiale. 2016.

WHY DOES THE RANDOMIZED ASSIGNMENT METHOD WORK WELL?

As discussed, the ideal comparison group should be as similar as possible to the treatment group in all respects, except with respect to its participation in the program that is being evaluated. When we randomly assign units to treatment and comparison groups, that randomized assignment process in itself will produce two groups with a high probability of being statistically identical – as long as the number of potential units to which we apply the randomized assignment process is sufficiently large.

Figure 3 illustrates why randomized assignment produces a comparison group that is statistically equivalent to the treatment group.

To estimate the impact of a program using randomized assignment, we simply take the difference between the outcome under treatment (the mean outcome of the randomly assigned treatment group) and our estimate of the counterfactual (the mean outcome of the randomly assigned comparison group). We can be confident that our estimated impact constitutes the true impact of the program, since we have eliminated all observed and unobserved factors that might otherwise plausibly explain the difference in outcomes.
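The claim that randomization balances baseline characteristics can be illustrated with simulated data; the income distribution and sample sizes below are hypothetical:

```python
import random

# Sketch: with enough units, randomized assignment yields treatment and
# comparison groups with nearly identical baseline characteristics, so
# the simple difference in mean outcomes estimates the program impact.
# All data below are simulated for illustration.

rng = random.Random(0)
baseline_income = [rng.gauss(20_000, 3_000) for _ in range(10_000)]

indices = list(range(10_000))
rng.shuffle(indices)
treat_idx, comp_idx = indices[:5_000], indices[5_000:]

mean_t = sum(baseline_income[i] for i in treat_idx) / 5_000
mean_c = sum(baseline_income[i] for i in comp_idx) / 5_000

# With 5,000 units per group, the baseline means differ by far less
# than the spread of individual incomes.
print(abs(mean_t - mean_c) < 500)  # True
```

The standard error of the difference in means shrinks with sample size, which is why the text stresses that the number of units must be sufficiently large.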

Inserts 2 and 3 give real-world applications of randomized assignment to evaluate the impact of a number of different interventions around the world.

Insert 2: RANDOMIZED ASSIGNMENT AS A PROGRAM ALLOCATION RULE: CONDITIONAL CASH TRANSFERS AND EDUCATION IN MEXICO

The Progresa program, now called "Prospera," provides cash transfers to poor mothers in rural Mexico conditional on their children's enrollment in school and regular health checkups. The cash transfers, for children in grades 3 through 9, amount to about 50 percent to 75 percent of the private cost of schooling and are guaranteed for three years. The communities and households eligible for the program were determined based on a poverty index created from census data and baseline data collection. Because of a need to phase in the large-scale social program, about two-thirds of the localities (314 out of 495) were randomly selected to receive the program in the first two years, and the remaining 181 served as a comparison group before entering the program in the third year. Based on the randomized assignment, Schultz (2004) found an average increase in enrollment of 3.4 percent for all students in grades 1–8, with the largest increase among girls who had completed grade 6, at 14.8 percent. The likely reason is that girls tend to drop out of school at greater rates as they get older, so they were given a slightly larger transfer to stay in school past the primary grade levels. These short-term impacts were then extrapolated to predict the longer-term impact of the Progresa program on lifetime schooling and earnings.

    Source: Schultz, Paul. "School Subsidies for the Poor: Evaluating the Mexican Progresa Poverty Program." Journal of Development Economics 74 (1): 199–250. 2004

Insert 3: RANDOMIZED ASSIGNMENT OF SPRING WATER PROTECTION TO IMPROVE HEALTH IN KENYA

The link between water quality and health impacts in developing countries has been well documented. However, the health value of improving infrastructure around water sources is less evident.

Kremer et al. (2011) measured the effects of a program that provided spring protection technology to improve water quality in Kenya, with random assignment of springs to receive the treatment.

Approximately 43 percent of households in rural Western Kenya obtain drinking water from natural springs. Spring protection technology seals off the source of a water spring to reduce contamination.

Starting in 2005, the NGO International Child Support (ICS) implemented a spring protection program in two districts in western Kenya. Because of financial and administrative constraints, ICS decided to phase in the program over four years. This allowed evaluators to use springs that had not received the treatment yet as the comparison group.

Of the 200 eligible springs, 100 were randomly selected to receive the treatment in the first two years. The study found that spring protection reduced fecal water contamination by 66 percent and child diarrhea among users of the springs by 25 percent.

    Source: Kremer, Michael, Jessica Leino, Edward Miguel, and Alix Peterson Zwane. "Spring Cleaning: Rural Water Impacts, Valuation, and Property Rights Institutions." Quarterly Journal of Economics 126: 145–205. 2011

WHEN CAN RANDOMIZED ASSIGNMENT BE USED?

Randomized assignment can be used in one of two scenarios:

1. When the eligible population is greater than the number of program spaces available. When the demand for a program exceeds the supply, a lottery can be used to select the treatment group within the eligible population. The group that wins the "lottery" is the treatment group, and the rest of the population that is not offered the program becomes the comparison group. As long as a constraint exists that prevents scaling the program up to the entire population, the comparison groups can be maintained to measure the short-term, intermediate, and long-term impacts of the program.

2. When a program needs to be gradually phased in until it covers the entire eligible population. When a program is phased in, randomization of the order in which participants receive the program gives each eligible unit the same chance of receiving treatment in the first phase or in a later phase of the program. Until the last group joins the program, it serves as a valid comparison group from which the counterfactual for the groups that have already been phased in can be estimated. This setup also allows evaluating the effects of differential exposure to treatment: that is, the effect of receiving the program for a longer or shorter time.
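A phased rollout of this kind can be sketched as a random assignment of units to phases; the village names and number of phases below are hypothetical:

```python
import random

# Sketch of a randomized phase-in: each eligible unit is randomly
# assigned to one of several rollout phases. Until the last phase
# starts, later-phase units serve as the comparison group.

def assign_phases(units, n_phases, seed=None):
    rng = random.Random(seed)
    shuffled = units[:]
    rng.shuffle(shuffled)  # every unit has the same chance of any phase
    size = len(shuffled) // n_phases
    return [shuffled[i * size:(i + 1) * size] for i in range(n_phases)]

villages = [f"village_{i}" for i in range(1, 91)]
phases = assign_phases(villages, n_phases=3, seed=1)
print([len(p) for p in phases])  # [30, 30, 30]
```

Comparing phase 1 with phase 3 before phase 3 is enrolled estimates the program impact; comparing phase 1 with phase 2 at that point estimates the effect of longer exposure.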

STEPS IN RANDOMIZED ASSIGNMENT

Step 1 – Define the units eligible for the program. Remember that depending on the particular program, a unit could be a person, a health center, a school, a business, or even an entire village or municipality.

Step 2 – Select a sample of units from the population to be included in the evaluation sample.

This second step is done mainly to limit data collection costs. If it is found that data from existing monitoring systems can be used for the evaluation, and that those systems cover the full population of eligible units, then a separate evaluation sample may not be needed.
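Drawing an evaluation sample from the eligible population (Step 2) can be sketched with simple random sampling; the unit names and sizes below are hypothetical:

```python
import random

# Sketch of Step 2: drawing a random evaluation sample from the full
# population of eligible units to limit data collection costs.

rng = random.Random(3)
eligible_units = [f"school_{i}" for i in range(1, 1_001)]
evaluation_sample = rng.sample(eligible_units, k=200)  # without replacement
print(len(evaluation_sample))  # 200
```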