By Edoardo Masset | 3 June 2020
CEDIL recently awarded 25 grants to research teams to conduct a series of innovative evaluation studies. The awards were made before the pandemic outbreak and the study designs were presented at a workshop in Oxford on 26-27 February 2020, just as the crisis began to unfold in Europe. As the pandemic became global, we became aware that many of the CEDIL-supported research teams would have to change their plans to adapt to the crisis. Some research teams promptly modified their studies to collect additional information to improve our understanding of COVID-19 and the public reaction to it. For example, a study evaluating a youth skills training programme in Uganda added phone interviews to the survey design in order to monitor the impact of COVID-19 and young people's resilience and coping strategies. In another example, a study of Syrian business development and regional trade will include phone interviews to track the impact of COVID-19 on business activities and to understand how Syrian refugees are sharing information related to the pandemic.
To better understand the impact of the pandemic on impact evaluations, we also asked the CEDIL research teams to report the problems they were facing because of the COVID-19 emergency and their contingency plans. This is a summary of the issues identified, with a few reflections on their implications for the CEDIL programme and for impact evaluations more generally.
The growth of impact evaluations will slow down over the next year
COVID-19 is producing a cascade of delays at various stages of research and evaluation, which will result in a reduction in the number of impact evaluations in the next year. Over the last 10 years we have become accustomed to an exponential growth in impact evaluations, but this growth is now likely to slow down. Of the 25 CEDIL-funded studies, only one, based on secondary data, is not anticipating any delay. Researchers are expecting long delays for studies involving data collection in the field. Typically, the reasons for the delays include: the failure to conduct inception workshops and stakeholder engagement meetings; the inability to carry out data collection and qualitative interviews; and various travel restrictions within the country and international travel bans. Perhaps less obviously, the pandemic is causing delays even for desk-based research, such as systematic reviews, because of difficulties in remote working in countries where internet connections are poor, and because of personal circumstances mostly related to childcare responsibilities and home schooling.
It is not just the quantity but also the quality of impact evaluations that will be affected
Research teams are tackling the challenge of conducting fieldwork while respecting social distancing by replacing in-person interviews with computer-assisted telephone interviews (CATI), which will result in evaluations of poorer quality. Studies that are already under way or that are at the planning stage seem to have no alternative but to resort to phone interviews. This is the most common strategy outlined by the CEDIL-funded teams, and it is also the method universally adopted by studies that have been recently launched to study the pandemic and its repercussions – see for example the list of studies compiled by IPA at the RECOVR research portal.
There are several issues with conducting phone interviews, and many of them may have workable solutions. But I would like to highlight two issues in particular that may affect the quality of impact evaluations. First, phone interviews tend to be short, as it is difficult to hold people on the line for longer than 30 minutes. This greatly restricts the type and number of outcome indicators for which data can be collected. Complex indicators such as income and expenditure cannot be measured. There is also no way to monitor the work of enumerators, who will carry out their interviews while in isolation. Data collected in this way are likely to be noisy. Second, not everybody has a phone, not everybody can be reached via a phone, and many people who are reached will refuse to participate in the surveys. It is often reported that the response rate in telephone surveys is below 50%, which creates a serious risk of selection bias: the respondents are likely to be different from the average population and are also likely to respond differently to treatments. In statistical terms, the two problems above have serious implications for data analysis. Noisy data will make it more difficult to find an effect when there is one, while selection bias makes it more likely that we will find the 'wrong' effect. The results of evaluations conducted in the next year or so are therefore likely to be less reliable.
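A toy simulation makes the two problems concrete (all numbers here are hypothetical, chosen only for illustration): noisier measurement spreads the effect estimates out, while a selection effect among phone-reachable respondents shifts them away from the true value.

```python
import random
import statistics

random.seed(42)

def simulate_trial(n, true_effect, noise_sd, reach_bias=0.0):
    """Simulate one two-arm trial and return the estimated effect.
    `reach_bias` is a hypothetical shift in treatment response among
    the subpopulation reachable by phone (the selection problem)."""
    treated = [true_effect + reach_bias + random.gauss(0, noise_sd) for _ in range(n)]
    control = [random.gauss(0, noise_sd) for _ in range(n)]
    return statistics.mean(treated) - statistics.mean(control)

# Problem 1: noisier data -> more variable estimates, so real effects
# are harder to detect (lower statistical power).
low_noise = [simulate_trial(200, 1.0, 1.0) for _ in range(500)]
high_noise = [simulate_trial(200, 1.0, 3.0) for _ in range(500)]
print(statistics.stdev(low_noise), statistics.stdev(high_noise))

# Problem 2: selection bias -> estimates centred on the wrong value,
# no matter how many replications we run.
biased = [simulate_trial(200, 1.0, 1.0, reach_bias=0.5) for _ in range(500)]
print(statistics.mean(biased))  # centred near 1.5, not the true effect of 1.0
```

The point of the sketch is that more data cannot fix the second problem: averaging over replications shrinks the noise but leaves the selection bias intact.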
We will see more quasi-experimental studies being produced
Experimental designs will be most affected by COVID-19, which will result in greater use of quasi-experimental designs. To some, this might seem like lowering the quality of the evidence produced, but this is not necessarily the case. I am not talking here of experiments designed to test interventions that aim to limit transmission rates, such as social distancing, or randomised controlled trials (RCTs) designed to test policies, such as cash transfers, that aim to mitigate the impact of COVID-19 on the population. I am thinking of the typical RCT that has been carried out in recent years to estimate the impact of a particular intervention in education, health, social protection or other areas. Many of these RCTs are running at the moment, while others are being planned. Some of these experiments will become quasi-experimental designs and others will be discontinued. The reason is that COVID-19 is having a huge impact on all measurements of living standards. In some cases, the target outcomes of these RCTs, for example school attendance or employment, will not be measurable because schools and businesses are closed. But even if schooling and economic activity continue, they will be severely affected by the pandemic, in such a way that the RCT will assess the impact of interventions under very special and totally uncontrolled circumstances. The applicability of the results obtained by these RCTs to other contexts will be limited.
The policies that governments implement to mitigate the impact of the pandemic will make things worse for RCTs. The observed welfare impacts will now be the result of complex interactions between interventions, COVID-19 and policy responses. Disentangling these effects will be difficult and will require abandoning experimental designs in many cases.
The way forward
I want to conclude with a positive note on what researchers and evaluators could do in these difficult times. Impact evaluations will be more difficult to conduct over the next year, but researchers can do useful research in many ways other than impact evaluations. Not all useful research needs to use rigorous methods of causal inference. For example, COVID-19 is disproportionately affecting some communities and groups, and there is a need to explain how and why this is happening. Also, much forecasting and modelling is going on, but the parameters used by the forecasting models are often little more than guesses, and there is room for studies that measure these parameters.
Telephone interviews are an imperfect substitute for in-person contact. But there is room for introducing and developing new methods of data collection that do not rely on direct human contact. Obvious examples are satellite images or other images captured by cameras or video. CEDIL is supporting a study by 3ie that is mapping all currently available technologies for collecting data in new ways. This study will provide guidance to researchers on new data collection methods beyond in-person and phone interviews.
Finally, a break in the wave of experimental evaluations will push researchers to make greater use of quasi-experimental methods. Here, a few avenues seem particularly promising. First, there is room for conducting time series analysis (of the 'interrupted time series' type) and for exploiting discontinuities produced by COVID-19 and related interventions. Second, researchers can exploit existing datasets collected before the pandemic at national or local level, and use these to organise follow-up studies to evaluate policies or the impact of COVID-19 in the long term. Third, as the transmission mechanisms of COVID-19 become better understood, we might be able to explain why some areas and people are more affected than others, and then use matching methods to build comparable groups. Finally, researchers will likely be invited to explore the heterogeneity of the impact of COVID-19 and of mitigating policies. Computational methods such as those used in machine learning will then find valuable use in development evaluation.
Edoardo Masset is the deputy director of CEDIL’s research directorate.
To watch the recording of the CEDIL webinar on Impact evaluation in the time of a pandemic, click here.
Photo credit: Henitsoa Rafalia, World Bank