How can we evaluate complex interventions?

Edoardo Masset | 21st March, 2022

The CEDIL programme recently published a methods working paper on the topic of evaluating complex interventions in international development, along with a methods brief that uses examples to highlight the key considerations involved in choosing appropriate methods for evaluating complex interventions. In this blog post, Edoardo Masset summarises the key points in the papers.

The evaluation of complex interventions was one of the reasons that led to the establishment of the Centre of Excellence for Development Impact and Learning (CEDIL), and CEDIL has commissioned a number of evaluations of complex interventions. But what are complex interventions? In a recent review we identify four types.

First are interventions with long causal chains. This case is familiar to all evaluators of development interventions. Interventions with long causal chains can be spelled out in complex theory-of-change diagrams or results chains. To achieve their goals, they must complete a number of sequential steps, each supported or derailed by contextual factors. For example, the Educate! job-training programme for secondary schools in Uganda promises higher employment and better living conditions. But to meet its targets, the programme needs to persuade schools to participate, materials must be available, and children have to learn the right skills, which in turn must be demanded in the labour market, and so on.

A second type of complex intervention is the multi-component programme. Multi-component programmes include many activities in the hope that their combined effect will be larger than the sum of the parts. For example, the BRAC ultra-poor graduation programme offers a package of cash transfers, training, asset transfers, and financial inclusion with the goal of breaking the poverty trap. Cash transfers, training, asset transfers, or financial inclusion do not eradicate poverty when implemented separately, but their combination does.

Some interventions are much more complex than multi-component and long-causal-chain projects. Portfolio interventions include many projects, implemented across several sectors in a country or across many countries. They state overarching goals, such as empowering women, eradicating malaria, or reducing corruption, but do not say how these goals will be achieved. They involve many stakeholders and change their activities during implementation. For example, Feed the Future is a portfolio intervention to ‘combat global hunger, poverty and malnutrition’, which includes many different interventions to boost agricultural productivity and improve nutrition.

Finally, some interventions aim at changing how entire systems operate. System-level interventions set out to improve the “education system”, the “health system”, or the “market system”. They are too big and comprehensive to be captured in a single theory of change. Their activities are developed during implementation and adapt to emerging characteristics of the environment. For example, the Market Systems Development approach aims at reducing rural poverty by removing multiple constraints and by changing the overall market environment. In this approach, implementers are aware that an intervention in one area of the system will feed back into other areas, producing winners and losers.

Evaluation challenges

The impacts of complex interventions are difficult to evaluate. They are difficult to unpack, and control groups cannot be constructed because the interventions are implemented at scale. They also affect multiple outcomes, which are difficult to measure or can only be observed in the long term. In addition, they change their activities during implementation, thus preventing the design of prospective studies, and they last for many years, going beyond the standard (short) time frame of commissioned studies. These difficulties lead evaluators to focus their efforts on small parts of complex interventions, or not to evaluate them at all.

But methods for the evaluation of complex interventions are available. In our paper, we reviewed the following: pragmatic RCTs, factorial designs, adaptive trials, synthetic controls, qualitative comparative analysis, system dynamics, and agent-based modelling. The list is not exhaustive, and other methods are available, such as process tracing and machine learning. Our choice was partly pragmatic – we could review only so many methods in detail – and partly substantive: we selected methods that require fewer assumptions to causally attribute impacts to interventions. The selected methods will not be able to evaluate all complex interventions in all circumstances, but they cover a fair range of cases.

An important point to make is that randomised controlled trials (RCTs) are not well equipped to evaluate complex interventions. They can only be used for the simplest type of complex intervention: those consisting of long causal chains. In fact, it is now common to evaluate long-causal-chain interventions using a combination of RCTs and qualitative process evaluation to understand why programmes work or do not work.

Adaptive trials and factorial designs

RCTs can evaluate multi-component interventions only at a very high cost, but we discuss two other experimental designs – adaptive trials and factorial designs – which can be used in some cases. Adaptive trials are experiments that change their characteristics, such as sample size or research questions, as the experiment is carried out. Factorial designs are experiments – not to be confused with multi-arm trials – that explore the impact of many different interventions and of their interactions. Both methods are exploratory and aim at identifying the best interventions, or combinations of interventions, rather than assessing effectiveness. They are particularly useful when agencies are uncertain about the many different ways in which an intervention can be delivered and unwilling to test each possible combination in a separate trial. Both are largely unused in the development evaluation literature, possibly because they are oriented towards improving interventions rather than assessing effectiveness, and are therefore less likely to be published in academic journals.
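To make the factorial idea concrete, here is a minimal sketch of how such an experiment might be analysed. The programme components (‘training’ and ‘cash’), the effect sizes, and the data are all invented for illustration and do not come from the paper; the point is that a single regression with an interaction term recovers each component’s effect and their synergy.

```python
# Illustrative sketch (hypothetical data): a 2x2 factorial experiment in which
# two intervention components, "training" and "cash", are cross-randomised.
# The interaction term captures whether the combination does more than the
# sum of the parts.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 2000

# Cross-randomise the two components independently (each 0 or 1).
training = rng.integers(0, 2, n)
cash = rng.integers(0, 2, n)

# Simulated outcome: each component adds a little; the combination adds more.
outcome = (0.2 * training + 0.3 * cash
           + 0.4 * training * cash          # assumed synergy between components
           + rng.normal(0, 1, n))

df = pd.DataFrame({"outcome": outcome, "training": training, "cash": cash})

# OLS with an interaction term estimates both main effects and their synergy.
model = smf.ols("outcome ~ training * cash", data=df).fit()
print(model.summary().tables[1])
```

With k components, the same logic extends to a 2^k design, which is exactly where factorial experiments economise on sample size relative to running a separate trial per combination.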

We recommend using qualitative comparative analysis (QCA) and synthetic control methods when evaluating portfolio interventions. QCA works well with few cases and in exploring interactions between projects and their environment. The method requires stronger assumptions than those made in observational studies in order to attribute causality, but it can explore the factors associated with the success of portfolio interventions carried out in many countries at the same time. Synthetic control methods have proved valuable in many applications but have rarely been used in the evaluation of development interventions. Synthetic controls can be used when a portfolio intervention is carried out in a country with the goal of changing a particular outcome in a significant way.
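As an illustration of the synthetic control logic, the sketch below builds a counterfactual for a treated country from a weighted average of untreated ‘donor’ countries, with weights constrained to be non-negative and to sum to one, chosen to match the treated country’s pre-intervention trajectory. All series, weights, and the ‘intervention effect’ are simulated for the example.

```python
# Illustrative sketch (simulated data): a minimal synthetic control.
# Donor weights lie on the simplex and are fitted to reproduce the treated
# country's pre-intervention outcome path; the weighted donors then serve
# as the counterfactual in the post-intervention period.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
T_pre, T_post, n_donors = 10, 5, 8

# Simulated outcome paths: donors follow noisy upward trends; the treated
# unit tracks a mix of three donors, then jumps after the intervention.
donors = np.cumsum(rng.normal(0.5, 0.3, (T_pre + T_post, n_donors)), axis=0)
true_w = np.array([0.5, 0.3, 0.2] + [0.0] * (n_donors - 3))
treated = donors @ true_w + rng.normal(0, 0.1, T_pre + T_post)
treated[T_pre:] += 2.0  # hypothetical intervention effect

def pre_period_loss(w):
    """Squared mismatch with the treated unit before the intervention."""
    return np.sum((treated[:T_pre] - donors[:T_pre] @ w) ** 2)

# Constrain the weights to the simplex: non-negative, summing to one.
res = minimize(pre_period_loss, np.ones(n_donors) / n_donors,
               bounds=[(0, 1)] * n_donors,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1})

synthetic = donors @ res.x
effect = treated[T_pre:] - synthetic[T_pre:]
print("Estimated post-intervention effects:", effect.round(2))
```

In a real application the weights would also be fitted on pre-intervention covariates, and inference would rely on placebo runs across donor countries rather than a single point estimate.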

System dynamics and agent-based modelling

Finally, we include in our review two modelling methods: system dynamics and agent-based modelling. Both were originally designed to understand complex phenomena that emerge from the interactions of multiple factors, such as tipping points and non-linear outcomes. They are most likely to help the evaluation of system-level interventions that affect multiple components of a social system. They have so far been used by researchers to understand complex interventions, but rarely in their evaluation. In order to produce convincing results, they should be carried out alongside counterfactual studies and include robustness and sensitivity analyses.
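To show the kind of non-linear behaviour these models capture, here is a toy agent-based sketch of adoption with social thresholds. Every parameter (population size, threshold range, number of seeded adopters) is invented for illustration; the point is that a modest change in the intervention can tip the whole system.

```python
# Illustrative sketch (toy model): agent-based adoption with social thresholds.
# Each agent adopts once the share of adopters in the population exceeds its
# personal threshold, so a small increase in initial seeding can trigger a
# population-wide cascade -- a tipping point.
import numpy as np

def run_adoption_model(n_agents=500, n_seeds=25, steps=50, seed=1):
    rng = np.random.default_rng(seed)
    # Personal adoption thresholds, drawn between 5% and 60% of the population.
    thresholds = rng.uniform(0.05, 0.6, n_agents)
    adopted = np.zeros(n_agents, dtype=bool)
    # The hypothetical intervention seeds a few initial adopters at random.
    adopted[rng.choice(n_agents, n_seeds, replace=False)] = True
    for _ in range(steps):
        share = adopted.mean()
        adopted |= thresholds < share  # agents whose threshold is crossed adopt
    return adopted.mean()

# A modest increase in seeding produces a non-linear jump in final adoption.
for seeds in (10, 25, 40):
    print(f"seeds={seeds:3d} -> final adoption share: "
          f"{run_adoption_model(n_seeds=seeds):.2f}")
```

Running the sketch, 10 or 25 seeds leave adoption stuck near the seeded share, while 40 seeds push the system past the tipping point to near-universal adoption, which is precisely the sort of behaviour that linear before-and-after comparisons miss.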

None of these methods is meant to be used in isolation; they should preferably be used in combination with other methods. No single method can assess all the aspects of a large complex intervention, but any one of them can be helpful in some circumstances. The important point is that rigorous methods are available for the evaluation of complex interventions, and that we should try whatever method is available before relying on desk reviews, key informant interviews, and secondary data analyses, as is often the case. Complex interventions demand large resources and potentially produce far-reaching results. We need to avoid the ‘complexification’ of interventions, by which we mean the claim by programme managers and researchers that an intervention is too complex to be evaluated. Our paper is a step in this direction.

Edoardo Masset is Deputy Research Director for CEDIL

Image credit: WorldFish
