Big data for evaluating development outcomes
Programme of work
Evaluating complex interventions
Francis Rathinam, Sayak Khatua, Zeba Siddiqui, Manya Malik, Pallavi Duggal, Xavier Vollenweider, Samantha Watson
International Initiative for Impact Evaluation (3ie)
This systematic gap map addresses the following questions:
- How have different types of big data and methods been used for measuring and evaluating development outcomes?
- How dispersed or concentrated is the use of big data across Sustainable Development Goals (SDG) and geographies?
- What are the potential biases, measurement reliability issues, pros and cons, risks and ethical issues in using big data for measuring and evaluating development outcomes?
- What are some of the currently unexplored but promising applications of big data for impact evaluations?
This systematic gap map is based on the 3ie methodology and process for evidence gap maps. To create it, the authors used systematic methods, such as database searches and data extraction, to identify completed and ongoing impact evaluations, systematic reviews and big data measurement studies that evaluate or measure development outcomes. The studies identified were mapped onto a framework of big data sources and development (SDG) outcomes. The final output is a visual display of the volume of evidence for each data source-outcome combination; the type of evidence (impact evaluation, systematic review or measurement study, completed or ongoing); a confidence rating for systematic reviews reflecting study quality; an indication of research gaps at the data source, SDG outcome and geography levels; and whether the studies and data are openly accessible.
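The mapping step described above can be illustrated with a minimal sketch. It assumes each study has already been coded with a data source, an outcome theme and an evidence type; the records, function names and category labels below are hypothetical placeholders, not data or tooling from the map itself.

```python
from collections import Counter

# Hypothetical study records: (data source, outcome theme, evidence type).
# These rows are illustrative placeholders, not data from the map.
studies = [
    ("satellite imagery", "environmental sustainability", "measurement study"),
    ("satellite imagery", "urban development", "impact evaluation"),
    ("call detail records", "economic development", "measurement study"),
    ("satellite imagery", "environmental sustainability", "impact evaluation"),
]


def build_gap_map(records):
    """Count the studies that fall in each (data source, outcome) cell."""
    return Counter((source, outcome) for source, outcome, _ in records)


def find_gaps(cell_counts, sources, outcomes):
    """Return the (data source, outcome) cells that contain no studies."""
    return [(s, o) for s in sources for o in outcomes if cell_counts[(s, o)] == 0]


gap_map = build_gap_map(studies)
sources = sorted({s for s, _, _ in studies})
outcomes = sorted({o for _, o, _ in studies})
gaps = find_gaps(gap_map, sources, outcomes)

# Print the grid: one row per data source, one count per outcome theme.
for s in sources:
    print(s, [gap_map[(s, o)] for o in outcomes])
```

Cells with a count of zero correspond to the research gaps the visual display is meant to expose; a real map would also carry the evidence type, confidence rating and accessibility fields for each study.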
The authors drew on impact evaluations, systematic reviews and measurement studies found in academic databases, repositories and organisational databases. They looked at studies using different types of big data, which included:
- Human-sourced information from social networks, crowdsourcing and citizen reporting;
- Process-mediated sources such as administrative data, call detail records and e-transactions; and
- Machine-generated data from automated systems, including information from sensors and machines that measure and record events and situations in the physical world.
For evaluators, evaluation commissioners and policymakers, the map will highlight what data collection methods are available in difficult contexts, their relative benefits and costs, and the reliability of the data collected.
- Satellite images and mobile call detail records are the most used big data sources.
- The development themes studied the most include environmental sustainability, economic development and livelihoods, urban development, health and well-being, and energy, industry and infrastructure provision.
- Interventions and outcomes that have a spatial dimension are more likely to be measured using big data. Some of the lesser-studied outcomes include agriculture, education and water.
- Studies are evenly spread across the continents.
- While there are a number of studies that have used big data for measuring various development outcomes, there are not many impact evaluations that have used these innovative big data-based outcome measures.
- Impact evaluations fare better than measurement studies in reporting on data quality issues and transparency, but less than 18 per cent of them have data publicly available.
- This systematic map shows how innovative new data sources are being used in evaluating development outcomes and, more importantly, where there is more potential to use big data in future evaluations.
- This map shows that big data can contribute to the evidence base in development sectors where evaluations are not generally feasible due to data deficiency.
- Given the fast-growing availability of big data and improving computation capacity, there is great potential for using big data in future impact evaluations, particularly for measuring impact at higher frequency and with greater granularity.
- There are several sources of pre-processed satellite data that evaluators could use directly, without having to process the data themselves using complex machine learning models.
- Big data is a complement and not a replacement for traditional forms of data collection. There is still a critical role for locally gathered data to train machine learning algorithms, ground-truthing to validate variables generated using big data, and mixed-method fieldwork to help tell the story of what is happening on the ground.
- Donors can contribute by introducing best practices and ethical standards, and by facilitating more interaction among remote sensing scientists, big data analysts and development evaluators.
- It is important to prioritise meaningful engagement with stakeholders, including policymakers, implementers and clients. The advancing technologies of big data capture and processing must not disadvantage local researchers evaluating their own communities.
Rathinam, F., Khatua, S., Siddiqui, Z., Malik, M., Duggal, P., Watson, S., and Vollenweider, X. 2020. Using big data for evaluating development outcomes: a systematic map. CEDIL Methods Working Paper. Oxford: Centre of Excellence for Development Impact and Learning (CEDIL).
Online map citation: Rathinam, F, Khatua, S, Siddiqui, Z, Malik, M, Duggal, P, Watson, S, Vollenweider, X. 2020. Using big data for evaluating development outcomes: a systematic map [Online]. 3ie. Available here.