Big data for evaluating development outcomes

Programme of work

Evaluating complex interventions

Principal investigator(s)

Francis Rathinam, Sayak Khatua, Zeba Siddiqui, Manya Malik, Pallavi Duggal, Xavier Vollenweider, Samantha Watson

Host institution

International Initiative for Impact Evaluation (3ie)

Project type

Evidence synthesis

Country/ies

Multi-country

Research question

This systematic gap map addresses the following questions:

  • How have different types of big data and methods been used for measuring and evaluating development outcomes? 
  • How dispersed or concentrated is the use of big data across Sustainable Development Goals (SDG) and geographies?
  • What are the potential biases, measurement reliability issues, pros and cons, risks and ethical issues in using big data for measuring and evaluating development outcomes?
  • What are some currently unexplored but promising applications of big data for impact evaluations?

Research design

This systematic gap map is based on the 3ie methodology and process for evidence gap maps. To create it, the authors used systematic methods, such as database searches and data extraction, to identify completed and ongoing impact evaluations, systematic reviews and big data measurement studies that evaluate or measure development outcomes. The studies identified were mapped onto a framework of big data sources and development (SDG) outcomes. The final output is a visual display of the volume of evidence for each data source-outcome combination, the type of evidence (impact evaluations, systematic reviews or measurement studies, completed or ongoing), a confidence rating for systematic reviews reflecting study quality, an indication of research gaps by data source, SDG outcome and geography, and whether the studies and data are openly accessible.
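The mapping step can be pictured as tallying studies into cells of a data source by outcome grid and flagging the empty cells as gaps. The following is a minimal sketch of that idea only, not the 3ie tooling; the records, field names and the functions `build_gap_map` and `find_gaps` are hypothetical and made up for illustration.

```python
from collections import Counter

# Hypothetical placeholder records for illustration only;
# these are NOT studies taken from the map.
studies = [
    {"source": "satellite imagery", "outcome": "environmental sustainability"},
    {"source": "satellite imagery", "outcome": "urban development"},
    {"source": "call detail records", "outcome": "economic development and livelihoods"},
    {"source": "satellite imagery", "outcome": "environmental sustainability"},
]


def build_gap_map(records):
    """Count studies in each (data source, outcome) cell."""
    return Counter((r["source"], r["outcome"]) for r in records)


def find_gaps(cell_counts, sources, outcomes):
    """Return the (data source, outcome) cells that contain no studies."""
    return [(s, o) for s in sources for o in outcomes if cell_counts[(s, o)] == 0]


sources = sorted({r["source"] for r in studies})
outcomes = sorted({r["outcome"] for r in studies})

gap_map = build_gap_map(studies)
gaps = find_gaps(gap_map, sources, outcomes)

# Print the volume of evidence per cell, then the empty cells.
for (source, outcome), count in sorted(gap_map.items()):
    print(f"{source} x {outcome}: {count}")
for source, outcome in gaps:
    print(f"gap: {source} x {outcome}")
```

A real gap map would carry further attributes per study, such as evidence type, status and geography, which could be added as extra keys in each record.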

Data source

The authors drew on impact evaluations, systematic reviews and measurement studies found in academic databases, repositories and organisational databases. They looked at studies using different types of big data, which included:

  • Human-sourced information from social networks, crowdsourcing and citizen reporting;
  • Process-mediated sources such as administrative data, call detail records and e-transactions; and
  • Machine-generated data from automated systems, including information from sensors and machines that measure and record events and situations in the physical world. 

Policy relevance

For evaluators, evaluation commissioners and policymakers, the map will highlight what data collection methods are available in difficult contexts, their relative benefits and costs, and the reliability of the data collected.

Main Findings

  • Satellite images and mobile call detail records are the most used big data sources.
  • The development themes studied the most include environmental sustainability, economic development and livelihoods, urban development, health and well-being, and energy, industry and infrastructure provision.
  • Interventions and outcomes that have a spatial dimension are more likely to be measured using big data. Some of the less studied outcomes include agriculture, education and water.
  • Studies are evenly spread across the continents. 
  • While there are a number of studies that have used big data for measuring various development outcomes, there are not many impact evaluations that have used these innovative big data-based outcome measures. 
  • Impact evaluations fare better than measurement studies in reporting on data quality issues and transparency, but fewer than 18 per cent of them have data publicly available.

Implications

  • This systematic map shows how innovative, new data sources are being used in evaluating development outcomes and, more importantly, where there is more potential to use big data in future evaluations.
  • This map shows that big data can contribute to the evidence base in development sectors where evaluations are not generally feasible due to data deficiency.
  • Given the fast-growing availability of big data and improving computation capacity, there is great potential for using big data in future impact evaluations, particularly for measuring impact at higher frequency and with greater granularity.
  • There are several sources of pre-processed satellite data that could be used directly in evaluations, without the evaluators having to process them using complex machine learning models themselves.
  • Big data is a complement to, not a replacement for, traditional forms of data collection. There is still a critical role for locally gathered data to train machine learning algorithms, ground-truthing to validate variables generated using big data, and mixed-method fieldwork to help tell the story of what is happening on the ground.
  • Donors can introduce best practices and ethical standards, and facilitate more interaction among remote sensing scientists, big data analysts and development evaluators.
  • It is important to prioritise meaningful stakeholder engagement, including policymakers, implementers, and clients. There is a need to make sure the advancing technologies of big data capture and processing do not disadvantage local researchers evaluating their own communities.

Publications

Rathinam, F., Khatua, S., Siddiqui, Z., Malik, M., Duggal, P., Watson, S. and Vollenweider, X. 2020. Using big data for evaluating development outcomes: a systematic map. CEDIL Methods Working Paper. Oxford: Centre of Excellence for Development Impact and Learning (CEDIL).

Additional links

Online map citation: Rathinam, F, Khatua, S, Siddiqui, Z, Malik, M, Duggal, P, Watson, S, Vollenweider, X. 2020. Using big data for evaluating development outcomes: a systematic map [Online]. 3ie. Available here.

Other related versions of this map (links to the submaps):

Economic development and livelihoods

Health and well-being

Governance and human rights

Urban development

Environmental sustainability