A new paper published by Making All Voices Count sheds light on the existing evidence about the use of Randomised Controlled Trials (RCTs) in evaluating social accountability tools in service delivery.
Where and under what conditions might RCTs be the most appropriate approach? What other evaluation approaches might be more effective or more robust, given the particular characteristics of T&A programmes? The paper’s author, Dr Jennifer Leavy, explains:
To date, there have been relatively few impact evaluations (IEs) of Transparency and Accountability (T&A) programmes, despite the amount of donor funding and attention given to the field. Evaluations tend either to be concentrated in certain sectors and countries, making generalisation difficult, or are themselves at very early stages, so it is too soon for lessons to be drawn. For technology-based initiatives, the pool of evaluations is even smaller.
While RCTs are considered by some to be the most robust way of measuring impact, others question their validity. In the medical sector and natural sciences, RCTs are regarded as the ‘gold standard’ for establishing a plausible, unbiased counterfactual against which to measure the impact of an intervention. However, critics in other fields argue that experimental methods such as RCTs are not appropriate in all circumstances.
Making All Voices Count has been trying to understand the potential usefulness of RCT approaches within the programme and similar interventions. To find out more about the applicability and usefulness of RCTs in T&A evaluation, we reviewed 15 relevant RCT studies and analysed them against a set of guiding questions relating both to quality of design and methods and to the specific characteristics of T&A programmes.
Our observations suggest that:
- Overall, RCTs do not work for evaluating impact in terms of political transformation. Nor do they work as a sole method of evaluation where a programme takes an adaptive and iterative approach.
- As an evaluative tool, RCTs can be an effective and useful means of deciding between different variations in an intervention design in the context of piloting a programme.
- In programme design, implementation has to be randomisable and there need to be exclusionary factors. This can be difficult in T&A, especially if policies apply nationally, regionally, or at other administrative levels of implementation.
- Consideration of context is also limited. If context is not adequately taken into account, evaluations have limited external validity, as we cannot be sure whether findings would hold if the intervention were scaled up.
- A clearly defined and well-articulated theory of change is necessary – both for effective programme implementation and evaluation of impact. This could be instituted as a crucial and evolving tool in design and planning, as a framework for learning and for assessing and managing risk.
Overall, an RCT approach does lend itself well to technology-based initiatives, even given the challenges of the T&A context described in detail in this and other reviews. However, the value of the exercise needs to be balanced against its cost. For small, one-off, low-cost interventions, the generally high cost of a high-quality RCT could preclude its use in evaluation. But if the programme is intended to be scaled up – say to national or even regional level – it could well be worth the investment.
Our review also suggests that an RCT design for T&A evaluation can be strengthened by:
- Longer time frames between implementation and endline evaluation, to allow sufficient time for impacts to become manifest, especially where the technology is relatively new to users and little is known about their propensity to take it up;
- Evaluations at intermediate stages of the implementation process – midline – in order to gauge intermediate impacts;
- Overall design based on a range of methods – qualitative and quantitative, experimental and non-experimental – to complement the RCT component;
- Clearly articulated theories of change, in order to ensure underlying models are correctly specified and to help identify the most appropriate ‘package’ of methods.
Read the full research paper here.