Wednesday, 22 June 2016

Give Me Strength (of Evidence)!

I’m asked a lot about evidence in evaluation. For example, how much evidence does our charity need to prove we’re achieving the outcomes our funders expect? What sample size do we need for our survey to be valid? Some published ‘evidence standards’ (e.g. NESTA, EEF*) don’t really help. The problem is, they focus on statistical methods such as Randomised Controlled Trials, which are way beyond the means of many third sector organisations.

Go back to a basic definition of ‘evidence’: “that which tends to prove or disprove something”. This means it’s less about specific thresholds, more about a collection of findings which together present a convincing case. So think of evidence as a collection of elements if you like – an ‘evidence bag’; the more you have, the better. Some examples:

Theory of Change: In simple terms, this is cause and effect. Can you explain – clearly and convincingly to others – how and why the action you are taking achieves the outcomes you are looking for?

Stories (qualitative feedback): Case studies and other types of narrative feedback are important. Partly because heart-warming stories can influence the public to support fundraising, but also because they can help to prove – through personal accounts – that your Theory of Change is valid.

Hard data (quantitative feedback): These are the surveys or improvement scales that put numbers to what you achieve. Where possible these should cover both the number of people you make a difference to, and the extent of improvement they achieve. The umpteen ways of doing this form a subject in themselves, but to begin with, make sure you’re asking relevant questions (the R in SMART – see previous blog).

Oh yes: Sample Size. Yes, there are statistical formulae that allow you to work out ideal sample sizes depending on the ‘confidence level’ you want. But in most situations it’s more important to make sure you’ve got a representative sample from all of the people and groups you’re working with. The failures as well as the successes.
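For the curious, one commonly used formula of this kind is Cochran’s sample-size calculation with a finite-population correction. The sketch below is purely illustrative (it is not a formula mentioned in this post): the default values assume a 95% confidence level (z = 1.96), a 5% margin of error, and the most conservative proportion of 0.5.

```python
import math

def sample_size(population, z=1.96, margin_of_error=0.05, proportion=0.5):
    """Illustrative Cochran formula with finite-population correction.

    z=1.96 corresponds to a 95% confidence level; proportion=0.5 is the
    most conservative assumption when you don't know the true proportion.
    """
    # Base sample size for an effectively infinite population
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    # Correct for a finite population (charities rarely survey thousands)
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# e.g. a service reaching 500 people would need roughly 218 responses
print(sample_size(500))
```

Notice how little the required sample grows with population size – which is exactly why representativeness (covering failures as well as successes) usually matters more than hitting a precise number.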

RCTs (Randomised Controlled Trials): The ultimate in hard data is comparison with a control group who have not experienced your activity or support. But in many socially-based situations this is virtually impossible. To be valid, both the subjects and the control group should have nothing else that changes in their lives apart from your “intervention”. And anyway, would you deny support to someone who needs it just so that they can be part of an RCT control group?

Independence: Whether it’s surveys, stories or other feedback, it helps to show they are unbiased. Data collected independently generally has a better chance of avoiding people saying what they think you want to hear, or being too polite to criticise.

Triangulation: This fancy term just means getting evidence from more than one source. If a service user says you’ve made a difference, that’s good. If their GP and their family also say you’ve made a difference, that’s even better!

Finally, don’t forget the most important question of all: what are you going to do with the information? Learning and improvement should be fundamental to any evaluation. A key test of whether you have sufficient evidence is whether it gives you the confidence to make decisions for the future.

*National Endowment for Science, Technology and the Arts: Standards of Evidence, and Education Endowment Foundation: Security of Findings