A test and learn framework you can rely on to get results
The only thing worse than not testing at all is launching a test you can’t rely on. What use are fractured takeaways to any digital strategy?
As digital fundraising experts, it’s our responsibility to ensure every penny of spend delivers results — and that return on investment is as optimal as possible.
If tests aren’t planned properly, they can lead to a dead end and wasted money, whether through statistical insignificance or non-actionable takeaways. But done well, with planning and intent, they can cut through opinion, aid growth and ground decision-making.
If you’re planning a test and learn strategy, take a look at this framework to ensure your activity is watertight, intentional and impactful.
1. Define what you actually need to learn to deliver results
First, define your testing ambitions, ensuring they’re prioritized with revenue generation in mind. A paid-for test may not always be needed if the insight lives elsewhere.
Conduct an insight-gap analysis. To define what you need to learn, you first need to map out what’s already known. A digital workshop can be a good place to start: ask people to outline what they think they need to learn to drive impact. If the answer to a research question is already obvious, explore the blockers first: is it a lack of data or stakeholder opinion that’s holding back progress? In some cases, data is needed to cut through opinion; in others, simple alignment does the trick. If the answer isn’t obvious, it may already exist elsewhere in the organization, so ask colleagues what they know before spending anything. This process quickly takes unnecessary research questions off the table while aiding knowledge sharing.
Prioritize your gaps with ROI in mind. Once the list of potential areas for exploration is refined, prioritize it. There are various ways of doing this, but I often use a quadrant that considers (1) whether a finding will have a high or low impact on daily operations, and (2) whether that impact would be short, medium or long term. High-impact findings with immediate application are a good place to start. It’s also important to consider the lift required to put the hypothetical insights into action: if a high-impact finding will take years to implement, are there more immediate ones that deserve short-term focus in your testing cycle?
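To make the quadrant concrete, here’s a minimal scoring sketch in Python. The gap names, scores and weighting are hypothetical examples rather than part of the framework itself: the point is simply that impact, time horizon and lift can be scored side by side so workshop output turns into an ordered backlog.

```python
# A minimal sketch of the prioritization quadrant as a simple scoring pass.
# The gaps, scores and weighting below are illustrative; adapt to your own workshop output.
from dataclasses import dataclass

@dataclass
class InsightGap:
    name: str
    impact: int    # 1 = low impact on daily operations, 3 = high impact
    horizon: int   # 1 = long-term payoff, 3 = immediate application
    lift: int      # 1 = years to implement, 3 = quick to action

    @property
    def priority(self) -> int:
        # Favour high-impact, immediately applicable, low-lift insights
        return self.impact * self.horizon * self.lift

gaps = [
    InsightGap("Optimal donation ask level", impact=3, horizon=3, lift=3),
    InsightGap("New channel viability", impact=3, horizon=1, lift=1),
    InsightGap("Email subject line tone", impact=1, horizon=3, lift=3),
]

for gap in sorted(gaps, key=lambda g: g.priority, reverse=True):
    print(f"{gap.priority:>3}  {gap.name}")
```

A spreadsheet does the same job, of course; the value is in agreeing the scores with stakeholders, not in the tooling.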
Ensure the answer can’t be gathered for free. Testing doesn’t always need hefty budgets. Other options include (1) longer-term A/B tests that gather results over months, not days; (2) using owned channels such as social and email as a testing ground; (3) sending surveys to owned email lists or segments of your database; (4) asking contacts in your sector to share knowledge; or (5) using tools such as social listening, if you’ve got a license. It may be obvious, but it’s also a good idea to learn from competitors: you can be sure they’re testing and implementing results, so if they’ve already done the hard work, why not learn from them instead of running a test that will tell you the same thing?
2. Determine a tailored testing methodology
Once a prioritized list of high-impact insight gaps is defined, the next stage is choosing a method to fill them. It’s important to select the right platform, create variants that enable insight generation, and spend wisely with significance in mind.
Select the most effective method. While paid media is often the go-to for test and learn campaigns, targeted surveys and focus groups can be effective too. Each has strengths and weaknesses, so be mindful of the type and scale of insight you need to iterate and evolve. Paid surveys can be a great way to get a range of quantitative and qualitative insight at scale, while focus groups can delve into emotions in a way only human interaction can. Paid media is an effective way to shape growth plans and test themes, messages and journeys, but it can be expensive if you need significance on conversions.
If in doubt, test opposites. You want to avoid two variants that are so similar they generate a fractional uplift that raises eyebrows. Make the gap between your variants wide enough that you can see how either end of the spectrum performs: instead of testing a low versus a medium ask, try a low versus a high ask, then go a level deeper in the second iteration. Widening the gap does mean the lower-performing variant will secure fewer results and need more spend to reach significance on conversions; but if the gap is wide enough, you can look at the differential using higher-funnel metrics such as journey starts or click-throughs as a proxy.
Right-size your budget and significance. It’s not true that tests need to be live for a set period of time to get results; what matters is the volume of results and whether it achieves the significance you need to draw conclusions. Survey Monkey’s statistical significance calculator can help you work out the volume of results needed for reliable data: with a total audience of 10 million, 95% confidence and a 10% margin of error, you need roughly 100 results per variant. If you’re happy with those caveats, there’s no need to go beyond that volume, meaning you can cap spend once you reach it. Don’t triple your spend for the sake of it, or because your agency’s advising it. If even this proves expensive, look to higher-funnel results like journey starts or click-throughs as a proxy metric, which can come at a fraction of the cost. It won’t give you the same confidence, but it will give you enough to build from if budgets are tight.
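If you’d rather sanity-check the numbers yourself than rely on an online calculator, here’s a minimal sketch of the standard sample-size formula (normal approximation with a worst-case 50/50 split and a finite-population correction). The audience size, confidence level and margin of error are just the example figures above.

```python
from math import ceil
from statistics import NormalDist

def results_needed(population: int, confidence: float = 0.95,
                   margin_of_error: float = 0.10, p: float = 0.5) -> int:
    """Approximate number of results needed per variant."""
    # z-score for the chosen confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Infinite-population estimate, assuming the worst-case proportion p = 0.5
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    # Finite-population correction (barely matters for an audience of 10 million)
    n = n0 / (1 + (n0 - 1) / population)
    return ceil(n)

# The example from the text: 10 million audience, 95% confidence, 10% margin of error
print(results_needed(10_000_000))  # 97, i.e. roughly 100 results per variant
```

Tighten the margin of error to 5% and the requirement jumps to around 385 results per variant, which is usually where the trade-off between confidence and capped spend starts to bite.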
3. Give your test space, but be ready to react
It’s crucial to let a test run without interference, to avoid introducing bias that could invalidate the findings. But that’s not to say you can’t react quickly if the test clearly isn’t working: sometimes conserving spend is the sensible step.
Check but don’t interfere with the test. Fiercely QA the test before it launches: the last thing you want is to launch a test that isn’t tracking correctly, meaning you can’t draw clear takeaways. Once it’s checked and live, leave it to do its job. For paid media, it’s crucial not to touch the campaign, creative or ad set at all once it’s launched, as this will push it back into its learning phase and limit the reliability of the insights it generates.
Monitor your results, within reason. Once it’s launched, keep an eye on the data to triple-check tracking is feeding through. It’s better to diagnose live issues early, fix them and relaunch than to let the test run with errors. Don’t be alarmed if performance is low at first — things usually pick up once campaigns have gathered some results and exited their learning phase. Around 30 results are usually needed before the learning phase is exited, but it can vary.
Pause if it’s not working and upscale if it is. If you’re 25% of the way through your spend and haven’t achieved a fraction of what you’d hoped, the test probably isn’t going to turn around and you may want to conserve the rest of your budget. Conversely, if the test is working and you’re seeing strong return on investment and a clear opportunity you can rely on, you may want to react and upscale to drive results at scale. Not only does this demonstrate the value of testing, it’s a quick way to achieve your desired outcomes. Be sure to capture the result data you need before switching gears, as performance may vary once you upscale.
4. Communicate your results clearly and with relevance
When you share back your findings, tell the story of what you found in a way that makes sense. Not everyone gets testing, data or insight: while one audience may appreciate complex data visualization, another may just want to know what to do as a result.
Identify your audiences. Map out the people you need to buy into the results for ongoing action to take place, considering what you need them to think, feel and do as a result. This is likely to vary by person: while one may simply welcome the insight and be ready to be guided, others may need more convincing. In the latter scenario, you’ll be grateful for the statistical significance and structure you put in place at the start — data and insight can cut through blockers and opinions if you’ve got the confidence to back them up.
Consider digital and data literacy when presenting. After all your hard work, the last thing you want is to alienate your audiences by confusing them with complex data and charts that distract. Lift out the clear story and consider the ‘so what?’ throughout, so insight always ladders up to action.
Showcase your findings more widely. It’s easier to build a culture of testing and learning when there’s wider buy-in around the organization. Once the core stakeholders are won round, share your findings with other teams so insights don’t sit in a silo and the next test is easier to get over the line. Focus on the long-term benefits and next steps to show how testing leads to evolution and longer-term innovation that drives progress.
5. Evolve and iterate
Once you’ve defined your insights, the next step is to put your findings into practice: turning insight into action. Measure the uplift in overall performance once your insight has been rolled out more widely, demonstrating the impact of testing to secure buy-in and show the financial return of innovation.
Put your findings into practice. Implement the relevant actions from your test, clearly mapping out the impact they have when applied more widely. It’s crucial that tests lead to action so you’re not testing for the sake of testing. If there’s more to be learned, consider what a phase two looks like to take things to the next level, taking iterations of your best-performing variants forward.
Go deeper before testing something else, if it makes sense. It’s easy to get eager to move on to the next big thing, but sometimes there’s more insight to be gained. Consider what subsidiary tests might look like and run them first if they’re likely to have a higher impact than a new test.
Monitor the wider impact on performance. Once enough time has passed for you to quantify the uplift the test finding had when implemented more widely, shout about it to show the long-term impact testing has on innovation and growth.