How to inject new growth into plateauing CRO programmes
Watch a re-run of our Fresh Thinking Live! CRO webinar from 20/05/2020.
The panel discussed a range of CRO topics, including common performance plateau scenarios, the local vs global maxima concept, techniques for generating new ideas that move your metrics, and iterating tests across your different segments.
Joining Duncan Heath in the webinar were:
- Victoria Stead - digital optimisation manager, The Open University
- Paul Knutton - Global consulting practice director, Monetate
Watch the 45-minute session in its entirety and read key questions and answers from the session below.
Key questions and answers from the webinar.
- What advice can the panel share on striking a successful balance between compliance and effective UX?
- How do you maintain momentum with testing when traffic levels are low?
- What is your view on so-called high-velocity testing and the stance that it is generally a good approach i.e. increasing test rate as a KPI of the programme to avoid plateaus and where do you see this being a recommended approach?
- What size audience did you test the OU split on, or what amount of traffic would be the minimum required to get usable info?
- So the next part to question 4, what is the amount of traffic, the minimum required to get usable info would you say?
- How would you recommend handling the death of third-party cookie or ITP2, especially in regards to personalisation?
- How can you benchmark A/B testing when COVID-19 has dramatically increased or decreased conversion rates?
- Why don't I see a difference in my analytics? When I see a 5% uptick in the split testing tool, I expect a 5% uptick in analytics, why does this not translate?
- How do you present your results – as absolutes or as a range?
1. What advice can the panel share on striking a successful balance between compliance and effective UX?
Victoria Stead: So we quite often have, you know, business needs that are related to compliance. What I've found is it's a lot to do with stakeholder engagement, and helping those people, perhaps in your finance department or something like that, understand that even if some content you're presenting is compliant, if the user doesn't understand it or doesn't engage with it, then it's going to cause a problem at some point in the journey.
So I think it's about collaboration and building trust with those stakeholders, and sharing the insight you've got, be it qualitative or quantitative (it's often qualitative, isn't it?), on how someone is engaging with content on the page. We've had success really collaborating, looking at the kind of small print that you have to include, and going back and saying, 'Oh, can we tweak it and present it in this different way?'
It takes a bit of back and forth, and really helping them understand the impact that not being user-centred has on your business goals.
2. How do you maintain momentum with testing when traffic levels are low?
Paul Knutton: So traditional A/B testing, you know, that's based on classical statistical models and uses those cool groovy things like statistical significance and confidence and we've got that at Monetate, but you know, you'll need volume to ensure that you get this statistical significance and a confident result.
So at Monetate, we have standard testing, for which you'll need a certain amount of traffic. If you look at areas where you might not get enough traffic, or where it might take too long to get a result, we've also got a dynamic testing capability.
So this uses Bayesian logic. It makes decisions under uncertainty, and it adjusts the traffic in real time towards the winning variant. That's ideal in areas of your site with lower traffic, and it minimises your cost of learning when compared to a traditional A/B test.
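As a rough illustration of the idea of shifting traffic towards the winning variant under uncertainty, here is a minimal sketch of Thompson sampling, one common Bayesian allocation technique. It is not a description of how Monetate's dynamic testing works internally; the function name, the conversion rates and the visitor count are all hypothetical.

```python
import random

def thompson_sampling(true_rates, visitors, seed=0):
    """Simulate Bayesian traffic allocation across variants.

    true_rates are hypothetical conversion rates that the algorithm
    cannot see; it only observes conversions as they happen.
    Returns the number of visitors sent to each variant.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    shown = [0] * len(true_rates)

    for _ in range(visitors):
        # Draw a plausible conversion rate for each variant from its
        # Beta(1 + successes, 1 + failures) posterior (uniform prior).
        draws = [rng.betavariate(1 + s, 1 + f)
                 for s, f in zip(successes, failures)]
        # Send this visitor to the variant with the highest draw.
        chosen = draws.index(max(draws))
        shown[chosen] += 1
        # Simulate whether the visitor converts (assumed rates).
        if rng.random() < true_rates[chosen]:
            successes[chosen] += 1
        else:
            failures[chosen] += 1
    return shown

# Hypothetical example: the original converts at 5%, the variant at 15%.
allocation = thompson_sampling(true_rates=[0.05, 0.15], visitors=5000)
print(allocation)
```

Early on the two variants get a similar share of visitors, but as evidence accumulates most traffic flows to the stronger variant, which is why fewer visitors are "spent" on the weaker experience than in a fixed 50/50 split.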
3. What is your view on so-called high-velocity testing and the stance that it is generally a good approach i.e. increasing test rate as a KPI of the programme to avoid plateaus and where do you see this being a recommended approach?
Duncan Heath: So this idea, this equation, that test velocity multiplied by win rate equals growth rate is quite well publicised, and there are two sides to that equation: quantity and quality. What people often take from that equation, though, is that if they can increase their test velocity they'll see greater returns. They often ignore the other half of the equation, or assume that the win rate will basically stay the same.
If you're running poor tests, increasing the velocity just means you have more crap tests being shipped, ultimately, and probably more false positives as well. So you really don't want to be scaling that up. And if you've got a good win rate, assuming you'll be able to keep that going when you increase velocity is often wishful thinking as well.
I think good tests when you look at them, require good research, good creative sessions, good design, good test build, good analysis as well, and all those things take time really. So don't think that by speeding everything up, that you'll maintain that win rate. It is likely wishful thinking. And when I say win rate, I am including in that obviously getting good learning outcomes, not just that immediate KPI impact.
Now, having said that, there are obviously brands that have great success stories with high-velocity testing, the likes of Airbnb and Booking.com probably being the most well known. They are inspirational and should be listened to, of course. But those companies have really got the resources, the team sizes, and the traffic to allow that kind of testing. They've got what we call high test capacity. So what I would say is that test velocity is a good variable, you can influence it, and you should be actively trying to increase it over time, but only at the right time, when you have the team to support it and you're not going to jeopardise test quality and win rate.
Most companies that I work with really need to focus on improving the quality of their tests, based on quality research, and on trying to influence the size of the impact of their tests, probably more than on increasing their velocity. Focus on velocity if you can, but not to the detriment of the other half of that equation.
Paul Knutton: A good plan, a good prioritised plan, and a process to stick to it.
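To make the velocity and win-rate trade-off concrete, the following small sketch works through the arithmetic with made-up numbers. The function name and all figures are hypothetical, and it counts expected winning tests per year rather than any particular growth metric.

```python
def expected_wins(tests_per_year, win_rate):
    """Expected number of winning tests per year: velocity x win rate."""
    return tests_per_year * win_rate

# Hypothetical programme: 20 tests a year, 30% of them win.
baseline = expected_wins(tests_per_year=20, win_rate=0.30)

# Doubling velocity while quality slips: the win rate halves.
rushed = expected_wins(tests_per_year=40, win_rate=0.15)

# Doubling velocity while holding quality (needs the team to support it).
faster_same_quality = expected_wins(tests_per_year=40, win_rate=0.30)

for label, wins in [("baseline", baseline),
                    ("rushed", rushed),
                    ("faster, same quality", faster_same_quality)]:
    print(f"{label}: {wins:.1f} expected wins per year")
```

With these assumed figures, the rushed programme ships twice as many tests for no more wins than the baseline, which is the point about not ignoring the other half of the equation.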
4. What size audience did you test the OU split on, or what amount of traffic would be the minimum required to get usable info?
Victoria Stead: If we are talking about the original qualifications test, then it was several hundred thousand visits.
5. So the next part to question 4, what is the amount of traffic, the minimum required to get usable info would you say?
Victoria Stead: We put it into a calculator and check; it's our 'cheat's' way of doing it. It depends on whether it's quantitative or qualitative data you're trying to get. But assuming it's the quantitative part, then statistical significance is one criterion.
Duncan Heath: But then you're also looking at whether that test allows for the business cycles that naturally occur and also, of course, that you're not still getting flux in terms of the results in the test. So you've got some steady results for a decent period of time. But ultimately, we're testing to a 95% significance and 80% power.
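For readers curious about what such a calculator does, here is a minimal sketch of the standard sample-size formula for comparing two conversion rates at 95% significance (two-sided) and 80% power. The function name, the 5% baseline rate and the 10% relative uplift are assumptions chosen for illustration, not figures from the panel.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variant(baseline_rate, relative_uplift,
                            alpha=0.05, power=0.80):
    """Visitors needed per variant to detect a given relative uplift.

    Uses the normal-approximation formula for a two-sided test of
    two proportions. alpha=0.05 corresponds to 95% significance.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
    z_beta = NormalDist().inv_cdf(power)           # about 0.84
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# Hypothetical example: 5% baseline conversion rate, and we want to
# detect a 10% relative uplift (5.0% -> 5.5%).
n = sample_size_per_variant(baseline_rate=0.05, relative_uplift=0.10)
print(f"Visitors needed per variant: {n}")
```

Under these assumptions the answer is roughly 31,000 visitors per variant, and halving the uplift you want to detect roughly quadruples it. As noted above, hitting the number is only one criterion: the test should still cover full business cycles and show stable results.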
6. How would you recommend handling the death of third-party cookie or ITP2, especially in regards to personalisation?
Paul Knutton: I love an ITP question. I could get quite into it, so stop me if I do. So with ITP2, and I think 2.1, we can put the cookie into local storage and that problem goes away. With ITP2.2, I think, you can't do that.
So Monetate relies on cookies, but we also store the personalised data against a customer. If we get that customer ID event and if they identify in the future, we don't lose any of the personalisation data.
7. How can you benchmark A/B testing when COVID-19 has dramatically increased or decreased conversion rates?
Paul Knutton: So you know we're testing at the moment in week 10 of a global pandemic, so we know what works during week 10 of a global pandemic, which hopefully won't be that useful in the future.
But I think it depends what type of test you're doing. If you're doing, like, a new UX test, where you're changing the wording on the mini-basket button, I don't think it matters if it's week 10 of a global pandemic or not. It's human behaviour, and that's pretty consistent amongst us at the time of performing an action.
What we're also seeing at Monetate right now, there's a lot of brands that are going away from testing, and they're actually going towards driving revenue, it's about 'right now', over learning, and then, more and more using machine learning and dynamic testing capability.
So that adjusts the traffic in real time towards the winning variant, and that reduces any risk. And because it adjusts in real time, if things change, it will adjust the traffic accordingly and maximise the revenue, and you will still get some learnings out of it.
But depending on the type of test it depends how much you want to rely on those learnings.
Victoria Stead: I just echo that with The Open University we are getting a lot of, you know, quite big increases in traffic as a result of the circumstances. There are still lots of things that we need to do to improve the experience, we just have a bigger audience to improve that experience with and for. So we are busier than ever at the moment.
8. Why don't I see a difference in my analytics? When I see a 5% uptick in the split testing tool, I expect a 5% uptick in analytics, why does this not translate?
Duncan Heath: I can say from my experience that, firstly, there is quite often a bit of a misunderstanding that a test that reaches 95% significance is going to deliver the impact that was measured during the testing period. So let's say in your test you see a 20% increase in the conversion rate of enquiries; the assumption is that it will translate into a 20% increase in enquiries when you set it live.
We're actually not saying that we're 95% confident it will bring a 20% increase. You're just 95% confident that the variation is going to beat the original by some amount.
So there's obviously a need to think about this is during the testing period, but is it replicable ongoing? There's a case for you running tests at numerous times over different periods to see if it repeats and you get the same kind of results.
What I would typically do with very low-impact tests, so you're talking, let's say, a few percentage points (which translates to minimal value, but is still worth putting live), is scrutinise them a bit more. Maybe run the test for longer than you expect to need, to get additional data and make sure that the confidence you saw originally continues.
I'd also say that with those larger tests that you're working on, don't expect the exact result to translate into your impact. So if it's a 20% increase and there is a standard deviation of, let's say, 10%, then that is a range. We may present the worst-case scenario, based on the bottom of that range, alongside what we'd expect to achieve if everything occurred as per the testing period.
It might be 20%. But let's be realistic: let's take that minimum level as a worst-case scenario, and more often than not, it then translates into your ultimate results a bit more directly.
Victoria Stead: It's a mix, I suppose, the way we're sharing our results. Initially, I was sharing quite simplistic results, which would encourage people, I guess, to leap to the conclusion that we'd carry on seeing that uplift throughout. So we started being a bit clearer, still trying to keep the key message so people don't get lost in the statistics, but presenting it with confidence levels when we're sharing results from tests, and that is helping.
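One way to turn a single uplift figure into a range with a conservative lower bound is a confidence interval on the difference in conversion rates. The sketch below uses a simple normal-approximation (Wald) interval and, as a simplification, expresses it relative to the observed control rate without accounting for uncertainty in that rate. The function name and visitor counts are hypothetical, and this is not necessarily the method either speaker uses.

```python
from math import sqrt
from statistics import NormalDist

def uplift_range(control_conv, control_n, variant_conv, variant_n,
                 confidence=0.95):
    """Return (low, point, high) relative uplift of variant over control.

    Builds a normal-approximation interval on the absolute difference
    in conversion rates, then divides by the control rate.
    """
    p_c = control_conv / control_n
    p_v = variant_conv / variant_n
    diff = p_v - p_c
    # Standard error of the difference between two proportions.
    se = sqrt(p_c * (1 - p_c) / control_n + p_v * (1 - p_v) / variant_n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    low, high = diff - z * se, diff + z * se
    return low / p_c, diff / p_c, high / p_c

# Hypothetical test: 20,000 visitors per arm, 5% vs 6% conversion.
low, point, high = uplift_range(1000, 20000, 1200, 20000)
print(f"Observed uplift: {point:.0%}")
print(f"Plausible range: {low:.0%} to {high:.0%}")
print(f"Conservative figure for forecasting: {low:.0%}")
```

With these assumed counts, an observed 20% uplift comes with a range of roughly 11% to 29%, so the lower end is the safer number to put in a forecast.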
9. How do you present your results – as absolutes or as a range?
Duncan Heath: It really depends on who you are presenting the results to. If it’s a senior stakeholder outside of the optimisation team then typically a conservative absolute figure will be presented. If it’s someone with a bit more optimisation understanding, and who would benefit from seeing the performance range, it will be presented as such.
Links related to the webinar
- Monetate - Explore how Google data can be used to tell stories
- The Open University - Distance learning courses and adult learning
- How to choose the right user research method - This guide will help you map your user research goals and questions to the right methods.
- Personalisation tools comparison guide - Download this FREE guide for detailed reviews and comparisons of the best personalisation tools.