TO INCLUDE OR NOT TO INCLUDE VISITORS? THAT IS THE QUESTION...

Our game-changing new data pipeline and reporting suite, Discovery, makes it easy to exclude certain users and traffic after a test has run, which can reveal some really important insights that would otherwise get overlooked - or lost in the fog.

In this post I want to take a brief look at three interesting use cases that shine a light on how you can use these ‘exclusion filters’ to get a deeper understanding of your test results.

1. Excluding traffic based on origin

I've been in experimentation a long time and have frequently seen people run expensive PPC and display campaigns, which can sometimes bring an influx of poorly converting traffic to the website.

There's an argument that tests should do their best to convert these users too, but their behaviour can be so drastically different from your ‘regular’ traffic that analysing the experiment becomes very difficult.

If you know this is happening, excluding specifically those campaigns, those origins, those users - the things we'd consider ‘noise’ - gives you a cleaner, more consistent base of users to analyse.

Compare it to taking standard deviations: noise is bad, and outliers aren't helpful for analysis. If poorly converting traffic is temporary, don't let that pollution pull down your test results.
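To make this concrete, here's a minimal sketch of what an origin-based exclusion boils down to - filtering session records by campaign before recalculating the results. The records and field names below are hypothetical, purely for illustration, not Discovery's actual schema.

// Hypothetical session records; in practice these come from your data
// pipeline, and the field names here are illustrative only.
const sessions = [
  { id: 1, source: 'google', campaign: 'brand-search', converted: true },
  { id: 2, source: 'display', campaign: 'retargeting-q3', converted: false },
  { id: 3, source: 'direct', campaign: null, converted: true },
];

// Campaigns we consider 'noise' for this analysis.
const excludedCampaigns = new Set(['retargeting-q3']);

// Post-test exclusion: drop the noisy traffic, then re-run the numbers
// over the cleaner base.
const cleaned = sessions.filter(s => !excludedCampaigns.has(s.campaign));

const conversionRate =
  cleaned.filter(s => s.converted).length / cleaned.length;
console.log(`Conversion rate over ${cleaned.length} sessions: ${(conversionRate * 100).toFixed(1)}%`);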

2. Excluding certain browsers and device types

While recently analysing which browsers users were on for one of our partners, we found that users were buying tickets (and falling into tests) from their Tesla cars and on Oculus VR headsets.

These are clearly not the ‘standard’ devices you would normally QA for, but they flew under the radar and made it into tests.

We could have just let them pollute our data set, but after exploring their performance - and seeing that it wasn't great (an opportunity we should talk about!) - we decided to exclude these edge cases on the fly.

Browsers are a similar example: people notoriously don't bother QAing their tests on older browsers, on IE, and so on.

If things don't go well, should this tiny proportion of users ruin your overall test results? Or should you go back to QA, fix the bugs, and re-run the test to see what happened?

Alternatively, excluding them is easy, as long as the reported results aren't extrapolated beyond the audience you're actually analysing.
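A device or browser exclusion is again conceptually just a filter, this time keyed off the user agent. The patterns and records below are illustrative only - real user-agent matching is messier than a substring check.

// Hypothetical user-agent substrings for the edge cases mentioned above.
// 'Trident' and 'MSIE' cover older Internet Explorer versions.
const excludedAgents = ['Tesla', 'OculusBrowser', 'Trident', 'MSIE'];

const sessions = [
  { id: 1, userAgent: 'Mozilla/5.0 ... Chrome/120.0 ...', converted: true },
  { id: 2, userAgent: 'Mozilla/5.0 ... Tesla/2023.44 ...', converted: false },
  { id: 3, userAgent: 'Mozilla/5.0 ... OculusBrowser/25.0 ...', converted: false },
];

// Keep only sessions from browsers and devices we actually QA'd.
const cleaned = sessions.filter(
  s => !excludedAgents.some(agent => s.userAgent.includes(agent))
);

console.log(`Kept ${cleaned.length} of ${sessions.length} sessions for analysis.`);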

3. Bot users

We’ve seen this increase over the years, and from what I hear it’s cropping up quite a lot in other areas too (UA to GA4 migrations and the like).

When one of our clients reported that the data in our reporting didn't match Google Analytics, we investigated.

After a deep exploration, we found a considerable number of bots that we were capturing in Webtrends Optimize, but which were blocking Google Analytics from seeing them.

It's easy for anyone building a bot to block GA - in fact, all the bot needs to do is write window['ga-disable-UA-XXXXXX-Y'] = true; to the page and it's set. Or, perhaps even more easily, it can block network traffic to and from Google (or all third-party domains).
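As a rough sketch of how little effort that takes - this is illustrative only, using Puppeteer as the headless browser and the same placeholder property ID as above - both techniques fit in a few lines:

// A minimal sketch of the two evasion techniques described above.
// Assumes Puppeteer is installed; the GA property ID is a placeholder.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Technique 1: set the opt-out flag before any page script runs,
  // so analytics.js never sends hits for this property.
  await page.evaluateOnNewDocument(() => {
    window['ga-disable-UA-XXXXXX-Y'] = true;
  });

  // Technique 2: block network traffic to Google's analytics domains outright.
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    const url = request.url();
    if (url.includes('google-analytics.com') || url.includes('googletagmanager.com')) {
      request.abort();
    } else {
      request.continue();
    }
  });

  await page.goto('https://example.com');
  await browser.close();
})();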

We often find ourselves flying under the radar of these bots, having changed our domain from webtrends.com to webtrends-optimize.com when we separated, and because we deliver a few services through generic domains provided by our cloud hosting, like workers.dev from Cloudflare and azurewebsites.net from Microsoft Azure.

Because of this, bots often don't recognise us and therefore don't know to block us.

The first, most important step with Discovery is being able to identify these bots: access to the relevant data, the ability to extract it, widgets that alert you to IPs with a large footprint, and so on.
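As a sketch of what ‘a large footprint’ means in practice - the threshold, records and field names here are illustrative, not Discovery's actual logic - you can simply count sessions per IP and flag the heavy hitters:

// Count sessions per IP and flag any address with an unusually large
// footprint. The IPs below are documentation-range examples.
const sessions = [
  { ip: '203.0.113.7' }, { ip: '203.0.113.7' }, { ip: '203.0.113.7' },
  { ip: '198.51.100.24' }, { ip: '192.0.2.15' },
];

const counts = sessions.reduce((acc, s) => {
  acc[s.ip] = (acc[s.ip] || 0) + 1;
  return acc;
}, {});

const SUSPICIOUS_THRESHOLD = 3; // hypothetical cut-off
const suspects = Object.entries(counts)
  .filter(([, count]) => count >= SUSPICIOUS_THRESHOLD)
  .map(([ip]) => ip);

console.log('IPs with a large footprint:', suspects);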

The second is the ability to nuke them on the fly. We can do that! So the net result is, once again, removing the noise and leaving you with clean, real-user data to report back on.

To summarise…

Not all data belongs in your analysis.

Data cleansing is hugely important for running good experiments and finding genuine uplifts, and it's also the path to the personalisation which most companies aim for at some point - if they're not already knee-deep in it.

Find it. Understand it. And decide if it is relevant to what you’re trying to test.

For a demo of Webtrends Optimize Discovery, and/or our full-stack, all-inclusive experimentation platform as a whole, reach out and we'd be happy to run through it with you.