MY DATA VS YOUR DATA … WHY DO THEY DIFFER?

Data is often treated as unquestionable fact. Everyone uses reporting to understand what is happening on their site, why it is happening, and what actions to take. However, the way data is presented can differ massively depending on the criteria of each reporting system. Ever wondered why the data in Google Analytics differs from what you see in your CRM or in your Webtrends Optimize (WTO) reports?

Data from Testing Scenarios

Differences like this are quite frequent, and the main culprits are the criteria and segmentation rules each system applies. When looking at Webtrends Optimize testing data, clients often forget that experimentation data samples only a portion of their traffic, determined by the throttling percentage and by device and user attributes. There are other factors to consider as well.

Accuracy of Comparison

What is important to keep in mind is that two different sources will rarely give you the same numbers. They may be similar, but they will never match 100%. As a rule of thumb, when the difference is above 10%, it is worth investigating.

What are the causes or areas to consider?

Server-side validation

If an event fires on a simple button click rather than after server-side validation, that event will most likely be inflated compared with one that is only counted once the server has accepted the submission. This is very common on form submit buttons, as the sketch below shows.
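As a rough illustration, here are the two tracking approaches side by side. The trackEvent() helper, the #signup-form selector and the /api/signup endpoint are all hypothetical stand-ins, not part of any specific platform's API.

```typescript
// Stand-in for whatever analytics call your platform provides
// (hypothetical helper, not a specific vendor API).
function trackEvent(name: string): void {
  console.log("track:", name);
}

const form = document.querySelector<HTMLFormElement>("#signup-form");

if (form) {
  // Inflated count: fires on every click, including invalid submissions,
  // double-clicks and requests the server later rejects.
  form.querySelector("button")?.addEventListener("click", () => {
    trackEvent("form_submit_click");
  });

  // Validated count: fires only once the server has accepted the
  // submission, so it is usually lower than the raw click count.
  form.addEventListener("submit", async (event) => {
    event.preventDefault();
    const response = await fetch("/api/signup", {
      method: "POST",
      body: new FormData(form),
    });
    if (response.ok) {
      trackEvent("form_submit_validated");
    }
  });
}
```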

Server Time-zone settings

At Webtrends Optimize, all events are automatically stored with a timestamp, held in our database in Coordinated Universal Time (UTC). Because data comes from a variety of systems, each with its own time-zone settings and processing schedule, you may see slightly different results when running queries for the past 24-48 hours in our platform and in Google Analytics, for example.
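As a small, made-up illustration, the same UTC-stored event can land in different reporting days depending on the time zone a report uses, which is enough to make day-level totals disagree:

```typescript
// One event, stored in UTC.
const eventTimestampUtc = new Date("2024-03-02T02:30:00Z");

// Reported in UTC, it belongs to 2 March...
console.log(eventTimestampUtc.toLocaleDateString("en-GB", { timeZone: "UTC" }));
// "02/03/2024"

// ...but a report configured for New York time attributes the same event
// to the previous day, so the daily totals no longer line up.
console.log(
  eventTimestampUtc.toLocaleDateString("en-GB", { timeZone: "America/New_York" })
);
// "01/03/2024"
```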

Users, unique page views and sessions – how we define them

In both GA and Webtrends Optimize, a user is defined in a similar way: via a tracking ID unique to the visitor's browser. These counts should be similar.

However, differences can spring up in further reporting, unique page views for instance. In WTO, a unique page view is a single event per unique visitor ID for the duration of the test. Multiple visits to the same page are counted just once.

In GA, “Unique Pageviews” is actually the number of unique page views per session. This means that for returning visitors, a visit to a previously viewed page in a later session will increase the count. For example, if a visitor views the same page of your site once in the morning and once in the afternoon, they will be counted in two sessions, so the page view will be tracked as two unique page views.
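A rough sketch of the two counting rules, using made-up hit data, shows how the same visits produce different totals:

```typescript
interface Hit {
  visitorId: string;
  sessionId: string;
  page: string;
}

// The same visitor views the same page in two separate sessions.
const hits: Hit[] = [
  { visitorId: "v1", sessionId: "s1", page: "/pricing" }, // morning
  { visitorId: "v1", sessionId: "s2", page: "/pricing" }, // afternoon
];

// WTO-style: one unique page view per visitor for the test duration.
const perVisitor = new Set(hits.map((h) => `${h.visitorId}|${h.page}`)).size;

// GA-style: one unique page view per session.
const perSession = new Set(hits.map((h) => `${h.sessionId}|${h.page}`)).size;

console.log(perVisitor); // 1
console.log(perSession); // 2
```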

Neither calculation is “wrong”; it is simply a difference in definitions, reflecting the underlying purpose of each platform.

Here at Webtrends Optimize we offer integrations with GA so that you can view GA counts by experiment if needed.

Segmentation rules

There are many of these, but the main ones are:

  • Targeting A/B tests to new users only or returning users only. Even here, what one reporting system considers a new user can differ from what another does (see the sketch after this list).
  • Specific locations, e.g. UK only or the London area only. Again, the radius or region boundaries may differ slightly between reporting systems.
  • Specific devices and browsers: most A/B testing occurs on the devices and browsers with the most traffic, not on every device and browser (why delay a launch just to build and QA for a device or browser with very low traffic?).
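As an illustration of why "new user" definitions diverge, here is a simplified, hypothetical check based purely on the presence of a first-party cookie (the wto_visitor_id name is invented for this sketch). A system using a different cookie, lifetime or lookback window can classify the same visitor the other way.

```typescript
// Hypothetical: treat the visitor as new if our first-party ID cookie
// is absent. Another system, with its own cookie or lookback window,
// can easily reach a different answer for the same visitor.
function isNewUser(cookieName = "wto_visitor_id"): boolean {
  const exists = document.cookie
    .split("; ")
    .some((c) => c.startsWith(`${cookieName}=`));
  if (!exists) {
    // First visit, as far as this cookie is concerned: set the ID now.
    const id = crypto.randomUUID();
    document.cookie = `${cookieName}=${id}; path=/; max-age=${60 * 60 * 24 * 365}`;
  }
  return !exists;
}
```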

Throttling

Most A/B/n tests start with a low throttle for the first 1-4 days, until we're happy that all conversions, transformations and scenarios execute as planned. When looking at the data afterwards, these first days need to be taken into account.
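As a simplified, hypothetical sketch of what a throttle gate does (the platform handles this for you in practice), a percentage of visitors is randomly held back from the experiment, which is why experiment counts only ever cover a sample of total traffic:

```typescript
// Hypothetical throttle gate: only `throttlePercent` of visitors enter
// the experiment; the rest see the default experience and are not
// counted in test reporting.
function enterExperiment(throttlePercent: number): boolean {
  return Math.random() * 100 < throttlePercent;
}

// At a 10% throttle, roughly 1 in 10 visitors is included,
// so test numbers will be much smaller than overall site traffic.
const included = enterExperiment(10);
```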

BOT exclusion lists

Another potential source of inaccurate data is bot traffic. Bots, spiders and crawlers are software applications that run automated tasks on the internet. As everywhere, there are good bots – search engine crawlers, for example (which are generally excluded from Google Analytics) – and there are nasty ones – spammers, content scrapers and malware distributors, to name a few. The problem with the latter is that they are very good at imitating human behaviour.

Although Google Analytics offers bot filtering (‘Exclude all hits from known bots and spiders’), it still isn't considered a bot-blocking tool.

With Webtrends Optimize A/B testing, we build specific segments based on what we know about good and bad bots to make sure bad bots are excluded from our tests. As a result, our numbers may look a little smaller and the results may read differently, because only genuine customers and their behaviour are tracked against the hypothesis KPIs. A simplified sketch of this kind of filter is below.
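As a hypothetical illustration of a user-agent-based exclusion only (real bot detection also draws on IP lists, behavioural signals and other checks not shown here):

```typescript
// A tiny, illustrative deny-list - real exclusion lists are far larger
// and combine user-agent, IP and behavioural signals.
const knownBotPatterns = [/bot/i, /spider/i, /crawl/i, /headless/i];

function looksLikeBot(userAgent: string): boolean {
  return knownBotPatterns.some((pattern) => pattern.test(userAgent));
}

// Visitors matching the deny-list are excluded from the test, so their
// hits never reach the hypothesis KPIs.
if (!looksLikeBot(navigator.userAgent)) {
  // enter the experiment / record events
}
```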

In the end, with any reporting system, trends and changes matter more than absolute numbers when establishing insight.

What about you – have you experienced any data differences that weren't mentioned in this blog? We'd love to hear about it!