To tell somebody that he/she is wrong is called criticism. To do so professionally is called QA testing.

Quality and Testing play a vital part in Optimisation, just like any other phase of the product development cycle. The quality of our work can’t be ignored or allowed to fall short – your experiments hold just as important a role in the web/app experience as everything else you deploy. So, as we see it:

If a test is not working, does it matter how fast it was built?

Accurate and comprehensive QA testing plays a role in the results of a test, just as the ideas themselves do. Experiments that generate a poor experience can hurt your results – something we often hear from prospects who have had to deal with content flickering.

Your A/B tests could be considered of poor quality for any of the following reasons:

  • Building the wrong thing: Poor communication and documentation can lead to developers misinterpreting your requirements and building the wrong thing. An obvious failure mode, and one where co-ordination is key.
  • It looks skewed: The most obvious thing people test for – does it look broken or ‘wonky’?
  • Having incorrect functionality: What happens when users interact with your new elements? Should the test differ in any scenarios?
  • Hurting existing elements: Your new content might be just fine, but how do the surrounding elements behave? Do they continue to look good and function well? For example, your new sticky Add-to-bag button shouldn’t break the navigation.
  • There’s content flickering: Do users see the old content, a flash, and then new content? How badly does this hurt your onsite/in-app experience?
  • Incorrect data collection: The key to all reporting is successful data collection, but do you check each metric as it fires and is collected? If things fail to capture, for all users or some, do you understand why? Have you checked click events on mobile phones? Do events trigger when you don’t mean for them to? These negative test cases, as well as positive ones, are important.
  • URLs missing query parameters: If you’re redirecting a user and it happens before Analytics tools fire, are you persisting the query string parameters so campaign traffic still gets counted?
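For that last point, here’s a minimal sketch (in plain JavaScript, with illustrative URLs and parameter names) of persisting the current page’s query string onto a redirect destination, so campaign parameters such as utm_source survive the redirect:

```javascript
// Sketch: merge the current page's query parameters onto a redirect URL.
// Parameters already present on the destination (e.g. variant=b) win.
function buildRedirectUrl(currentUrl, destination) {
  const current = new URL(currentUrl);
  const dest = new URL(destination, current.origin);
  current.searchParams.forEach((value, key) => {
    if (!dest.searchParams.has(key)) dest.searchParams.set(key, value);
  });
  return dest.toString();
}

const redirect = buildRedirectUrl(
  "https://example.com/old?utm_source=email&utm_campaign=spring",
  "https://example.com/new?variant=b"
);
// redirect carries variant=b plus both utm_* parameters
```

Run this before issuing the redirect, so the Analytics tools that fire on the destination page still see the campaign parameters.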

So, if you know what could go wrong, what specifically should you be checking for? After experiment development is complete, we split our checks into three categories. A detailed list of things to consider for each is as follows. We know it’s comprehensive, but that’s entirely the point!

Visuals & Design

  • Check all on-page elements, not just what you’re changing. Search bars, navigation, headers, lightboxes (whenever they show), etc. Knowing all dynamic elements on the page beforehand certainly helps with this, both for Dev and Quality Assurance teams.
  • Device types & responsive design: Know how things should look across all resolutions. Don’t just check for the best-case resolutions, but also consider the awkward in-between ones. Having a good idea of screen resolutions used by your users helps to prioritise fixes, although all bugs should be resolved where possible.
  • Image loading: If images are being changed, do they align with the design? Are they an appropriate size, given <picture> tags allow for various sizes based on resolution? Do you see the old image before the new one loads in?
  • Form fields: Can you still submit on-page forms? Does the validation messaging render correctly? Is everything correctly aligned?
  • Margins, borders and alignment: Make sure things align correctly, and that the spacing between elements is as you’d expect. It’s easy to add unnecessary white space (or not enough), so make sure you compare against mock-ups/prototypes correctly.
  • Spelling, punctuation and grammar: Ensure consistency for capitalisation, punctuation, quotation marks, hyphens, etc. You should have the text in your Project Plans (not just embedded in a mock-up) beforehand to make this a simple copy/paste job for the dev. But even then, mistakes happen.
  • Typography: Check font styles, sizes, weight, variant, colour etc. to make sure they’re accurate. Chrome Dev Tools is a great resource to help understand the page better for this.
  • Element states: The hover state of links and buttons, how text behaves if you’re a new vs. returning user, what banners or merchandising people see based on their browsing history, etc. – again, dynamic behaviour should be noted and accounted for.
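The copy checks above can even be partially automated. A minimal sketch, assuming the planned copy is listed in your Project Plan and the rendered text has been pulled from the page – both arrays here are illustrative stand-ins:

```javascript
// Sketch: compare copy from the Project Plan against what the experiment
// actually renders, to catch capitalisation/punctuation drift.
// "rendered" stands in for element.textContent values read from the page.
const planned = ["Add to bag", "Free delivery over £50"];
const rendered = ["Add to Bag", "Free delivery over £50"];

const mismatches = planned.filter((copy, i) => copy !== rendered[i]);
// mismatches flags "Add to bag" – the capitalisation doesn't match the plan
```

Even a crude exact-match check like this catches the copy/paste slips that slip past a visual review.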

Functionality & Experience

  • Obvious functionality: Do click/hover/other interactions work as expected? What if you interact with nearby or similar elements (similar element names, classnames, etc.)? Are you triggering anything by mistake?
  • Related functionality: If you’re hooking into click behaviours, are you breaking existing functionality? This applies not just to clicks, but to onscroll events, exit-intent movements, etc. too. Being able to read your dev’s source code really helps here so you know what you’re looking for, e.g. use of Element.onclick instead of addEventListener.
  • Surrounding elements: Check navigation, image galleries, etc. to make sure they still function.
  • Hidden sections: Form fields, accordions, tabs etc. can all reveal additional content – make sure these work and that the hidden content works as expected.
  • External page interference: Make sure the test and related code only executes on the pages you intend for them to.
  • Submissions and links: Do they work, and do they take you where you expect to go? Links especially can be easy to break.
  • JS errors and warnings: Are any being thrown? What about in different variations, or when you’re not previewing the test? If there’s an increase in these, it may be because of your test – you could break other people’s code whilst yours continues to work.
  • Performance: Does the test load reasonably quickly? Any noticeable delays?
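The Element.onclick pitfall mentioned above is worth seeing concretely. Here’s a minimal sketch using a plain stub object in place of a real DOM element (so no browser is required): assigning to onclick replaces whatever handler the site already registered, whereas addEventListener is additive.

```javascript
// Stub mimicking a DOM element's click dispatch: the onclick property
// holds a single handler, while addEventListener accumulates handlers.
const button = {
  listeners: [],
  onclick: null,
  addEventListener(type, fn) { if (type === "click") this.listeners.push(fn); },
  click() {
    if (this.onclick) this.onclick();
    this.listeners.forEach(fn => fn());
  },
};

const calls = [];
button.onclick = () => calls.push("site handler");        // existing site code
button.onclick = () => calls.push("experiment handler");  // experiment code clobbers it!
button.addEventListener("click", () => calls.push("safe listener")); // additive instead

button.click();
// calls is ["experiment handler", "safe listener"] – the site handler was lost
```

This is exactly why an experiment can appear to work perfectly while silently breaking the site’s own click behaviour.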

Reporting & Conversions

  • Does it work? Do all clicks, page-loads, submissions, custom events and custom data metrics track, and match the Project Plan? The positive use-case of expectations matching reality.
  • Unintended tracking? Try running CSS selectors in the console – are the elements returned the only ones you wanted to catch? What about regular expressions – are they perhaps not tight enough? Grab a bunch of site URLs and run them through a tool like RegEx Pal to make sure. This applies to where your test runs, as well as where conversions track – are you running your test on unintended pages? Will you break them, or capture views where you’re not making the intended change?
  • Analytics integrations: Does the data roll up into the analytics tool too, if integrated? Having that second source of data is a useful thing to be sure of beforehand.
  • Don’t forget the control: It’s easy to track your metrics for your Variants, but don’t forget about their equivalent for the Control group. That’s just as important to make sure metrics like Lift Over Control are meaningful.
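For the regex-tightness point, here’s a minimal sketch of sanity-checking a URL-targeting pattern against a sample of site URLs before launch – the pattern and URLs are illustrative assumptions, not a real site’s:

```javascript
// Sketch: verify a targeting regex only matches the pages you intend.
// Intended scope: product detail pages only.
const targeting = /\/product\/[^/]+$/;

const urls = [
  "https://example.com/product/red-shoes",          // should match
  "https://example.com/product/red-shoes/reviews",  // should NOT match
  "https://example.com/products",                   // should NOT match
];

const matched = urls.filter(u => targeting.test(new URL(u).pathname));
// matched contains only the first URL
```

The same check works for conversion-location patterns: feed in a bunch of real URLs and confirm the matches are exactly the pages you meant to catch.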

Whilst this may seem like a lot, we believe the process should resemble the depth of testing you may do when releasing a new version of your website/app, which can often take days and considerable amounts of effort.

Chris Kennedy, our Services Director, notes:

“Quality can be the silent killer of your experimentation programme. Any deviation from what is intended can adversely handicap or advantage an experiment and mislead. Whether it’s a quick text change or a huge data engineering project, we don’t allow any wiggle room. We thoroughly check our changes over a number of devices and scenarios to ensure the integrity of our results is beyond reproach.”

That’s it. Should be a breeze, right? Keep in mind that if you are this thorough, you too can build bulletproof tests which stand the best chance of success.

In our next post, we’ll be looking more into the strategy of how to approach testing – when, how, who to involve, etc. Keep an eye out for it, or follow us on LinkedIn and we’ll let you know when it’s released!