The Web Analyst Case for Acceleration to Experimentation
Early in my training as a ‘web analyst’, Paul (my boss and the COO of CableOrganizer.com) set out some goals for me, including checkpoints I had to reach before I could be considered ‘mature’ in the role. Those checkpoints were tracking and reporting KPIs and insights on a weekly basis, running regularly scheduled usability tests, and successfully completing a multivariate test. I set my sights on those goals while Paul prepared the road for installing a data-driven decision process in the business.
The purpose of this post is to explain the progression from carbon-based report handler to fully realized web analyst. It will attempt to point to areas which may help new and developing analysts gain momentum in perfecting their craft. Lastly, it will outline a quick experiment which might help prime the process engine. (If you’re looking for ways to improve your value, I’m handing you the recipe.)
For reference, the testing I will be describing includes:
- A/B or Split Testing Using Google Website Optimizer - testing a single variant element per URL, with a script directing traffic to each URL. Goal success is then attributed to the page.
- Full Factorial Multivariate Testing with GWO - testing multiple elements on a single URL simultaneously, where the tested items are identified within the page and rendered randomly. Goal success is distributed across the table of elements per variation and aggregated statistically to determine how relevant each element is to success, compared against all variations and the original.
- Usability Testing - a qualitative test to gain insight into a user’s session experience on a given site, where a professional, random, or representative subject is observed performing tasks and navigating a web site. These can be performed by usability experts in labs, by simple observation and analysis, or through a handful of testing services that relay audio/video recordings back to the customer.
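To make the "full factorial" idea concrete, here is a minimal sketch of how the number of combinations grows when several page elements are tested at once. The section names and versions are hypothetical examples, and this only illustrates the counting; it is not how GWO builds or serves its combinations.

```python
from itertools import product

# Hypothetical page sections and their candidate versions.
# The first entry in each list stands for the original.
sections = {
    "headline": ["original", "benefit-led", "question"],
    "image": ["original", "lifestyle"],
    "button": ["original", "green", "large"],
}

def full_factorial(sections):
    """Return every combination of one version per section."""
    names = list(sections)
    version_lists = [sections[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*version_lists)]

combinations = full_factorial(sections)
print(len(combinations))  # 3 headlines * 2 images * 3 buttons = 18
```

Every combination needs its own share of traffic, which is one reason a small A/B test is an easier place to start.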
These experimentation methods, in their simplest form, provide an analyst with a set of tools to validate their assertions. The outcome of GWO tests which are set up correctly and run to completion provides invaluable statistical justification for keeping or replacing elements within a website. Usability testing these same areas and elements should augment the data by adding a qualitative perspective to the findings.
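As a rough illustration of what "statistical justification" can look like, the sketch below runs a standard two-proportion z-test on the visit and conversion counts for an original page and one variation. This is a common textbook check and an assumption on my part, not a description of the statistics GWO uses internally; the function name and the counts are made up for the example.

```python
import math

def two_proportion_z_test(conv_a, visits_a, conv_b, visits_b):
    """Compare two conversion rates; return (z, two-sided p-value)."""
    rate_a = conv_a / visits_a
    rate_b = conv_b / visits_b
    # Pooled rate under the assumption that both pages convert equally.
    pooled = (conv_a + conv_b) / (visits_a + visits_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: original page vs. one variation.
z, p_value = two_proportion_z_test(50, 1000, 80, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be noise; a large one means the test has not yet separated the two pages.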
Making experimentation the goal forces you to develop creative hypotheses. Looking back, this seems to be the most essential rite of passage into the active practice of analytics. Measurement, assertions, and hypotheses are already part of the analytical psyche; knowing there are systems in place to support or diminish our statements forces us to think forward. It is by this means that assertions, testing, and ultimately improvements to the site become more innovative and increase the chances of greater success.
Continued success in testing and measuring site improvements for a primary goal (e.g., conversion) increases the value of an analyst and their standing as an authority among colleagues. It has been my personal experience that the more you test and objectively report complete results, the more weight your contribution is given. Sometimes the results do not support your hypothesis, and it is equally important that these results are given to the appropriate people. Should the analyst be lucky enough to be surrounded by highly intelligent peers, the discussion that follows each hypothesis, whether it succeeds or fails, should be equally fruitful in insights on which to base future hypotheses.

GWO experiments need not be enormous and complicated from the start. Get into testing by making up four or five alternative headlines for a high-traffic page. Try each of the testing methodologies. Here’s a quick test to try just to get the mechanics down:
- Identify and analyze a page for testing with decent traffic and lackluster performance. (This will help benchmark performance so the impact is easier to see.)
- Create four or five suitable alternatives (with at least one marginally poor headline to create divergence).
- Make as many copies of the page as there are variant headlines.
- Rename the variation pages and supply the new URLs to Google.
- Install the GWO-provided scripts on the original page.
- Install the GWO-provided script on the variant pages.
- Install the GWO conversion goal script on the goal page.
- Test the scripts
- Execute test
After a few days or weeks, depending on the level of traffic and the apparent difference between your variations, you should see some divergence that begins to support or, possibly, diminish your claim. (Again, regardless of the outcome, so long as you have clear data, the test should be considered a success.)
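To show why traffic and the size of the difference drive the "days or weeks" above, here is a minimal sketch that estimates how many visitors each variation needs before a difference in conversion rate becomes detectable. It uses the standard normal-approximation sample size formula for comparing two proportions; the conversion rates, the 5% significance level, the 80% power, and the traffic figure are all assumptions for illustration, and GWO's own stopping rule may differ.

```python
import math
from statistics import NormalDist

def visitors_per_variation(base_rate, variant_rate, alpha=0.05, power=0.8):
    """Approximate visitors needed per variation to detect the difference."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = base_rate * (1 - base_rate) + variant_rate * (1 - variant_rate)
    effect = base_rate - variant_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

def days_to_run(n_per_variation, variations, daily_visitors):
    """Days needed if traffic is split evenly across all variations."""
    return math.ceil(n_per_variation * variations / daily_visitors)

# Hypothetical page: 5% baseline conversion, 1,000 visitors a day,
# original plus four alternative headlines (five variations in total).
subtle = visitors_per_variation(0.05, 0.055)  # headlines that barely differ
strong = visitors_per_variation(0.05, 0.08)   # headlines that clearly differ
print(days_to_run(subtle, 5, 1000), days_to_run(strong, 5, 1000))
```

The closer the variations perform to one another, the more visitors, and therefore days, the test needs before any divergence shows up.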
(I’ll publish an edit with some photos and some tips on usability here when I have the time.)
Getting a test under your belt can be an enormous benefit. Just knowing you can perform a test makes you think differently. That aspect of perspective is a huge step in getting to where the real analysis takes place.
Daniel added the following ...
Hey Bill. Thanks for the comment.
What you’ve stated is true…and I agree with you. The purpose of this is to give a web analyst (who may have relatively little experience with multivariate testing) an understanding of how divergence among the elements being tested can impact the test. Also, it might be a point of interest to people who’ve already been testing but are unsure why their experiments are taking forever to reach statistical significance. GWO does not ‘complete’ a test until the relevance rating has reached a certain threshold of validity. Building in a dog element can help foster expedience.
In addition, to respond to the latter statement, if all the tested elements are very strong, the divergence will occur versus the original.
Again, thanks for the note.
Daniel

Billy Shih added the following ...
I love this sentence:
“Should the analyst be lucky enough to be surrounded by highly intelligent peers, the resulting discussion from success or failure from each hypothesis should be equally as fruitful in insights on which to base future hypotheses.”
It’s often hard to convince people that results, whether they’re spectacular or lukewarm, always teach you something that can help you improve the page.
I’m curious about this statement though:
-Create four or five suitable alternatives (with at least one marginally poor headline to create divergence).
Why is it necessary to create divergence? And why with a marginally poor headline?
I believe if you test sufficiently different headlines, you will create some divergence regardless. Still, if all of the titles are good, I don’t see why that would negatively impact the test or your results.