Advanced MVTesting: Site-wide Elements Testing in Google Website Optimizer
Many sites are built around templates, or around some system for reproducing common elements throughout the navigation. These site-wide elements often contribute heavily to conversion, yet they are easy to overlook from a multivariate testing perspective. The purpose of this post is to describe a common problem in test development and to offer a method for working around it that may help produce great gains. The methodology was developed for a retail site and, given sufficient traffic, can be reproduced effectively in a relatively short time.
Google Website Optimizer is my free multivariate tool of choice. Personally, it has not let me down. There are times when it takes some creative leaps to uncover the appropriate methodology, but I have yet to run into a situation where I cannot produce the test I need by the means provided. The testing described here is what I would consider VERY ADVANCED. I say this because it requires a high level of tool familiarity, a solid understanding of the available approaches, site operations/mid-level developer programming skills, and significant pre-test documentation and elements preparation.
Step 1. Choose your elements and variations wisely.
There are standard lists of the global page elements that users value and that contribute, in part, to the conversion goal of a site. These usually include graphic and text link calls-to-action, application cues, logos, headlines and taglines, trust elements, brand affiliations, unique selling propositions, and rotational zones. Attach custom tracking to each one to quickly uncover its relative value in the context of each presentation within the path. Identify the elements most strongly related to your conversion goal across a variety of paths, and plan your test around them.
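As a rough illustration of ranking tracked elements by their relation to conversion, here is a minimal Python sketch. The record layout, the element names, and the `rank_elements` helper are all hypothetical; your tracking export will look different, and a simple difference from the baseline rate is only a first-pass screen, not proof that an element causes conversions.

```python
from collections import defaultdict

# Hypothetical tracking export: one record per visit, listing which tracked
# global elements the visitor clicked and whether the visit converted.
visits = [
    {"clicked": {"logo", "headline_cta"}, "converted": True},
    {"clicked": {"trust_badge"}, "converted": False},
    {"clicked": {"headline_cta"}, "converted": True},
    {"clicked": {"logo"}, "converted": False},
    {"clicked": set(), "converted": False},
    {"clicked": {"trust_badge", "headline_cta"}, "converted": False},
]


def rank_elements(visits):
    """Return (element, lift_over_baseline, visits_with_click), best first."""
    baseline = sum(v["converted"] for v in visits) / len(visits)
    seen = defaultdict(int)  # visits in which the element was clicked
    won = defaultdict(int)   # of those visits, how many converted
    for v in visits:
        for element in v["clicked"]:
            seen[element] += 1
            won[element] += v["converted"]
    ranked = [(e, won[e] / seen[e] - baseline, seen[e]) for e in seen]
    return sorted(ranked, key=lambda row: row[1], reverse=True)


for element, lift, count in rank_elements(visits):
    print(f"{element}: {lift:+.3f} vs. baseline over {count} visits")
```

Elements near the top of the list across several paths are the candidates worth building variations for.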
Step 2. Consult Best Practices
Research the common and best practices for each element that you have found to correlate positively with the conversion metric. Apply several variations of the top practices to each element. Isolate the elements, prepare the variations, install them where they need to be, and test them before incorporating them into the test setup. Ensure that each one fits its space exactly as the code describes it. Copy and paste all of the underlying code into very comprehensive tables; this saves time if you have to set the test up again and helps with organization and statistical modeling.
Step 3. Set Up the Google Website Optimizer Framework
If you’ve used Google Website Optimizer, you’re probably aware of how the tagging works. For global elements, those which occur on every page throughout the site, you will want to ensure that the script snippets are placed as close to the top and bottom of the OUTPUT HTML as possible. This means they may have to go into your template, or your dynamic page framework as necessary. This may include placing them inside of a PHP file or some other server side application execution file. As long as they appear correctly in the output, this won’t create a problem.
Step 4. Set Conversion Goal
You probably already have a page which you consider your ultimate indicator of final conversion. For a retailer this is usually the thank-you page. That is where you will install your conversion goal tag. In some cases, where traffic seems insufficient or other obstacles exist, it may be advisable to use a conversion proxy instead. This sometimes sets off purists, who feel that a goal which merely correlates with conversion is not enough to make the testing outcomes clearly relevant. For our purposes, the closer the goal is to actual conversion the better, so long as traffic in UNIQUE VISITORS is high enough to produce statistical validity over 4-5 weeks. If it is not, try cart additions or the shipping page as the proxy.
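To get a feel for whether traffic is "high enough," a back-of-the-envelope estimate can help. The sketch below uses the standard normal-approximation sample size for comparing two proportions; it is not how Google Website Optimizer computes its own reports, and the base rates, the 20% relative lift, the significance level, and the power are assumptions chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_combination(base_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate unique visitors needed per combination to detect the lift.

    Two-sided two-proportion test, normal approximation.
    """
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Assumed rates: 3% of visitors reach the thank-you page, 10% add to cart.
final_goal = visitors_per_combination(base_rate=0.03, relative_lift=0.20)
cart_proxy = visitors_per_combination(base_rate=0.10, relative_lift=0.20)
print("thank-you page goal:", final_goal, "visitors per combination")
print("cart-addition proxy:", cart_proxy, "visitors per combination")
```

Multiply the per-combination figure by the number of combinations in the test and compare it with the unique visitors you expect over 4-5 weeks. Under these assumptions the higher-rate proxy needs far fewer visitors, which is the practical argument for using one when traffic is thin.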
Step 5. Install Variation Splices and Build Element-Variable Catalog
Take the time to carefully match up the sensitive areas of your site code with the variations you intend to apply to them. Insert the testing script code where it needs to be. (Remember, the scripts need to be ordered and set correctly based on the OUTPUT of server-executed code.) Once these splices are in place, Google will present the interface for creating variations and give you the opportunity to load in the code from your table. Name the variations according to some system of organization that promotes quick and simple identification. Test them over and over again to ensure that you have not negatively impacted any major feature of your site. (I can speak from experience that these codes can play hell with site-search features when they're hosted away from your normal code. Make certain you know whether that page is included in or excluded from your testing. Having to reset is very frustrating.)
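One possible shape for the element-variable catalog and its naming system is sketched below in Python. The section names, the variation labels, and the `section__vN__label` naming scheme are hypothetical; the point is only that a consistent scheme lets you enumerate every combination a full-factorial test will serve.

```python
from itertools import product

# Hypothetical catalog: each tested section maps to its variation labels.
# Index 0 is always the original so the control is easy to identify.
catalog = {
    "header_logo": ["original", "logo_with_tagline"],
    "cart_cta": ["original", "green_button", "text_link"],
    "trust_zone": ["original", "security_seal"],
}


def variation_name(section, index, label):
    """Build a name that sorts by section and shows the variation number."""
    return f"{section}__v{index}__{label}"


def all_combinations(catalog):
    """List every combination of one variation per section."""
    sections = sorted(catalog)
    named = [
        [variation_name(s, i, label) for i, label in enumerate(catalog[s])]
        for s in sections
    ]
    return list(product(*named))


combos = all_combinations(catalog)
print(len(combos), "combinations")
print(combos[0])
```

Each added variation multiplies the combination count, so the catalog doubles as a quick check on whether the test is still a size your traffic can support.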
Step 6. Do a Full Final Review of Every Variation
Look for things like variation duplication, missing or misspelled image location URLs, text variation miscues, color issues, anything that doesn’t look right. Close out each one on your spreadsheet or checklist. Make sure that you have every possibility looked into and have another person come in and look at each. Extra eyes help at every step.
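Part of this review can be automated before the human pass. The following Python sketch flags variations within a section that are identical once whitespace is collapsed, and image URLs that look malformed. The variation names and HTML are made up, and the script does not fetch anything, so a well-formed URL pointing at a missing image will still need to be caught by eye.

```python
import re
from urllib.parse import urlparse

# Capture the src attribute of each <img> tag (quoted values only).
IMG_SRC = re.compile(r'<img[^>]*\bsrc=["\']([^"\']*)["\']', re.IGNORECASE)


def normalize(html):
    """Collapse runs of whitespace so formatting differences are ignored."""
    return " ".join(html.split())


def review(section_variations):
    """Return a list of problem descriptions for one section's variations."""
    problems = []
    first_seen = {}
    for name, html in section_variations.items():
        key = normalize(html)
        if key in first_seen:
            problems.append(f"{name} duplicates {first_seen[key]}")
        else:
            first_seen[key] = name
        for src in IMG_SRC.findall(html):
            scheme = urlparse(src).scheme
            bad_scheme = scheme not in ("", "http", "https")
            if not src.strip() or " " in src or bad_scheme:
                problems.append(f"{name} has a suspicious image URL: {src!r}")
    return problems


# Hypothetical variations for a single call-to-action section.
cta_variations = {
    "cta_v0": '<a href="/cart"><img src="/img/add.png" alt="Add to cart"></a>',
    "cta_v1": '<a href="/cart"><img src="htp://example.com/img/add-green.png" alt="Add to cart"></a>',
    "cta_v2": '<a href="/cart"><img  src="/img/add.png" alt="Add to cart"></a>',
}

for problem in review(cta_variations):
    print(problem)
```

A clean run does not replace the second pair of eyes; it only clears the mechanical mistakes so the reviewer can concentrate on how each variation actually looks.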
Step 7. Cross Your Fingers and Execute
Once you’ve gone down the list at least a half-dozen times, hit the button and let it live.
Step 8. Restart the Test with a Copy After Fixing What You Overlooked
Every test which I have set up so far with a medium/high degree of difficulty has required that I immediately stop the test after a rep or associate finds something wrong with a variation. I’ve come to love this process and the Touch of Grey it creates. Set it up again, relaunch and prepare to repeat steps 6-8 until you start getting clean tests with good data.
Within a couple of days, you can get the first series of relevance ratings. In a few more, you should start to see divergence. Within a couple of weeks, a clear winner will usually get way out ahead. Just sit back and let the data do the work for you. By 40 days or so, depending on the number of combinations and the traffic, you should have a very complete package and incredible insight into what YOUR customers are doing and relating to as valuable on your site. This should give rise to several other analysis scenarios and, voilà, multivariate testing success.
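If you export the raw counts, you can sanity-check a "clear winner" yourself. The Python sketch below estimates the chance that a combination beats the original using a normal approximation to the difference between two conversion rates. It is an independent approximation, not the calculation behind Google Website Optimizer's reports, and the visitor and conversion counts are invented.

```python
from math import sqrt
from statistics import NormalDist


def chance_to_beat(control_conv, control_visitors, variant_conv, variant_visitors):
    """Approximate probability that the variant's true rate exceeds the control's."""
    p_c = control_conv / control_visitors
    p_v = variant_conv / variant_visitors
    se = sqrt(p_c * (1 - p_c) / control_visitors
              + p_v * (1 - p_v) / variant_visitors)
    if se == 0:
        return 0.5  # no variation in the data, so no evidence either way
    return NormalDist().cdf((p_v - p_c) / se)


# Invented counts: 120 of 4000 visitors convert on the original,
# 160 of 4000 on the leading combination.
confidence = chance_to_beat(120, 4000, 160, 4000)
print(f"chance to beat original: {confidence:.1%}")
```

Early in a test the counts are small and this figure swings widely, which is one reason to let the test run its full course rather than calling it at the first sign of divergence.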
Hope this is helpful. I know it's kind of complicated. Feel free to send in comments or questions. More to follow.
Daniel W. Shields
Daniel added the following ...
Hey, the feed is fixed. We’re running full tilt.
The 30-minute cookie is a problem. The cookie is limited to the session. My guess is that Google will become aware of this and work on some type of fix if people continue to give feedback about their test designs and the results they're seeing. As I see it, the squeaky wheel gets the grease.
As for the servers…that’s quite a problem. So, if I understand correctly the call to the server ignores the information on the cookie from a previous session? I’ll talk to my friends in the IT department in the morning and see what they have to say about that.
Thanks for reading. Sorry for the delay in the response. I’ll try to speed things up from now on.
Sincerely,
Daniel W. Shields
Analyst/Contributor

Corey Mathews added the following ...
Hi Daniel - welcome aboard! Website Optimizer is top of mind for me right now. We recently ran a large campaign to test 2 different home page designs with Website Optimizer, and while the functionality was there - and great for a free service - I had 2 big issues:
1) The default 30-minute cookie timeout meant that we couldn’t automatically limit a user to version a or version b for the entire campaign. We were able to modify the _utimeout variable to address this, but we lost some time before I figured that out.
2) We load balance our site to 2 front-end servers, each with a separate subdomain. Website Optimizer dropped a different cookie for each, which meant that even with the timeout issue addressed, it was impossible to lock them to a or b because they could load balance to the other server when they came back (or at least had a 50% chance of doing so).
Based on that, I don’t think I’ll use it for any sort of large-scale multivariate testing again. It would be fine for testing conversion rates on different ad copy, for example, but we’ll explore other options for anything more comprehensive.
P.S. Could you please get your feed set up so I can stay up to date?