OPINION 20 September 2011

Failed experiments


Everyone likes to talk about the successes of their research techniques and methodologies, but we often tend to brush the failures under the carpet. Not GMI Interactive’s Jon Puleston – here he lays bare his survey design mistakes.


I witnessed a case of this at a conference I recently attended where someone was talking about conducting video-based online interviews, which we experimented with a few years ago with not much success. I could see they were heading in the same direction and I feel a bit guilty now that we did not publish the findings from this failed experiment at the time.

So I thought I’d make up for this by laying bare a few of our dead-ends as a gesture towards encouraging more open collaboration on what works and what doesn’t in survey design.

Not an engaging performance
Three years ago, we invested a lot of thought and effort in experimenting with video surveys, where we got actors to deliver the questions because we thought this might make the surveys more engaging. It was a failure on several fronts. Firstly, the research company we worked with kept changing the questions, not an uncommon issue – in fact it’s the norm – and so we had to re-record the actors speaking the script three times at tremendous cost. Secondly, integrating these videos made the survey really slow and cumbersome to load, restricting it to people with decent broadband connections, which is perhaps less of a problem now in many markets. The third problem we faced – and which we did not even think about when starting out – was that up to one third of people were doing our surveys in public situations, like offices, where it would be annoying for other people to hear someone speaking the survey questions.

We experienced a more than 30% drop-out rate from these surveys which really rather undermined the whole exercise.

That’s not to say that the video did not work as an engagement technique – it did for those that were prepared to watch it – but we found that by using a silent animation technique instead we could stimulate a similar quality of response and this approach was far more cost effective.

The realities of virtual shopping
Virtual shopping is a very popular idea, and a virtual shopping module looks great in anyone’s portfolio of technical online survey solutions. But to be candid, we have found it almost impossible to properly emulate a shopping experience online.

Here are the problems. Firstly, looking at a group of products on a web page is nothing like the experience of being in a shop looking at products on the shelves. In a shop the shelf at eye level gets the most attention and the products on the top and bottom shelves are not looked at so often. On a web page our eye scans from the top left to bottom right naturally (in western markets) and so there is no way you can effectively model the same experience.

The second issue is one of pure statistics. Most real shopping experiences have around 20 or so competitive products to choose between. If you do the maths, with 20 products there is, on average, only a 1 in 20 chance that a test product will be selected, and to get a statistically significant measure of whether one product will sell more than another you need at least 50 selections, ideally 100. That means sample cells of 1,000 to 2,000 per design variant. So if you were, say, testing five designs you might need to interview 10,000 people, which is simply not economical.

Naive to this fact, when we first started creating virtual shops we were designing them with 50 to 100 products, with samples of a few hundred resulting in only two or three purchase instances of any one product, which made it almost impossible to make any real sense of the data.
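The arithmetic behind these sample sizes can be sketched in a few lines of Python. This is only a rough illustration, assuming every product on the shelf is equally likely to be picked and each respondent makes one selection; the function names are hypothetical and not part of any GMI tool.

```python
def required_cell_size(n_products: int, target_selections: int) -> int:
    """Respondents needed per design variant so the test product is
    expected to be chosen `target_selections` times, assuming each of
    the `n_products` is equally likely to be picked (a simplification)."""
    return target_selections * n_products


def total_sample(n_products: int, target_selections: int, n_variants: int) -> int:
    """Total interviews needed when every design variant gets its own cell."""
    return n_variants * required_cell_size(n_products, target_selections)


def expected_selections(sample_size: int, n_products: int) -> float:
    """Expected number of times any one product is chosen in a cell."""
    return sample_size / n_products


if __name__ == "__main__":
    print(required_cell_size(20, 50))    # 1000 respondents per cell
    print(required_cell_size(20, 100))   # 2000 respondents per cell
    print(total_sample(20, 100, 5))      # 10000 interviews for five designs
    print(expected_selections(300, 100)) # 3.0 picks per product
```

The last line shows why the early 100-product shops produced so little usable data: a cell of 300 people yields only about three expected purchases of any one product.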

There are also further cost factors involved. When we started out, we would spend days designing wonderfully accurate renditions of a supermarket shelf, with 3D depth effects and shadowing. We even went to the lengths of experimenting with 3D environments, which were taking up to three weeks to create with a price tag of £10k to £20k per project – just to create the shelves. Unfortunately, the average budget to design-test a product is less than £10k – including sample – and there are often significant time constraints.

Screen resolution is also problematic. If you cram 20 or so items on a page it becomes almost impossible to get a real sense of what each product looks like or to read the labels and details, and this factor alone is enough to make many comparisons almost meaningless. When you are in a shop looking at products it’s amazing how much detail you can see and pick up on without even holding the items; that is simply missing when you are looking at a fuzzy pixel on a screen.

The solution we have reverted to for design testing is really quite a simple shelf with between three and eight competitive products on display, and we have developed a virtual shopping module that enables us to create these dynamically without having to do any elaborate design work, cutting down dramatically on cost and creation time.

Incompatible formats
We are in the business of developing creative questioning techniques and have come up with quite a number of different formats over the last few years – but some of them have crashed and burned along the way.

Multiple drag and drop questions
Dragging and dropping is a particularly problematic format, especially if you want people to drag multiple items – we have found that people get bored of the process very rapidly, which restricts some of the most creative applications of this technique. Dragging and dropping is a brilliant solution for a single-choice selection process because it can produce measurable reductions in straightlining and improved data granularity. But if you have three brands, say, and a range of attributes that you want respondents to drag and drop onto the brands, you would be much better off doing this with conventional click selections.

The flying words question
We had this idea that we could make tick selection more fun if we made the options fly across the screen and ask people to click on them as they went past. It took one experiment to realise it was not going to be a very usable question format. A combination of bemusement among respondents and the realisation that not everyone is 100% focused, 100% of the time resulted in about half the number of clicks being registered compared to a traditional question.

Opinion snowboarding
This is a question format where respondents are presented with a snowboarding avatar which they have to steer through gates to indicate their choices. It looked fantastic and I was very excited when we first developed it – and most respondents thought it was a fun way of answering questions. The problem is that in the world of survey design ‘most’ is not enough – around 15% of people found the format annoying and what we got out of the back of it was really quite chaotic data from people who could not make up their minds in time. We tried to slow the snowboarder’s descent to give people more time to think but that just started to annoy another group of people who became frustrated with how long the thing was taking.

We have not given up on this question format yet though. We are working on a simplified version using a single pole that respondents ski to the left or right of, which seems to perform much better, and we feel it may have an interesting use for implicit association-style research where you force people to make quick decisions. But as a swap-out variant for a conventional grid question? Forget it.


Jon Puleston is head of GMI Interactive. This is an edited version of a post that originally appeared on his blog Question Science. Republished with permission.