OPINION
21 April 2016

Testing significance

Behavioural science | Impact | Opinion | UK

Making a proper business decision can often mean trying a ‘slightly wrong’ option, rather than a ‘boringly right’ one, argues Rory Sutherland.


If you were to ask me what is the most valuable thing I have done in my working life, I think the answer’s quite easy. I once asked a client in the restaurant trade what they did when an item on the menu wasn’t selling very well.

“We drop the price,” they replied.

“Good idea. But do you try raising the price first?”

“No.”

“Go on, try it occasionally.”

They did. The first time they took my advice, demand went up. A lot.

To be honest, I was lucky. The tenets of mainstream economics are mostly correct in that increasing the price of something will depress demand, but it’s not an iron-clad rule. For instance, when people choose from a menu, the usual price-demand relationship is weakened. Some hungry diners may be disposed to buy pricier items. Also, since restaurant visits are often seen as a guilty treat, there may be a disposition not to skimp.

Being boringly right is a poor aspiration

In one sense, I was giving bad advice. The balance of probabilities was that raising the price would cut demand, but I wasn’t interested in that. My point was that the cost of being wrong was different in the two cases. Drop the price and demand falls – catastrophe; increase it and demand falls – survival; increase it and demand rises – [adopt Borat voice] High five!

Being conventionally right is often what people are looking for in decision-making. It’s a good way to protect your job, after all. But, if you wish to make a proper business decision, being boringly right is a poor aspiration; apart from anything else, you will end up like your competitors. It’s also important to consider the consequences of each eventuality alongside the likelihood of the outcome. If the cost of trying the wrong answer is low, and the upside potentially high, it’s worth trying a ‘slightly wrong’ answer.
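
To make that asymmetry concrete, here is a minimal sketch in Python. The prices, probabilities and cover numbers are invented purely for illustration; they are not figures from the restaurant example. The point is only that the option most likely to ‘work’ is not necessarily the one with the better expected outcome or the safer worst case.

```python
# Hypothetical payoffs for a menu item currently selling 100 covers a week at £10.
# All probabilities and quantities below are invented for illustration.
options = {
    "cut price to £8":    {"price": 8.0,  "outcomes": [(0.8, 110), (0.2, 95)]},
    "raise price to £12": {"price": 12.0, "outcomes": [(0.6, 85), (0.4, 115)]},
}

for name, opt in options.items():
    # Expected weekly revenue, plus the best and worst cases.
    expected = sum(prob * units * opt["price"] for prob, units in opt["outcomes"])
    worst = min(units * opt["price"] for _, units in opt["outcomes"])
    best = max(units * opt["price"] for _, units in opt["outcomes"])
    print(f"{name}: expected £{expected:.0f}, worst case £{worst:.0f}, best case £{best:.0f}")

# With these made-up numbers, the price rise is the more likely to cut demand,
# yet both its worst case and its best case beat the 'safe' price cut:
# the consequences matter as much as the probabilities.
```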

I notice this disparity between ‘reliability of findings’ and ‘consequences of findings’ when talking to academics. Social scientists are obsessed with p-values. Any experiment where the probability of the result arising by chance alone is less than 5% (p < 0.05) is regarded as statistically significant and is hence career-advancing; fall just outside that range and you have just wasted three years of your life.
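
To illustrate how knife-edged that cut-off is, here is a small, self-contained sketch; the conversion counts are invented. Two near-identical test results land on opposite sides of p = 0.05.

```python
from math import erf, sqrt

def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for a difference in proportions (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # 2 * (1 - Phi(|z|))

# Two hypothetical experiments, 1,000 people per cell, differing by one conversion.
for label, (x1, x2) in {"A": (118, 91), "B": (117, 91)}.items():
    p = two_proportion_p_value(x1, 1000, x2, 1000)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"Experiment {label}: p = {p:.3f} -> {verdict} at the 5% level")

# Experiment A comes out around p = 0.048 and B around p = 0.057: one conversion
# in two thousand flips the verdict, though the evidence is essentially the same.
```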

I occasionally ask academics whether they have any interesting failed experiments we can use. You see, if just 20% of people do something anomalous 10% of the time, that’s useless in an academic paper but valuable to a business.

What I didn’t know until recently was that, when it first arose, the creators of statistical significance testing were alert to this problem. William Sealy Gosset, better known as ‘Student’ and the deviser of the ‘Student’s t-test’, regarded the figure of 0.05 as arbitrary. He believed the degree of certainty you should look for depended heavily on what use you planned to make of the result. If you were intending to place a huge bet with a small return, 0.05 was still dangerously high; for other purposes it was far too low. Gosset worked for Guinness, and could see that the degree of certainty required depended on the consequences. R A Fisher, unusually for a certified genius, more or less overruled him.

Many in the economic and decision sciences regard this arbitrary barrier as absurd; for instance, here’s Deirdre McCloskey’s view:

Let us examine the 5% rule of statistical significance. When a gambler bets, does she insist on 19-to-1-on odds before choosing a horse? What does a rational brewer do about the 5% rule when buying hops to make a beer he sells for profit? Should the UK Parliament or US Congress enforce a rule of 19-to-1 odds, or better, for a new surgical procedure that may save the life of the nation’s leader?

Yet this arbitrary number is still oddly influential. There may be a reason that it is prevalent in academia – academics’ sole currency is reputation, and to bullet-proof it you need to prove you are right. But the currency of business is money. To make money, you don’t need to be right – you simply need to be less wrong than someone else. Deirdre McCloskey believes that significance testing has caused significant loss of life in the pharmaceutical industry. In the rest of industry, I suspect, it has caused significant loss of wealth.

Rory Sutherland is vice-chairman, Ogilvy & Mather UK

2 Comments

9 years ago

Our industry's obsession with significance testing at a 95% confidence level is bizarre, but so engrained that people assume it is sacrosanct. One free-thinking client said to me a few years ago that if research tells him that there is a 67% chance that the blue one is better than the red one, that was enough for him to make and justify a decision - especially when lives aren't at risk. But he is unfortunately the exception. It is one more example of quantitative researchers hiding opinions behind a wall of data - and as we rarely see significant differences in the sorts of sample sizes we tend to work with, it's a perfect excuse for not making any decisions at all. And thereby not upsetting anyone. And having an easy life. And we wonder why market research doesn't have a prominent enough voice in client organisations...


9 years ago

What I also find interesting is how we continually ignore effect sizes. We get excited when a small effect size is statistically significant and disappointed when a large effect size isn't quite significant. We need to do a much better job interpreting data using all of our knowledge about statistics. It's not JUST statistical significance or JUST effect sizes. It's those plus context plus actionability plus reliability and validity.
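
A quick sketch of that point, with invented numbers: a trivial effect in a huge sample clears the 5% bar easily, while a sizeable effect in a small sample just misses it.

```python
from math import erf, sqrt

def p_value_and_cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sided p-value (normal approximation) and Cohen's d for two group means."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2) / se
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    d = (mean1 - mean2) / sqrt((sd1**2 + sd2**2) / 2)
    return p, d

# Tiny effect, enormous sample: 'significant' but practically negligible.
p, d = p_value_and_cohens_d(5.02, 1.0, 50_000, 5.00, 1.0, 50_000)
print(f"tiny effect, n = 50,000 per cell: p = {p:.3f}, Cohen's d = {d:.2f}")

# Large effect, small sample: just misses the 5% cut-off.
p, d = p_value_and_cohens_d(5.60, 1.0, 20, 5.00, 1.0, 20)
print(f"large effect, n = 20 per cell:   p = {p:.3f}, Cohen's d = {d:.2f}")
```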
