Claire Emes
25 June 2014

Avoiding the Big Data traps

By applying some of the basic principles of survey research and statistics, many of the shortcomings of Big Data can be overcome, says Ipsos MORI Digital’s Claire Emes.

Much has been written on the pros and cons of Big Data. The White House, for instance, recently published a report titled ‘Big Data: Seizing Opportunities, Preserving Values’, which examines how Big Data is changing the way we live and work.

What follows is certainly not a definitive guide. Rather, it is a brief critique of Big Data from a researcher’s perspective, and a look at how, by applying some of the basic principles of survey research and statistics, we can overcome many of its shortcomings and unlock its value.

Size isn’t everything

In his book, The Signal and the Noise: Why So Many Predictions Fail but Some Don’t, Nate Silver suggests the quantity of information in the world is increasing by 2.5 quintillion bytes per day, but the amount of useful information almost certainly isn’t. He explains that most of it is just noise, and the noise is increasing faster than the signal. There are so many hypotheses to test and so many data sets to mine, but according to Silver there is only a relatively constant amount of objective truth to find.
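
The point about having many hypotheses to test can be illustrated with a small simulation. The sketch below is not taken from Silver’s book; it is a minimal illustration under stated assumptions: every series is pure random noise, and the function names (pearson, count_spurious), the series length of 20 and the cut-off of 0.444 (roughly the 5% two-sided significance threshold for a correlation based on 20 points) are chosen here for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_series=1000, n_points=20, threshold=0.444, seed=1):
    """Count noise series that appear 'significantly' correlated with a noise target."""
    rng = random.Random(seed)
    # The target is random noise, so any correlation found with it is spurious.
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    hits = 0
    for _ in range(n_series):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        if abs(pearson(target, candidate)) > threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    print("Noise series passing the 5% threshold:", count_spurious())
```

With 1,000 candidate series we would expect roughly 5% of them, around 50, to clear the threshold by chance alone, even though none contains any signal. Testing more series produces more apparent ‘findings’ without adding any truth.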

It’s quality not quantity

Taking the principle of bigger isn’t always better one step further, I’d suggest it’s not only not better, it can actually be worse. A number of proponents of Big Data refer to a Big Data set as one where ‘N = All’, where we no longer have to sample as we have access to the entire background population. But is ‘N = All’ really a good description of most available data sets? Do we ever really have all of the data?

As the economist Tim Harford and Microsoft’s Kate Crawford, among others, point out, most Big Data sets contain systematic biases, and it takes careful thought to identify and correct for these skews. Big Data sets can seem comprehensive, but ‘N = All’ is often a seductive illusion.

Think of social media. It is in principle possible to record and analyse every message on Twitter and use the results to draw conclusions about the public mood. But even if we analyse every tweet, Twitter users are not representative of the population as a whole. According to Ipsos MORI’s Tech Tracker, only 15% of the UK population are on Twitter, and they are disproportionately young and from higher social grades. In most situations, we’d be better off analysing a far smaller but representative sample of the population we wish to understand.
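
A small simulation makes the same point in numbers. The sketch below is illustrative only: the population, the opinion rates and the platform membership rates are all assumptions invented for the example (they are not Tech Tracker figures), and the function name simulate is hypothetical. It compares an estimate based on every member of a skewed platform with one based on a simple random sample of 1,000 people.

```python
import random


def simulate(seed=0, population_size=200_000):
    """Compare a large skewed data set with a small random sample."""
    rng = random.Random(seed)

    # Assumed population: 30% are young; the opinion of interest is held by
    # 70% of young people and 40% of everyone else.
    population = []
    for _ in range(population_size):
        young = rng.random() < 0.30
        holds_opinion = rng.random() < (0.70 if young else 0.40)
        population.append((young, holds_opinion))
    true_share = sum(opinion for _, opinion in population) / population_size

    # 'Found' data: everyone on a platform that young people are far more
    # likely to use (assumed membership: 40% of the young, 5% of the rest).
    platform = [opinion for young, opinion in population
                if rng.random() < (0.40 if young else 0.05)]
    big_estimate = sum(platform) / len(platform)

    # Survey data: a simple random sample of 1,000 people.
    sample = rng.sample(population, 1000)
    small_estimate = sum(opinion for _, opinion in sample) / len(sample)

    return true_share, big_estimate, len(platform), small_estimate


if __name__ == "__main__":
    true_share, big_estimate, big_n, small_estimate = simulate()
    print(f"True share in population:     {true_share:.3f}")
    print(f"Platform estimate (n={big_n}): {big_estimate:.3f}")
    print(f"Random sample estimate (n=1000): {small_estimate:.3f}")
```

Under these assumptions the platform data set is tens of times larger than the survey sample, yet its estimate sits well above the true figure because young people are over-represented. The random sample, although much smaller, lands close to the truth.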

Known unknowns and unknown unknowns

Another issue is that if we rely on ‘found’ data alone, we’re constrained by what exists. As Nate Silver pointed out in his interview with Ipsos MORI’s CEO, Ben Page, “The credit rating agencies in advance of the crunch had millions of observations on individual mortgages, but all from a period when housing prices were increasing”.

It can be risky to rely entirely on past observable behaviour and algorithms. If you have no idea what is behind a correlation, you have no idea what might cause that correlation to break down. This is particularly concerning when people feel that they can be more certain about their predictions because the size of the data set means that they’ve got the numbers to back it up.

Perhaps we could go so far as to suggest that Big Data can be dangerous. Big Data can mean big errors. The data can be wrong or misleading, but more often than not the errors lie in the interpretation rather than in the data themselves. This is frightening if authorities wrongly predict a health scare (or fail to predict one) and frustrating if a company tries to sell you something you already have or simply aren’t interested in.

Further, Big Data models do not just predict; they can make things happen by creating a behavioural loop. A person feeds in data, which is collected by an algorithm that then presents the person with choices, thereby steering behaviour. This can create efficiencies, but it’s easy to see how it could result in yet more data skews or be abused.

So Big Data can be unwieldy, misleading and possibly even hazardous but, despite this, we and many of our clients are genuinely excited about the opportunities it presents. Many of the projects we’re undertaking today leverage Big Data sources and techniques and we expect this to apply to even more of our work in the future.

Our experience suggests there are some key principles we need to consider to ensure we don’t fall into any Big Data traps.

Ask the right questions. As with any study, a crucial element in managing a Big Data project is asking the right questions. In particular, how you define the problem determines which data you need and what you can conclude from it.
Evaluate the data. Once we’ve established what data we need, it’s important that we evaluate the quality of the data. As with most things, what you get out is only as good as the data you put in.
Be aware of your assumptions. Not only do we need to understand the data and where it has come from, but we also need to consider the assumptions behind any models that the data is fed into and how these may differ from reality.
Underpin with statistics. We should apply the rules of small data to Big Data: we need to understand any skews in the data and the uncertainty around our predictions. This is familiar territory. As researchers, we underpin our work with statistics, we are used to dealing with bias, and we present our survey data with confidence intervals. Big Data should be no different.
Consider the privacy implications. The research industry is well placed to address the weighty issues surrounding privacy and data protection because anonymity and respect for the individual are core considerations in our work.
Combine data sources. Our experience has demonstrated that we create the most value by combining data sources: a client’s operational data, third party data, ‘found’ or public data, data we’ve collected passively via monitors and – more often than not – survey data (as we’ve found Big Data raises as many questions as it answers). This ‘data mash-up’ provides a more holistic picture of the ‘what’ and the ‘why’ allowing us to generate richer insights.
Tell a story. Finally, the most sophisticated data analysis in the world won’t cut through if you don’t find a compelling way to communicate your findings.
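
As an illustration of the ‘underpin with statistics’ principle, the sketch below computes a 95% confidence interval for a survey proportion using the standard normal approximation. It is a minimal example rather than a description of any particular Ipsos MORI method; the function name proportion_ci and the example figures (150 out of 1,000 respondents) are assumptions made for illustration.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    z=1.96 corresponds to a 95% confidence level. The approximation
    assumes a simple random sample and is unreliable for very small n
    or for proportions close to 0 or 1.
    """
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - margin), min(1.0, p + margin)


if __name__ == "__main__":
    # Hypothetical survey: 150 of 1,000 respondents say they use a service.
    low, high = proportion_ci(150, 1000)
    print(f"Estimate: 15.0%, 95% CI: {low:.1%} to {high:.1%}")
```

Note that an interval like this captures sampling error only. It says nothing about systematic skews in who is included in the data, which is why a very large but biased data set can produce a narrow interval around the wrong answer.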

In summary, while Big Data may not be the answer to all our questions, it can certainly make a very useful contribution and, when combined with other sources of insight, can help us develop a deeper understanding of people’s motivations and behaviour.

Claire Emes is head of Ipsos MORI Digital

We hope you enjoyed this article.
Research Live is published by MRS.

2 Comments

BigDataGuru

11 years ago

Small minds = small industry.

Matt Champagne

11 years ago

I enjoyed this article. This one is bookmarked. ;)
