NEWS 13 March 2024

Data quality: ‘We cannot solve this unless we come together’


UK – Data quality in research is a growing problem that the industry must address collectively, including by determining the cost of not mitigating fraud, according to a panel at the MRS annual conference.


Speaking at the conference yesterday (12th March) in a session moderated by Debrah Harding, managing director at MRS, Melanie Courtright, chief executive at the Insights Association, said there is a need to get the language right when discussing data quality.

Courtright said: “I think it’s important to start with language. Data quality is the umbrella term, so everything to do with the fitness of a data set. It includes a lot of things other than fraud. Fraud is an element of it – a malicious intent to obfuscate your identity. If we are going to talk about the state of data quality, we need to start with a common, accurate language and measure against that language.”

Courtright explained that the Global Data Quality initiative, a coordinated effort by the Insights Association, MRS and seven other industry associations, has completed its glossary and will next focus on releasing a set of industry benchmarks.

Asked for his perspective as a supplier, Simon Glanville, managing director at Ronin, said the problem of fraud is worsening. “I’m approaching this very much from a B2B perspective. 90% of the work we do is with a business audience, often with a very senior business audience, where the incentives to participate are quite high. Typically if we’re doing a broad recruitment campaign, we’ll see somewhere around 20% of fraudulent activity. We believe it’s getting more organised and that’s something we can try to collectively tackle as an industry. It’s an increasing problem and it’s not going to just go away.”

The problem is not confined to market research. Glanville pointed out that LinkedIn has done a lot of work to address the issue of fake profiles – for example, in late 2022, the number of employees listed as working for Apple on the social network dropped from over half a million to fewer than 285,000 after LinkedIn cracked down on bot accounts.

Glanville added: “That’s the kind of scale that we’re looking at. If you’re looking at validating profiles, then we need to look at smart tools.”

Debbie Lawrence, group head of data management at the London Stock Exchange Group, discussed how the data quality issue manifests in the financial sector, saying: “Having fraudulent or fake data going out into the market would be catastrophic from a reputational standpoint. Fake data in financial services would be catastrophic. It would be on the front page of the Financial Times.”

Lawrence said that while the financial services sector worries about data quality and lineage, it avoids using the word “fraud” in this context. “The financial services industry is powered by data. Financial portfolio decisions are based on data. We don’t use the word ‘fraud’ because it obviously has a kneejerk reaction in any financial organisation. We do worry about data quality and data lineage.”

A highly regulated environment is the industry’s “friend” when it comes to data, added Lawrence, because it forces transparency. She said: “When you get audited – we provide financial data all around the world – we have to be able to prove where the data came from and that’s really key. That’s different from the world [of market research]. The concept of relying on a population to source data would be deemed risky in our world, because we have to be able to prove provenance, and for that we need tools. You need to be able to invest in capabilities for the data as it flows through systems.”

Discussing approaches to address data lineage in the financial services industry, Lawrence said a lot of organisations are investing in technology to address it, adding: “It requires investment and it involves a majority of organisations to get behind the technology.” 

However, Lawrence noted: “As an industry, you have to look at cost-benefit. What is the cost of fraud in your industry, and is it worth investing to mitigate against it? At some point, you get to a tipping point where everybody goes ‘this is getting too much’ or the fraudulent data creates some sort of market moving event, as we would call it in financial services. Nobody wants something splashed across the front of a newspaper off the back of fraudulent data.”

On the next steps that should be taken to address data quality in research, Courtright urged the sector to get involved in the Global Data Quality initiative.

“We have nine strong associations who are going to make a difference in this area, so my encouragement is to get involved in that initiative, wherever you are in the world. Each of those industry organisations has a work stream. We’re going to build a warehouse of standards and best practices and we’re going to lean into starting certifying some of those things. I would encourage you to start thinking about that now and to get involved because we cannot solve this unless we all come together,” Courtright said.

The industry also shouldn’t shy away from AI as a tool to help tackle data quality, said Lawrence. “You’ve got to run towards AI, not away from it but also find a way to digitise your data as much as possible. Start small, be prepared to fail fast. Work out what to invest in.”