OPINION 22 February 2024

Leveraging academia and working together can improve data quality


Striking the right balance in tackling data quality issues is like walking a tightrope, says Karine Pepin, but industry collaboration and leveraging academic research can help to navigate this complex problem.


In recent years, the discourse surrounding data quality has reached new heights within market research circles. From industry conferences to podcasts and social media discussions, the spotlight on data integrity has never been brighter.

The inception of the Global Data Quality partnership, a coalition of trade associations, underscores the industry’s commitment to combating fraud and elevating data quality standards.

Addressing data quality is a multi-level challenge that involves grappling with myriad factors, ranging from sample validation (or lack thereof) to survey design, participant engagement, and technology fraud (eg bots, click farms, AI-generated responses and IP obfuscation), says Carrie Campbell, a member of CASE, an industry group created to ensure a quality foundation for marketing data intelligence. It is a constant game of cat and mouse: as the industry becomes more sophisticated in its fraud detection, fraudsters become more adept at bypassing those measures.

With increased awareness of these issues has come heightened scrutiny of the data. Given that many of these challenges lie beyond researchers’ direct control, we are compelled to double down on strategies within our reach, like implementing rigorous in-survey quality controls and conducting thorough post-survey data cleaning. However, striking the right balance is akin to mastering the delicate art of walking a tightrope – leaning too far forward or backward risks a fall.

Balancing vigilance with participant experience

Our instinct as researchers is to safeguard our studies, ensuring that we continue to deliver the highest quality insights to our clients. While the intent to maintain data integrity is laudable, the lack of consensus on effective quality control measures can lead to unintended consequences.

Although retaining fraudsters in our analysis clearly harms the research, rejecting legitimate participants is also problematic: it may waste time and resources, introduce bias into the insights, and weaken the research ecosystem by alienating valuable participants.

As Mary Draper, vice-president, network partners and quality at EMI Solutions, says: “While we have to be aggressive and thorough to ensure we're not letting the bots and click farms monopolise our datasets, we also have to be very careful that we're not making survey taking a negative experience for real respondents. We must have both diligence and grace.” 

Time for clarity

The repercussions of false negatives and false positives extend far beyond individual studies, significantly impacting the broader research ecosystem. To tackle this challenge, the industry must collaborate to identify the most effective quality control measures. Efrain Ribeiro, CASE member, advocates for concerted research efforts to establish industry standards – a sentiment echoed by many in the field.

“The industry collectively should conduct research to determine the best approach to ‘cleaning’ survey results,” Ribeiro asserts. “I'm not sure why there isn't more work done in this area and shared across our industry. This type of research goes a long way toward establishing industry standards – something the industry has always needed.” 

Leveraging academic insights

Fortunately, the market research community can draw upon a wealth of academic literature dedicated to understanding and addressing data quality issues.

Brooke Reavey, associate professor of marketing, says: “The academic world is concerned about fraudulent data because of the impact that it can have on decision making in various areas such as health care, public policy, and business recommendations. In the grand scheme, academics serve as a neutral third party who identify patterns and suggest frameworks to help solve societal problems. Additionally, many academics are also operating on a shoestring budget. It’s not unusual to hear of an academic with an annual budget of $2000 to spend on their research. Thus, they are often swayed by CPI and assume that because their study has a high incidence rate, they are immune to fraudulent data.”

As a member of the Insights Association Data Integrity Initiative, I discussed in a recent LinkedIn post my resolution for 2024 to review the academic literature and share my findings with the market research community. With almost 200 reactions, 100 comments, and many individuals already sharing their thoughts and resources, the energy surrounding this initiative is palpable.

Understanding which quality control measures work effectively is paramount. We invite those passionate about advancing data quality in market research to join the initiative in reviewing academic papers and sharing insights with the community. 

By harnessing the wealth of knowledge available and fostering collaboration, we can collectively contribute to ensuring the sustainability of the industry. 

Karine Pepin is co-founder at The Research Heads