Sponsor

Generative AI changed everything. Fully synthetic audiences didn't

Generative AI has transformed how the insights industry works. But when it comes to replacing real respondents with fully synthetic audiences, the promise fell short. Samuel Cohen argues it is time to separate what works from what doesn't.

Let us give generative AI its due. In the space of two years, it has fundamentally reshaped the insights industry for the better. It has accelerated analysis, automated reporting, unlocked new ways to process unstructured data and made sophisticated natural language tools accessible to every researcher with a browser. These are real, lasting gains. No serious person disputes them.

But there is one specific application where the technology has not delivered: fully synthetic audiences. The idea that a generic large language model, trained on the open internet, could reliably stand in for real human respondents has been tested – and found wanting. When researchers evaluated these outputs against real panel data, the results too often failed to capture the nuance and variance that define genuine consumer insight.

This distinction matters. Dismissing generative AI wholesale because one application underperformed would be as foolish as the original over-claiming. The more productive question is: why did fully synthetic audiences fail, and what does a better approach look like?

The answer lies in the data, not the algorithm.

When you ask a generic LLM to "act like a consumer", it returns the statistical average of the internet. It smooths out the very differences that researchers are paid to discover. Two 65-year-old retirees can differ enormously on almost any attitude or behaviour, yet a generic prompt collapses all of that richness into a single, bland composite. The signal is weak because the model draws on everything, which means it represents no one in particular.
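The averaging problem can be made concrete with a toy calculation. The sketch below is illustrative only: the scores are invented for the example, and `generic_model_scores` is a hypothetical stand-in that hands every simulated respondent the group mean, which is a deliberate simplification of how a generic model behaves.

```python
import statistics

# Invented 1-10 purchase-intent scores for ten 65-year-old retirees.
# These numbers are made up for illustration; they come from no real panel.
real_scores = [1, 2, 2, 3, 9, 9, 10, 2, 8, 10]


def generic_model_scores(scores):
    """Hypothetical stand-in for a generic prompt: everyone gets the group mean."""
    mean = statistics.mean(scores)
    return [mean] * len(scores)


synthetic_scores = generic_model_scores(real_scores)

# Population standard deviation measures how much respondents differ.
real_spread = statistics.pstdev(real_scores)
synthetic_spread = statistics.pstdev(synthetic_scores)

# Distance from the average to the closest real respondent.
mean_score = statistics.mean(real_scores)
closest_gap = min(abs(score - mean_score) for score in real_scores)

print(f"real spread: {real_spread:.2f}")
print(f"synthetic spread: {synthetic_spread:.2f}")
print(f"gap between the average and the nearest real respondent: {closest_gap:.2f}")
```

In this toy data the group splits into people who score low and people who score high. The averaged output matches the mean exactly but has no spread, and no real respondent sits near it.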

The concept of the digital twin takes a fundamentally different approach. Rather than generating fictional respondents from the open internet, a digital twin is a one-to-one simulation built from the deepest, most specific data available – grounded in multi-dimensional panel data spanning demographics, behavioural patterns, transactional history and attitudinal sentiment.

Crucially, these twins must be strictly siloed. Unlike open-ended LLMs, purpose-built digital twins are constrained by the specific, validated panel data they are trained on. When a twin responds to a concept test, its reaction is shaped exclusively by the real profile of the respondent it was modelled on – not by the statistical average of the internet. If the data is not in the underlying panel, the twin cannot hallucinate an opinion about it.
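As a rough sketch of what "siloed" could mean in practice, the example below models a twin that can only answer from one respondent's own recorded data and declines otherwise. Everything here is an assumption made for illustration: the `DigitalTwin` class, its field names and the sample topics are hypothetical, and a plain dictionary lookup stands in for whatever model a real system would condition on the profile.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class DigitalTwin:
    """Hypothetical twin backed only by one respondent's recorded panel answers."""

    respondent_id: str
    panel_answers: Dict[str, str] = field(default_factory=dict)

    def respond(self, topic: str) -> Optional[str]:
        # Answer only from this respondent's own data.
        # A topic missing from the panel yields None rather than a guess.
        return self.panel_answers.get(topic)


# Invented profile for illustration only.
twin = DigitalTwin(
    respondent_id="r-001",
    panel_answers={"streaming_price_sensitivity": "high"},
)

covered = twin.respond("streaming_price_sensitivity")
uncovered = twin.respond("electric_vehicles")

print(covered)
print(uncovered)
```

The design choice worth noting is the explicit "no answer" path: a topic outside the respondent's data produces nothing, instead of falling back on general knowledge.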

This architecture allows researchers to either access pre-built audiences from premium data providers, or instantiate their own proprietary audiences by uploading past quantitative and qualitative studies to train company-specific twins. The result is a defensible, directional signal grounded in real data, not a generic guess.

But even with this data-first architecture, the industry must hold itself to a higher standard of honesty than the first wave of synthetic hype demanded.

Digital twins are not a replacement for real field research. High-stakes decisions – your foundational segmentation, your multi-market brand tracker – will always require the rigorous investment of traditional fieldwork. That is not going to change, nor should it.

Where digital twins excel is in the ‘squeezed middle’: the hundreds of medium-stakes decisions that marketing and product teams make every year without any customer validation, simply because traditional research is too slow or too expensive for every question. Optimising ad messaging on a Tuesday afternoon. Testing a pricing scenario before a Friday board meeting. Validating a product concept before committing development resources.

In these scenarios, digital twins offer a powerful complement to traditional methods. They provide directional insights, not definitive conclusions. They are the ‘second-best to fieldwork’ option – faster, more affordable, and transparent about the gap.

Generative AI has given the insights industry extraordinary new capabilities. The mistake was assuming those capabilities extended to replacing real human respondents with internet averages. They do not. But when the same technology is grounded in premium, siloed panel data, it becomes something genuinely useful: a way to keep the customer at the centre of everyday decisions, while reserving rigorous fieldwork for the questions that demand it most.

The lesson is not that AI failed. The lesson is that the data underneath matters more than the model on top.

Samuel Cohen is chief executive and founder at Fairgen, an AI company building infrastructure for simulated audience research. Contact: samuel@fairgen.ai

Research Live is published by MRS.

