Synthetic data ‘lacks logical consistency’, finds Strat7 study
The report, ‘Putting Synthetic Data to the Test: A Real-World Evaluation,’ outlines findings of an independent comparison of synthetic data quality, with Strat7 researchers comparing two synthetic data providers using evaluation criteria drawn from typical client use cases.
Strat7 conducted the study in partnership with customer data science company Dunnhumby, whose Shopper Thoughts community provided the respondent data used in the evaluation.
The study found that both synthetic data providers created boost data that broadly aligned with real data on basic measures such as brand awareness, generally matching patterns in the holdout data (the data set not ‘boosted’ synthetically). However, the researchers identified significant issues when they conducted more complex analyses.
The study found:
- Boosted responses lacked the logical consistency of real people
- A ‘bunching effect’, where fewer responses were found at the extremes of the scale (illustrated in the sketch after this list)
- Divergent outcomes in key driver analysis
- Inconsistent segmentation.
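To make the ‘bunching effect’ concrete, one simple check is to compare what share of responses sits at the scale endpoints in each data set. The sketch below is purely illustrative: the 1–5 scale, the function name and the toy numbers are assumptions, not details taken from the study.

```python
# Illustrative check for the 'bunching effect': synthetic responses
# clustering mid-scale, with fewer answers at the extremes.
# The 1-5 scale and all data here are invented for illustration.

def extreme_share(ratings, low=1, high=5):
    """Fraction of responses sitting at the scale endpoints."""
    return sum(r in (low, high) for r in ratings) / len(ratings)

holdout = [1, 2, 5, 3, 5, 1, 4, 2, 5, 3]   # real (withheld) responses
boosted = [3, 3, 4, 3, 2, 4, 3, 3, 4, 2]   # synthetic 'boost' responses

print(f"holdout extremes: {extreme_share(holdout):.0%}")   # 50%
print(f"boosted extremes: {extreme_share(boosted):.0%}")   # 0%
```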
Based on its findings, Strat7 has recommended that the use of synthetic data should be limited to no more than 5% of the overall sample – and should only be used to boost underrepresented demographic groups.
The company has also concluded that synthetic data is not suitable for analyses involving segmentation, key drivers or behavioural prediction.
Peter Strachan, insight director at Strat7, said: “There’s real promise in synthetic data, but as it stands today, it’s not ready to take centre stage. However, as we all know, the technology is evolving rapidly. This study simply represents a snapshot of its current capabilities – not its ultimate potential. To put it another way: the technology is the worst it’s ever going to be – it’s only going to improve. We’ll keep testing, and we’ll keep challenging it to improve.”
Strat7 said both providers involved in the study – which it has not named – have acknowledged the limitations highlighted and have said they are developing improved versions.
Research Live spoke with Tabita Razaila, Strat7 head of operations, and Hasdeep Sethi, group AI lead at Strat7, to further discuss the study.
Strat7 conducted the research to examine whether synthetic data can supplement or replace real survey responses, particularly in hard-to-reach demographics.
The business started exploring the issue over a year ago as it wanted to understand how synthetic data models look at real data and populate it with synthetic responses based on different criteria, according to Razaila.
Discussing the background to and motivation for conducting the study, Razaila said: “We started chatting to this supplier, we got access to the platform, we took two projects that we had already run for some of our clients, uploaded those to the tool and played with it to see what it looked like. We saw different things coming out of it – parts where it wasn’t doing very well and parts where it was doing well – but we realised that the projects that we used were not necessarily fit for purpose because they were not very complex. They didn’t have all the question types.”
Razaila continued: “All of the suppliers have their own success criteria, which you can understand when you're doing a trial and you can see it and have a metric for it, but then afterwards when you do real life projects, because you have nothing to compare it against, you don't know if it’s worked. Basically, you need to rely on the trial and what you've done there to say if it’s worked or not.”
Strat7 also wanted to be able to make recommendations to clients. Razaila said: “We also had a few clients in the last year that have asked us about [synthetic data], but they didn’t know much about how it works, so they were trying to ask us and rely on our experience. So that’s why we drove this forward – so that we know where we actually stand. Is it something that we want to offer, or is it something that we’re taking a step back from for now until the tools develop more?”
To conduct the study, Strat7 came up with five success criteria. While synthetic panel providers have their own definitions of success, these are “normally quite narrow,” according to Sethi. “It’s valid, but it’s not enough for us to have confidence that we can use it,” he said.
Accuracy was the first of the criteria. Sethi said: “The top line, basic criteria was: does it look like real data, when you look at it question by question?
“Then, we looked at the data in more detail with what we call consistency and variation – so, when you look across questions, when you do cross tabs, do those relationships hold up? Is there the same spread of responses you get in a human sample?”
The final two criteria related to key drivers (would synthetic and real data point to the same conclusions when analysing which factors most influence purchasing behaviour?) and segmentation (would synthetic and real data produce similar groups when clustering respondents based on their answers?).
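As an illustration of the consistency criterion Sethi describes, a basic check is whether the relationship between two questions (a cross tab) holds up in the boosted data. The sketch below is a minimal example assuming pandas is available; the question names and responses are invented, and this is not Strat7’s actual analysis.

```python
# Minimal cross tab consistency check: does the relationship between
# two questions hold up in the synthetically boosted data? The question
# names and all responses are invented; this is not the study's method.
import pandas as pd

holdout = pd.DataFrame({
    "aware_of_brand": ["yes", "yes", "no", "yes", "no", "yes"],
    "buys_monthly":   ["yes", "yes", "no", "no",  "no", "yes"],
})
boosted = pd.DataFrame({
    "aware_of_brand": ["yes", "no", "yes", "no", "yes", "no"],
    "buys_monthly":   ["no",  "yes", "no", "yes", "no", "yes"],
})

for name, df in [("holdout", holdout), ("boosted", boosted)]:
    xtab = pd.crosstab(df["aware_of_brand"], df["buys_monthly"], normalize="index")
    print(name, xtab, sep="\n")
# In the holdout, awareness and purchase move together; the invented
# boosted data inverts that relationship: the kind of cross-question
# inconsistency the study flagged.
```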
Approach
Using the success criteria, the study compared artificial responses against real survey data from a purposefully designed questionnaire sent to Dunnhumby’s online research community.
The researchers split real survey responses into ‘training’ and ‘holdout’ sets (the first representing an incomplete data set that is being supplemented by the synthetic responses, and the second being the data set that was withheld from synthetic providers and can be used as an independent benchmark for comparison with the synthetically boosted data).
The study’s aim was to measure how well synthetic ‘boost’ data (artificially generated responses created by the two providers to supplement the existing real data) matched the withheld real responses.
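A minimal sketch of that design, assuming the survey responses live in a pandas DataFrame; the column name, the 50/50 split and the hard-coded ‘boost’ rows are placeholders rather than details from the white paper.

```python
# Sketch of the evaluation design: split real responses into a
# 'training' set (shared with the synthetic providers) and a 'holdout'
# set (withheld as the benchmark), then compare topline results for the
# boosted sample against the holdout. All data, the split ratio and the
# placeholder boost rows are invented for illustration.
import pandas as pd

real = pd.DataFrame(
    {"aware_of_brand": ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"]}
)

training = real.sample(frac=0.5, random_state=0)  # given to providers
holdout = real.drop(training.index)               # withheld benchmark

# Providers return synthetic 'boost' rows to top up the training set;
# a placeholder boost is hard-coded here so the comparison runs.
boost = pd.DataFrame({"aware_of_brand": ["yes", "yes", "no", "yes"]})
boosted = pd.concat([training, boost], ignore_index=True)

# Topline check: does the boosted sample match the withheld real data?
print("holdout:", holdout["aware_of_brand"].value_counts(normalize=True).to_dict())
print("boosted:", boosted["aware_of_brand"].value_counts(normalize=True).to_dict())
```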
Strat7’s study found that, at a basic level, both synthetic data providers performed relatively well, suggesting that synthetic data can reasonably approximate real data for reporting top-line metrics such as brand awareness or purchase frequency.
However, the results of synthetically boosted data differed ‘significantly’ depending on which provider was used, according to the white paper. Additionally, the results suggested that current synthetic data technology, while sophisticated enough to produce responses to individual questions, cannot produce coherent respondents: logic and consistency broke down when researchers compared responses across multiple questions.
The study also found that key driver analysis resulted in different conclusions about what drives consumers when using synthetic data, compared with real data.
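One way to see how key driver analysis can diverge is to rank drivers by their relationship with the outcome in each data set and compare the orderings. The sketch below uses plain correlations with invented driver names and data; the article does not say which key driver technique Strat7 actually used.

```python
# Illustrative key driver comparison: rank drivers by their correlation
# with an outcome in each data set and check whether the order agrees.
# Driver names, data and the correlation-based approach are assumptions;
# the article does not describe Strat7's actual key driver method.
import pandas as pd

holdout = pd.DataFrame({
    "quality": [1, 2, 3, 4, 5, 5, 2, 4],
    "price":   [3, 3, 2, 4, 3, 3, 2, 4],
    "intent":  [1, 2, 3, 4, 5, 5, 2, 4],
})
boosted = pd.DataFrame({
    "quality": [3, 3, 2, 4, 3, 3, 2, 4],
    "price":   [1, 2, 3, 4, 5, 5, 2, 4],
    "intent":  [1, 2, 3, 4, 5, 5, 2, 4],
})

for name, df in [("holdout", holdout), ("boosted", boosted)]:
    drivers = df.drop(columns="intent").corrwith(df["intent"])
    ranking = list(drivers.sort_values(ascending=False).index)
    print(f"{name} driver ranking: {ranking}")
# The two data sets rank the drivers differently, so an analyst would
# reach different conclusions about what matters most to buyers.
```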
Challenges
Understanding the different terminology involved in synthetic data proved a challenge, according to Sethi. “There’s a training data set, a holdout set, a test set, and those terms take a bit of getting used to.
“There is an operational challenge with synthetic data because you have to communicate what those things mean to clients if they want to use synthetic data.”
The paper highlights that as there is no clear industry standard for evaluating synthetic data, the choice of provider could have a significant impact on research results. Is there a need for an industry standard?
Sethi said: “There probably is a need for one, but it’s tricky to know what that should be because the accuracy metric they [synthetic providers] return is valid, but it’s an academic one, so it’s saying that if you introduce a boost to the real data, it can predict the holdout better than if you just had the real data on its own. So it’s trying to create data which almost reduces the variance when you go and predict something new.
“I think that’s valid, but the problem is, clients that we work with don’t understand that and probably don’t care about that. They care about: ‘Could that boost be a real person?’ So, the metric would have to be at the respondent level … [such as] how realistic is that respondent across all the questions – that you have some kind of way of expressing that as a percentage term. It sounds doable, but every survey is different, and it’s hard to normalise this stuff.
“So, I think there is a need for it, but what that metric is, is another matter – and it’s probably for the panel providers to think about, as well as us.”
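To unpack the metric Sethi describes: the providers’ claim is that a model trained on real-plus-boost data predicts the withheld holdout better than one trained on the real data alone. The sketch below illustrates that comparison with a deliberately naive mean predictor and invented numbers; the providers’ actual models and metrics are not public.

```python
# Sketch of the providers' accuracy framing: does adding the synthetic
# boost to the real training data predict the withheld holdout better
# than the real training data alone? The 'model' here is just a mean
# predictor and every number is invented for illustration.

def mse(predicted, actual):
    """Mean squared error of one predicted value against a list of actuals."""
    return sum((predicted - a) ** 2 for a in actual) / len(actual)

training = [3, 4, 2, 5]        # real responses shared with the provider
boost = [4, 3, 4, 5, 3]        # synthetic responses returned as a boost
holdout = [4, 4, 3, 5, 4]      # withheld real benchmark

real_only = sum(training) / len(training)                    # 3.50
with_boost = sum(training + boost) / len(training + boost)   # ~3.67

print(f"real-only prediction error:  {mse(real_only, holdout):.3f}")
print(f"with-boost prediction error: {mse(with_boost, holdout):.3f}")
# A lower with-boost error is what the providers' metric rewards; as
# Sethi notes, it says nothing about whether any individual synthetic
# respondent is coherent across questions.
```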
