Q&A: WholeSum’s Emily Kucharski on finding richer signals

Research Live speaks to Emily Kucharski, co-founder at qualitative data analysis platform WholeSum, to hear about the opportunity that scaling text analysis offers research.

Emily Kucharski

What’s your background, and how has it led to founding WholeSum?

My background is very much touching on the market research and consumer insight space. I was at creative agencies for a decade – J Walter Thompson at WPP and then Saatchi and Saatchi at Publicis, among others. I did a lot of creative testing, market research, consumer insights work, and I helped launch a couple of insights consultancies while at WPP.

I saw how much money and time is invested in understanding audiences, but also how rigid research at scale often is because of what's easiest to analyse. There were very few open-ended questions asked at scale – those were handled on much smaller sample sizes in a qualitative stage.

This really became what we call in the startup world a 'hair on fire' problem in my previous venture: three years ago, as a frustrated parent, I built a parenting insight app. It involved crowdsourcing experiences from parents of young kids around all sorts of topics. We initially asked closed-ended multiple-choice questions because those were easier to analyse, but users wrote in to say we were not capturing the full breadth of their nuanced experiences. People wanted us to ask free-text questions, so we adopted the ChatGPT API to help us do that.

Initially, it felt like magic, but I quickly felt the pain – I was seeing quotes that were clearly made up and numbers that weren't adding up.

Integrity was crucial to what we were doing – my co-founder is a statistician and data scientist – and it just wasn't good enough. I was spending a huge amount of time checking and amending these summaries. We started building a better data science pipeline to make that work because it was such a pain point.
Over the course of two years, iterating almost weekly on the proprietary data we were getting from our parenting community, we built something that we then realised was advanced in terms of text analysis.

We started talking to market research agencies and universities in particular. I wanted to go to the types of organisations doing manual qualitative analysis to see if it would pass the test with them. Once we got that validation, we expanded to larger-scale use cases as well.

How does WholeSum work? How does it compare with a large language model (LLM)?

Our mission is helping organisations find the signals that matter most from complex audience text data – and that can be quite broad. It can be open ends, it can be transcripts, social media conversations, reviews, field team notes, feedback, etc.

Where we differ from LLMs is that we are first and foremost a statistical inference framework with statistical models. We use LLMs in a very constrained way for what they're very good at, which is finding the patterns in language.

It is auditable – you can figure out why each signal or theme has been attributed to individuals – and reproducible – you can put the same data in twice and you get the same results, which an LLM will not do.

We are able to process vast amounts of data. LLMs are pretty good at small sample sizes, but the quality of analysis and performance quickly degrades the more you put in. There's the hallucination side of it as well, which everyone is so familiar with. We don't use generative steps to compile things like quote summaries and numbers – that is something we don't use AI for.

WholeSum received pre-seed funding of £730,000 earlier this year. What’s the plan for that investment?

First and foremost – hiring and upping our technical capabilities. I am more on the commercial, applied side, dealing with customers and growing the business.

My co-founder is the statistician and he’s built the product. We are hiring a founding engineer who is an AI computer science specialist and also another applied scientist.

This space is moving very quickly. AI is, of course, moving very quickly. So, R&D is crucial as an ongoing priority for us as well.

Are you working with any market research agencies?

Yes, we've done a lot of pilots. Last year was our pilot year of validation, and we’ve just launched our API. A lot of those pilots were with market research agencies.

Where it gets interesting is what this sort of technology opens up long term. What we found is that it's often the quant researchers who get most excited about how study design would change if you could ask open-ended questions and quantify them as easily as you can with multiple-choice and closed questions.

Over time, you could start to uncover more meaningful trends and predictions, because qualitative data can contain much richer signals than quant.

I think one of the biggest compliments we've had was from a very experienced qualitative researcher who used the tool and said that even with more time, they wouldn't have got to the same results.

Can you share any examples of use cases?

We're currently working with big pharma on field team notes. Their field reps go out and have 10 conversations a day with doctors and clinicians, often with very little time. Their shorthand notes, written in free-text form, sit in databases, and historically they have really struggled to extract signals from them and retain key information.

What we've been able to do with them is, first, structure the notes in a way that can enrich CRMs; they found a lot of their CRM information was out of date because the qualitative wasn't being joined up with historical quant data. We're also looking at the emerging themes and signals that come out of that – emerging adoption barriers, threats and opportunities.

The third step we do with the statistical toolkit is the ‘so what?’ That is looking at what is coming out strongly, and asking how it maps back to quant variables. For example, which segments are really showing that there’s an opportunity for more selling here, and therefore, how can you reprioritise efforts on targeting, based on what that data is really telling us at scale?

With one of our early design partners, Imperial College London, we looked at student reflections – students were filling out large surveys with qualitative reflections alongside self-scoring on closed questions, such as resilience and grit scales. What we found with them is that the qualitative data was more predictive of outcomes at the end of the master's programme than the closed questions.

The market research sector is at a bit of a turning point with AI. What’s your view on how it’s being applied currently?

There's a ton of interesting innovation going on at the data collection end, with AI-moderated groups and synthetic data sets, for example. I think it's actually increasing access to real voices as well, which I see as a really positive thing.

But it strikes me that the analysis side still seems to be a massive opportunity, especially for qualitative data. I've had so many conversations where it's either still being done very manually or being reduced to unreliable AI summaries. In some cases that might be OK, but we should be treating qualitative data with the same rigour that we apply to quant.

So, I think it’s very exciting. I'd like to think that it will open up more inclusivity in terms of really listening to people in a deeper way. I've read studies that people are more willing to open up to AI. I don't know if that’s true for everyone – but I would like to think that this really allows organisations to listen more, and more deeply, to people.

Research Live has edited this conversation lightly to improve clarity.

Research Live is published by MRS.

The Market Research Society (MRS) exists to promote and protect the research sector, showcasing how research delivers impact for businesses and government.

