
OPINION 4 February 2019

Has R become the universal research tool?


Market researchers should embrace coding, argues James Smythe, who makes the case that R opens up data analysis to everyone in market research.

The time of separate market research and analytics teams, with separate skills, tools and data portfolios, is coming to an end. It’s not the outcome many researchers feared, where good numbers would be made redundant by fast ones.

It’s because of the development of programming languages – and one in particular – which are as relevant and easy to use for a qual researcher as for a big data scientist. Market researchers, R is now the best thing you can learn.

In many businesses ‘research’ and ‘analytics’ teams sit well alongside each other and cross-fertilise skills – the former adopting new technologies to enhance traditional techniques, the latter filtering torrents of data with a critical researcher’s mind.

They are, however, still using different tools to do so. While the analyst may be accessing APIs, writing SQL or Python and presenting data in Tableau, the researcher is still getting data via Excel, copy/pasting and presenting their thoughts in Word and PowerPoint. If you need to access a big web data file written in gobbledegook, you can now get an analyst to do it for you. But for a researcher, that’s no substitute for analysing it yourself.

Researchers’ traditional data table formats favour reading rather than analysis, and in data science terms are quite hard to make use of. This is the main reason why many researchers have struggled to embrace tools like Tableau: it works fine with the tidy training data, but try to feed in research tabs and you’ll soon give up and go back to PowerPoint. The reasons for this inertia in reporting methods, while the volume of data is exploding, are not entirely clear, but the barriers to change may have finally fallen.

R broke into the top 10 programming languages in 2018, which is quite an achievement for a language with its origins in statistics and graphing, not web programming. The breakthrough has come largely thanks to its developments in data manipulation, and a group of packages called the Tidyverse. Building on the work of a passionate R open source community, Tidyverse offers a natural and ‘plain English’ method to read, transform and chart data. Imagine something that’s easier to code than Excel, and which opens any imaginable data source to analysis.

There are now R packages to read or write just about anything, such as SQL, SPSS, JSON, SAS, web APIs, analytics tools, Google Sheets, web scrapers, and of course good old Excel.

Once into R, you can do what you like with the data, in moments. Re-shape it, query it, join datasets together, run advanced stats. The kind of thing you needed a cleverer colleague to do in the past, and which might take days to get back once requested.
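To make that concrete, here is a minimal sketch of the re-shape, join and query steps using the Tidyverse packages dplyr and tidyr. The brands, figures and column names are all invented for illustration, and it assumes tidyr 1.0.0 or later for `pivot_longer()`. Treat it as one way of doing it, not the only one.

```r
# Minimal sketch: turn a wide "research tab" into tidy data, then join and query it.
# All data below is invented for illustration.
library(dplyr)
library(tidyr)

# Wide layout typical of research tables: one row per brand, one column per month.
awareness_wide <- tibble(
  brand     = c("Brand A", "Brand B"),
  `2018-01` = c(41, 35),
  `2018-02` = c(43, 33),
  `2018-03` = c(46, 34)
)

# Hypothetical lookup table to join on.
brand_info <- tibble(
  brand = c("Brand A", "Brand B"),
  owner = c("Client", "Competitor")
)

# Re-shape: one row per brand per month.
awareness_long <- awareness_wide %>%
  pivot_longer(cols = -brand, names_to = "month", values_to = "awareness_pct")

# Join the lookup, then query: average awareness by owner.
summary_by_owner <- awareness_long %>%
  left_join(brand_info, by = "brand") %>%
  group_by(owner) %>%
  summarise(mean_awareness = mean(awareness_pct))

print(awareness_long)
print(summary_by_owner)
```

The long format, with one observation per row, is what charting and dashboard tools generally expect, which is why the re-shape step usually comes first.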

For repeated data tasks, you can write scripts which not only automate the task but do it safely and before the kettle has even boiled. Expensive desktop data transformation tools like Alteryx are cleaning up right now, but it’s much better for your budget to be able to do it all without a licence fee.

A year or so ago we tackled the problem of reporting a brand tracker for a major UK media company. The agency delivered a 100MB set of Excel tabs each month, and researchers had to open several of these to run any kind of trend analysis, copying and pasting into another workbook. Transforming 18 months of data into a 2m-row dataset for dashboarding used to take two days, even with Excel VBA macros. With R, we got it down to a little under 10 minutes, and right every time.

For those qual researchers who have read this far on the promise of the first paragraph, there are also great benefits. Packages like tidytext extract meaningful words and phrases from transcripts or open-ended verbatims, then quantify and code them with built-in sentiment lexicons, adding further layers of insight.
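As a rough illustration of what that looks like in practice, the sketch below splits a few verbatims into words and matches them against the "bing" sentiment lexicon that ships with tidytext. The verbatims are made up, and a real coding frame would need far more care than a simple word match.

```r
# Minimal sketch: tokenise open-ended verbatims and tag words with a sentiment lexicon.
# The verbatims below are invented for illustration.
library(dplyr)
library(tidytext)

verbatims <- tibble(
  respondent = 1:3,
  text = c(
    "I love the new app, it is great",
    "The adverts are annoying and the app is slow",
    "Good value but terrible customer service"
  )
)

# One row per word per respondent (lower-cased, punctuation stripped).
words <- verbatims %>%
  unnest_tokens(word, text)

# Keep only words found in the lexicon, then count positive and negative
# words for each respondent.
sentiment_counts <- words %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(respondent, sentiment)

print(sentiment_counts)
```

Word-level matching ignores negation and context ("not great" still counts as positive), so the counts are a starting point for reading the verbatims, not a replacement for it.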

Learning enough R to transform a researcher’s use of data can take as little as a day. Once you’ve started, you may get hooked. R has packages to build charts of data-journalism quality, mass-produce PDF reports or PowerPoint decks, or publish interactive, responsive web apps. New packages are being published every week, and the web is teeming with free, very accessible learning resources.

Now get coding. Just don’t forget the brackets().

James Smythe is managing director of Culture of Insight