FEATURE 1 May 2011
Companies are turning to text analytics tools in an effort to make sense of what customers are saying about them online. Paul Golden investigates.
Those who bemoan the impact of email, text messaging and Facebook on our ability to write tend to forget one thing: that the people who have grown up with these tools write far more than any generation before them. They might not have much respect for the conventional rules of writing, but they’re making the language their own, and adapting it to different formats and audiences.
In the process they’re producing vast amounts of data, which marketers are keen to make sense of. That means understanding the meaning and sentiment of text – and numerous tech companies are queuing up to help researchers do this, each with grander claims than the last about their tools’ ability to accurately interpret and categorise millions of online comments. These text analytics tools are adept at picking up brand references from the massive volumes of online communication generated every day. But opinion is widely divided on their ability to determine whether a comment is positive, negative or neither.
Rights and wrongs
A study conducted last year by FreshMinds found that automated sentiment analysis tools were only accurate about 30% of the time – in other words you’d be much better off flipping a coin. However, providers of such tools argued that it wasn’t fair to test their software without first training it on language relating to the topic in question.
FreshMinds’ lead research manager Anna Tomkowicz told Research: “In our experience there is no single leading text analytics tool and it is impossible to check whether each claimed product improvement actually makes a difference – although in the early days they did not identify re-tweets and spam as accurately as they do now.”
Social media monitoring firm Radian6 recently announced plans to incorporate text analytics capabilities from six different providers into its service – indicating that users are still looking for a range of answers rather than trusting just one. But that doesn’t seem to be dampening interest in the area – Radian6 was bought by CRM firm Salesforce in March for $326m (see below).
There is, of course, always a trade-off between cost, timing and accuracy when trying to analyse text. But if you are going to base decisions on research, the tools have to be robust, says Tomkowicz. “Sentiment is evaluated by proximity of positive or negative words to the brand. Mixed sentiments are highlighted as neutral by the analytics tools, but we distinguish between mixed and neutral data because this means the person is happy about some features of the brand but unhappy about others, and this is very valuable data for the client.”
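The proximity approach Tomkowicz describes, and the distinction she draws between mixed and neutral comments, can be sketched in a few lines of Python. This is an illustration only, not FreshMinds’ method or any vendor’s: the word lists, the brand name ‘Acme’, the window size and the function name are all assumptions, and it only handles single-word brand names.

```python
import re

# Hypothetical mini-lexicon; real tools rely on far larger word lists.
POSITIVE = {"love", "great", "good", "fast", "helpful"}
NEGATIVE = {"hate", "bad", "slow", "awful", "rude"}


def classify_mention(text, brand, window=5):
    """Label a comment by the sentiment words within `window` tokens of the brand."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    brand = brand.lower()
    positions = [i for i, tok in enumerate(tokens) if tok == brand]
    if not positions:
        return "no mention"

    found_pos = found_neg = False
    for pos in positions:
        # Tokens up to `window` places either side of the brand mention.
        nearby = tokens[max(0, pos - window): pos + window + 1]
        found_pos = found_pos or any(tok in POSITIVE for tok in nearby)
        found_neg = found_neg or any(tok in NEGATIVE for tok in nearby)

    if found_pos and found_neg:
        return "mixed"      # praise and complaint in the same comment
    if found_pos:
        return "positive"
    if found_neg:
        return "negative"
    return "neutral"        # brand mentioned, no sentiment words nearby


print(classify_mention("I love the Acme camera but the battery is awful", "Acme", window=8))
print(classify_mention("Acme opens at nine", "Acme"))
```

Keeping ‘mixed’ as its own label, rather than folding it into ‘neutral’, is what preserves the information that a customer likes some features and dislikes others.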
Other important considerations include the length of the text you’re looking at. Sentiment detection tends to work better on Twitter than on other social media sources, she says, because of the 140-character limit on tweets.
Mark Westaby of Spectrum Consulting is confident that automated analysis can do the job. Spectrum uses semantic association to measure sentiment – a technique based on the associations made by the brain between words, and which does not require software to be trained. “This method is academically proven and we have carried out our own, extensive research which reveals a high correlation of meaning,” says Westaby.
Whether or not analytics tools can recognise irony or sarcasm is a point often raised by advocates of human analysis, but Westaby says it’s a “red herring [that] never arises in our work”. “Given the volumes of text we analyse, errors created by irony or sarcasm are very small and fall well within acceptable limits,” he says. False positives and negatives are not a big issue either, he adds, for the same reason.
“We have developed our own version of the Likert scale, which is very effective for tracking and analysing sites such as Twitter where the restricted number of characters means people typically use words such as ‘love’ or ‘hate’ to express tonal sentiment.”
But not everyone is satisfied with the tools on offer. Jessica Whittaker of social media monitoring firm BuzzNumbers is not alone in saying that her company has struggled to find technology that could produce accurate measurements of sentiment. “Irony, sarcasm and slang are close to impossible for computers to understand, so there needs to be human intervention to guarantee accuracy,” she said.
Sentiment measurement is usually based either on dictionary definitions of words, or on natural language processing. Dictionary systems are as accurate and up to date as the dictionary they’re based on – emerging language can be added, but only once someone notices it’s missing.
Natural language processing systems are more flexible – using algorithms that learn rules by analysing large sets of example documents, in which certain terms have been tagged with information about what they mean or how they relate to the rest of the text. These systems are quicker to catch slang and new language, but anomalies do arise. For example, if a system finds a group of comments saying things like ‘I hate Christmas because my family are horrible’, it will code Christmas as negative – probably a mistake. It may take a while for enough new positive mentions of Christmas to come through and change it back.
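The Christmas anomaly is easy to reproduce with a toy learner. The sketch below is a deliberately naive stand-in for a trained system, not a description of how any commercial tool works: it scores each word by the labels of the comments it has appeared in. The class name, the scoring formula and the example comments are all assumptions made for illustration.

```python
import re
from collections import Counter


class WordAssociationModel:
    """Toy learner: scores a word by the labels of the comments it has appeared in."""

    def __init__(self):
        self.pos = Counter()  # number of positive comments containing each word
        self.neg = Counter()  # number of negative comments containing each word

    def train(self, comment, label):
        if label not in ("positive", "negative"):
            raise ValueError("label must be 'positive' or 'negative'")
        words = set(re.findall(r"[a-z']+", comment.lower()))
        (self.pos if label == "positive" else self.neg).update(words)

    def score(self, word):
        """Return a value from -1.0 (only seen in negative comments) to +1.0."""
        p, n = self.pos[word], self.neg[word]
        return 0.0 if p + n == 0 else (p - n) / (p + n)


model = WordAssociationModel()
model.train("I hate Christmas because my family are horrible", "negative")
model.train("Christmas is ruined, my family are horrible again", "negative")
print(model.score("christmas"))   # negative: the word has absorbed the complaints

for comment in ("Love Christmas", "Christmas was wonderful", "Best Christmas ever"):
    model.train(comment, "positive")
print(model.score("christmas"))   # drifts back above zero as positive mentions arrive
```

The word itself never changed meaning; only the company it kept did, which is why it takes a run of fresh positive examples to pull the score back.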
Both these approaches have flaws, but then so do people, says Annie Pettit, head of research at Conversition. “The important thing to remember is that we don’t care if individual messages are coded correctly. We care that the aggregate coding across large numbers of messages validates well. We aren’t trying to prove whether I hate or love Pop-Tarts, we are trying to prove whether a million people hate or love Pop-Tarts. A reading that is 70% accurate is actually a very accurate reading.”
The scale offered by automated solutions also means you can get away with less accuracy, says Pettit. “If humans can score hundreds of messages and validate at 85%, and computers can score millions of records and validate at 70%, then computers are a valuable option for large datasets.”
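A rough piece of arithmetic shows how scale and per-message accuracy interact. The sketch below rests on strong simplifying assumptions that are not Pettit’s: only two labels (positive and negative), errors that are equally likely in both directions, and messages that are independent of one another. The function names and the 80% figure are illustrative.

```python
import math


def observed_share(true_share, accuracy):
    """Positive share a coder reports if it is right `accuracy` of the time."""
    return accuracy * true_share + (1 - accuracy) * (1 - true_share)


def corrected_share(observed, accuracy):
    """Invert observed_share to recover the underlying positive share."""
    if accuracy == 0.5:
        raise ValueError("a coder that is right half the time carries no signal")
    return (observed - (1 - accuracy)) / (2 * accuracy - 1)


def standard_error(share, n):
    """Sampling noise of a proportion measured on n independent messages."""
    return math.sqrt(share * (1 - share) / n)


true_share = 0.80  # assume 80% of all comments are genuinely positive

machine = observed_share(true_share, accuracy=0.70)
print(round(machine, 2))                         # raw machine reading
print(round(corrected_share(machine, 0.70), 2))  # reading after correction

# Sampling noise: a few hundred hand-coded messages vs a million machine-coded ones.
print(standard_error(true_share, 500))
print(standard_error(machine, 1_000_000))
```

Under these assumptions, miscoding pulls the raw reading towards 50% but leaves its direction intact, and with a million messages the sampling noise is a small fraction of a percentage point. If the error rate is known from validation, the aggregate can be adjusted for it; if errors are lopsided, the simple correction above no longer holds.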
Text analytics providers are set to benefit as social media monitoring catches the eye of business. Customer relationship management firms have shown particular interest in the area, with Salesforce recently paying $326m for social media monitor Radian6, and Overtone getting snapped up by Kana Software.
The tools seem to be of particular interest to companies looking to get an ongoing view of what their customers are up to, and an ability to respond quickly when there are problems. Salesforce said that integrating Radian6 into its services will allow it to provide “real-time social intelligence”.
For Radian6 customers, there will be no single text analysis solution. The firm revealed it was partnering with OpenAmplify, Klout and OpenCalais to incorporate their text and sentiment analytics capabilities into its platform, and it also plans to build in analysis from Clarabridge, PeekYou and Lexalytics. CEO Marcel Lebrun said the partnerships had allowed the firm to “amass the largest index of semantically enriched social data in the world”.
Another social media tracker, Visible Technologies, secured a further $6m of investment shortly after the Radian6 acquisition, with CEO Kelly Pennock saying the buy had “raised the stakes” for other social media monitoring providers.
The written word
In order to handle the nuances of language, the software needs to have really good coding that takes account of slang, as well as the capability of understanding words in context and learning over time, says Theo Downes-LeGuin, chief research officer at Market Strategies International. He acknowledges that sarcasm is one of the toughest things to code because the tone and emphasis often rely on broader contextual cues that go beyond the available text.
So a good text analysis tool will recognise ‘Thanks, NatWest, for charging me usurious rates’ as sarcasm because the usual positive affect of ‘thanks’ is outweighed by the strong negative of ‘usurious’. But a sentence like ‘Thanks, NatWest, for charging me such wonderfully high rates’ might be wrongly coded as positive, unless it existed as part of a larger block of text that provided more context.
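The two NatWest sentences can be run through a minimal weighted-word scorer to show why one is caught and the other is not. The weights below are invented purely to mirror the example; they are not taken from any real lexicon or product, and the function name is hypothetical.

```python
import re

# Hypothetical weights, chosen only to mirror the two NatWest sentences.
WEIGHTS = {"thanks": 1.0, "wonderfully": 2.0, "usurious": -3.0}


def weighted_sentiment(text):
    """Sum the weights of known words and turn the total into a label."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = sum(WEIGHTS.get(tok, 0.0) for tok in tokens)
    if total > 0:
        label = "positive"
    elif total < 0:
        label = "negative"
    else:
        label = "neutral"
    return total, label


print(weighted_sentiment("Thanks, NatWest, for charging me usurious rates"))
print(weighted_sentiment("Thanks, NatWest, for charging me such wonderfully high rates"))
```

In the first sentence the strongly negative ‘usurious’ outweighs ‘thanks’, so the scorer lands on negative. In the second, every weighted word is positive, so the sarcasm slips through as praise. Nothing in the sentence itself tells a word-level scorer that high rates are unwelcome.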
Many social media measurement solutions base their sentiment analysis on whether predetermined ‘good’ and ‘bad’ terms crop up in the same phrases as the term that’s being looked for. Lisa Joy Rosner, chief marketing officer at NetBase, says the problem with these approaches is that understanding language is much more complex than simply looking for the presence and proximity of certain words. False positives and negatives crop up, and no tool on the market does a perfect job of getting sentiment right. Then again, says Rosner, it’s all relative. “In many studies on human annotation to the task of sentiment analysis, it was still found that accuracy never really exceeds 85%.”
Scott Blacker, senior director of product management at survey software maker Vovici, says most of these tools “work out of the box”. “The bad ones are bad and even the best rarely go above 80%, but then humans also disagree. The sentiment engines can be tuned to improve the accuracy of the systems so the more investment in tuning the categories, the greater the accuracy will be. The [size of the] initial data set for tuning these tools is also vital since the larger the data set, the larger the sample size to teach the engine.”
Vovici co-founder Jeffrey Henning adds that some vendors have attempted to compare human efficiency to machine, “but not many have done it with scientific accuracy. Most leading vendors claim their solutions work in other major languages, particularly European languages, and they tend to test well in Spanish, for example. The vendors who take more of a natural language approach tend to be stronger in English or French.” However Henning says he has yet to see a really good solution for comparing a survey across numerous languages.
The first step in any automated sentiment detection strategy is to define what you want to measure, says Seth Grimes, founder of analytics consultancy Alta Plana Corporation. “Once you have detected, you can create aggregate measures, plot and compare trends and so on. Tackle these not-so-basic basics before you take a shot at complexities such as irony and sarcasm because that stuff is very difficult to decode systematically. If you do want to automate, you are almost certainly going to need linguistic techniques that match word use and patterns to vocabulary and phrases that indicate irony and sarcasm.”
If you’re dealing with languages other than English, don’t fall into the trap of thinking you can just translate into English, analyse and then translate back again – expressions of sentiment are particularly tricky to translate. For these situations, Grimes suggests conducting simple analysis and leaving the rest to people. “For a less-used language, start by creating a lexicon of sentiment-bearing words (‘like’, ‘love’, ‘hate’, ‘bad’) in the target language and use it to detect sentiment for further analysis by a person. A partially automated solution of this nature will surely be better than human-only analysis, if only in its reach.”
Another variable that makes text analysis challenging is that not enough is understood about how people behave, communicate and interact online. For one thing, the language they use is distinct from that used in other areas, and is evolving fast. In theory, slang, vernacular and abbreviations shouldn’t cause a problem as long as they are correctly labelled in the texts used to train the software. But these things are changing constantly, so the task of building text analytics tools is never finished.
Campbell Keegan director Rosie Campbell says: “There is a way of interacting and communicating online that has its own rules, so regular social media contributors, particularly young people, almost never write anything that is not ironic and I am not sure text analytics has the capacity to tease this out.”
Market research isn’t the only business that has shown an interest in text analytics.
The techniques are also widely used in military intelligence, IT security and legal research, and even to identify potential problems in manufacturing processes. Companies increasingly need to consider ‘information governance’ – managing their records and ensuring compliance.
Within the field of market research, the use of text analytics isn’t confined to social media. In a webinar hosted by Attensity, Nick Patience of the 451 Group described how it has allowed companies to change the way they write questionnaires by allowing them to quickly and effectively analyse open-ended questions.
“It’s become in some organisations such a popular thing to do, it’s actually changed the nature of surveys. We’ve seen organisations that have had, say, a survey of 30 questions, the first 29 of which have been structured… then the last question would be, ‘Do you have any other comments?’ Text analytics has changed that, such that some of the more forward-thinking organisations have actually gone for all open-ended questions. You couldn’t do that before text analytics if you had any decent number of responses. That’s one area where text analytics has quite fundamentally changed things.”
According to the Terminator movies, 19 March this year was the day that the world’s computers became sentient, triggering nuclear Armageddon and the destruction of most of the human race. Fortunately, machines have yet to take over from people in the real world, and there seems to be a consensus that insight professionals still have a vital role in making sense of data from automated analysis – and coming up with strategic insights.
Grimes and Henning agree that a hybrid approach is the way forward. Machines can be used for their speed, reach, scalability and consistency, but people are needed to train, guide and oversee automated systems and interpret findings.
In fact, a huge benefit of automated analysis is that it shifts costs away from data collection and allows much more time to be invested in insight analysis, which is as it should be.
“In some ways automated analysis creates more opportunities for research,” suggests Henning. “Some firms outsource data collection for reasons of cost, but these tools could allow them to bring this in-house.”
Clearly it’s not just the ability of text analytics software to learn that will make it more useful, but the ability of researchers to learn how best to deploy these tools in their work.