FEATURE – 1 December 2009

Ascribe ACM reviewed

Technology

Tim Macer reviews the latest incarnation of this verbatim content management and coding environment. Will our reviewer subscribe to the charms of Ascribe?

A little while ago, Language Logic estimated that its Ascribe online coding product was probably handling over fifty per cent of all the open-ended coding generated by research agencies in the United States, and a decent proportion from the rest of the world too. The challenge is where you go next, when you have half the market and no real rivals. One direction is to grow the market for verbatims, by making it possible to code the vast number of open-ends that never get coded – and the new Ascribe Automated Coding Module (ACM) promises to do just that.

In at the beginning
I happen to know something about the technology behind this tool, because I worked on a prototype with the online bank Egg (and even co-presented a paper on it at the 2007 Research Conference). Language Logic has subsequently worked with its creators, the Italian government’s research foundation ISTI-CNR, to integrate the technology into Ascribe. Though I am usually hesitant to call anything the best, the ISTI-CNR engine is easily the best I have found: it is the most MR-savvy of any of the automated text-processing technologies. This is not a discovery or text-mining tool – it is a coding department in a box.

ACM closely mimics the normal human-intervention coding process, and fits seamlessly into the traditional Ascribe workflow. Because it uses machine learning, it does not attempt to interpret answers or extract meaning by looking up words in dictionaries – in fact, it uses no dictionaries at all. Instead, you provide it with examples of how you would classify your data against a codeframe and then set it to learn from them. In Ascribe this means you simply start coding the data in the way you normally would. As you code, you are creating the training set that ACM needs. When you have coded enough to form a decent training set, you take your foot off the pedal and let ACM accelerate through the rest.
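The ACM engine itself is proprietary, so purely by way of illustration, here is a minimal sketch of the same learn-from-examples idea in Python using the open-source scikit-learn library. The verbatims, codes and variable names are hypothetical, not Ascribe’s.

```python
# Illustrative sketch only: Ascribe's ACM engine (from ISTI-CNR) is
# proprietary. This shows the general learn-from-examples approach
# with scikit-learn; all verbatims, codes and names are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# The answers coded by hand double as the training set: each verbatim
# is paired with the code a human coder assigned from the codeframe.
coded_verbatims = [
    ("the staff were really helpful", "SERVICE_POSITIVE"),
    ("waited 20 minutes to be served", "SERVICE_NEGATIVE"),
    ("grate value for the price", "PRICE_POSITIVE"),  # misspelling left intact
    ("far too expensive for what it is", "PRICE_NEGATIVE"),
]
texts, codes = zip(*coded_verbatims)

# Overlapping character n-grams generate thousands of features per
# answer, which makes the model robust to poor spelling and variants.
classifier = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(texts, codes)
```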

Classifier information
First, you build the ‘classifiers’ that will identify matching answers. These work by looking for telltale features in the examples you coded. For any individual answer, ACM can extract thousands of these features – patterns of words or letters. So many, in fact, that it easily overcomes problems of poorly spelt words, synonyms and so on. Once the classifiers have been built, you apply them to your uncoded data and they categorise those answers too, attaching a confidence score to each coding decision taken; you can adjust the confidence threshold to make the automation more or less cautious. It takes just a few seconds to zip through thousands of verbatims. There is then a process for validating the coding decisions ACM has made, and it helpfully presents examples for validation starting with those where it was least confident of its decision.
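Continuing the hypothetical scikit-learn sketch above, the confidence-scored classification and the least-confident-first validation queue might look like this (the threshold value is illustrative):

```python
# Apply the trained classifiers to uncoded answers, auto-accept only
# the confident decisions, and queue the marginal ones for a human,
# least confident first (as Ascribe's validation step presents them).
uncoded = ["vrey freindly staff", "cost an arm and a leg", "it was ok i guess"]

scored = []
for text, probs in zip(uncoded, classifier.predict_proba(uncoded)):
    scored.append((probs.max(), classifier.classes_[probs.argmax()], text))

THRESHOLD = 0.40  # raise to auto-code more cautiously, lower to code more
auto_coded = [(text, code) for conf, code, text in scored if conf >= THRESHOLD]
to_validate = sorted(item for item in scored if item[0] < THRESHOLD)
```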

A manageable step
This validation step makes the system very manageable: you can see what it is doing, improve its performance by correcting any assignment errors, and even react to changes in the data over time. It feels uncanny, too, as the marginal decisions it flags are often the very answers that have human coders debating where they should go.
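That correction loop can be sketched, again hypothetically in the same scikit-learn example, as folding validated corrections back into the training data and rebuilding the classifiers:

```python
# Corrections made during validation are folded back into the training
# set and the classifiers are rebuilt, so accuracy improves and the
# model can track drift in the answers over time (codes hypothetical).
corrections = [("it was ok i guess", "SERVICE_POSITIVE")]
coded_verbatims.extend(corrections)
texts, codes = zip(*coded_verbatims)
classifier.fit(texts, codes)  # refit on the enlarged training set
```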

Not that you have to use ACM with Ascribe – it commands a premium in pricing over manual coding and is only really suited to larger volumes, since the overhead of training and validation is comparable to manually coding a couple of thousand interviews. Beyond surveys, it can also be applied to qualitative projects and web content, such as blogs.
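A rough back-of-envelope calculation, using hypothetical project sizes, shows why that fixed training overhead points ACM towards larger volumes:

```python
# Back-of-envelope sketch: the article puts the training/validation
# overhead at roughly a couple of thousand manually coded interviews.
# The project sizes below are hypothetical.
TRAINING_OVERHEAD = 2_000
for total_verbatims in (3_000, 50_000):
    share = TRAINING_OVERHEAD / total_verbatims
    print(f"{total_verbatims:>6} verbatims: {share:.0%} still coded "
          "by hand before ACM takes over")
# 3,000-verbatim ad hoc job: ~67% coded manually anyway.
# 50,000-verbatim tracker: only ~4% manual overhead.
```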

Even manual coding in Ascribe is highly optimised, with tools to let you find similar answers, code by word or phrase matching and, if you wish, re-categorise items at any point. You use it both to create your codeframe and to assign answers to it in one integrated step. It is a multi-user system, and you can assign responsibilities among the team: some can build codeframes, others only code, and others only analyse. Ascribe also has a surprisingly rich set of analytical tools – even cross-tabbing capabilities. You are not restricted to uploading only the verbatim texts: the entire survey can go in. It now handles data from SPSS Dimensions with ease, and it is fully integrated with Confirmit through the Confirmit Web Services interface. Upload routes are provided for most other MR packages.

Summary
It’s not the prettiest of tools to use: the interface may be on the web, but it is hardly of the web and is in need of a makeover. Language Logic is redesigning some modules as thin-client Windows apps, which have a better-looking interface, but Ascribe would be more approachable if its web interface were better structured and designed. True, it is productive to use, but it does not help you get there as a novice, and the documentation (which is being redone at present) is not as comprehensive as it needs to be. It’s a pity, as both make it a challenge to harness all of the power in this otherwise remarkable system.

Client perspective: Joy Boggio

Joy Boggio is director of coding at C&R Research Services, a full-service agency in Chicago. Joy introduced Ascribe to C&R in 2004. Ascribe is used for all verbatim coding on quant studies at C&R and also some of their qual projects. Joy explains:
“Within a day or two of introducing Ascribe, we immediately cut down the delivery time on projects by, in some cases, a week. The features of Ascribe that are the most attractive are it being web-based – you can hand out the work easily to many different people in many different places; if you have run the study before, you can merge it with the previous study and autocode part of it; you are not restricted in the formats of data you can input, nor in how you export the data out; and we can do some rudimentary data processing within the tool.”

Although C&R has a research staff of around 60, Joy is able to support all of the verbatim coding activities with a team of just three coders. But it is not only the coders that use Ascribe – many of the researchers also use it to access the verbatim responses, using its filtering and analytical capabilities to identify examples to include in reports and presentations. “It means they can dive down a little deeper into the data. The problem you have with the process of coding data is that you can flatten out the data – the challenge is always to make sure you can retain the richness that is there. With Ascribe you can keep the data vibrant and alive – because the analytical staff can still dive into the data and bring some of that richness to the report in a qualitative way.”

Joy notes that using Ascribe telescopes the coding process, saving precious time at the start. “It’s now a one-step process, instead of having to create the codebook first, before getting everyone working on it. With this, as you work through the verbatims you are automatically creating codes and coding at the same time, so you don’t have to redo that work. When you are happy with the codebook, you can put others onto the project to code the rest. This is where the efficiency comes in.”
Joy estimates that it reduces the hours of coding effort required in a typical ad hoc project by around 50 per cent, but due to the ease of allocating work and the oversight the system provides, she remarks, “You are also likely to save at least a day of work on each project in management time too.”

C&R Research makes extensive everyday use of the manual coding optimisation tools Ascribe offers, such as searching for similar words and phrases, but so far has only experimented with using the new automated machine learning coding in ACM. Joy comments: “It seems to be more appropriate for larger volumes of work – more than we typically handle. It would really lend itself to the larger tracking study and result in a lot less people-time being required.”

The verdict: Ascribe ACM

Intelligent verbatim content management system and coding environment for researchers and coders, with options for either manually assisted coding or machine-learning automated coding for higher volumes. Delivered as either web browser-based or web-enabled desktop software modules.

Ease of use: 4 out of 5
Cross-platform compatibility: 5 out of 5
Value for money: 4.5 out of 5

Cost
Conventional coding: between 3 and 5 US cents per verbatim coded. Automated coding: between 10 and 30 US cents per verbatim coded.

Pros

  • Automated coding option will code thousands of open-ends in seconds
  • Machine learning mimics human coders and produces comparable and highly consistent results
  • Many tools to optimise effort when coding manually
  • Web-based environment makes it easy to distribute coding work to satellite offices and outworkers

Cons

  • Automated coding only saves time on larger projects such as trackers
  • Web-based interface needs a refresh
  • Windows only – requires Internet Explorer


Further info: www.languagelogic.net

2 Comments

13 years ago

"Conventional coding: between 3 and 5 US cents per verbatim coded. Automated coding: between 10 and 30 US cents per verbatim coded." Maybe there is an error. I think prices are swapped. Automatic coding is cheaper than manual coding.


12 years ago

Hi, based on your review, Ascribe seems to be the best online coding software. I would like to know the main competitors to Ascribe. Thanks, E Piro
