FEATURE 8 May 2009

Following the swine flu conversation


Linguastat helps those tracking the virus track what’s being said about it

US – A buzz-tracking tool created by San Francisco startup Linguastat has been put to the test over the last few weeks, as the Centers for Disease Control and Prevention (CDC) uses it to track mentions of swine flu.

Linguastat’s automated system reads and analyses media coverage of the spread of the virus, as well as mentions on social media sources such as blogs and social networks including Twitter.

Research spoke to Linguastat’s CEO and founder John Pierre about applying the technology to a global virus outbreak.

What does your intelligence tracking service do?

It’s used to mine open-source intelligence, which is really blogs, Twitter and that sort of thing, to find facts that otherwise aren't apparent. We have a fair number of government customers including the US Army, the Food and Drug Administration, the Federal Trade Commission, and we also have some commercial customers.

It can be configured to be very specific to the customer’s needs, so in government there may be more of a surveillance or monitoring type of function, looking at certain types of events, while in an industry setting it may be more commercial events and transactions.

How is it being used to combat swine flu?

The earliest report the CDC gets each day is a 6am report which draws from thousands of articles, gives a very consolidated picture and counts them up based on how often different messages have appeared. Our technology can gauge and score the various types of facts and messages. They can scan the report and say, ‘Here are the statements people are making today about swine flu.’ They'll take that and turn it into a briefing for their daily press conferences and conference calls. It helps inform the decision makers in Atlanta and Washington.

Are you trying to track what’s happening, or what people are saying about what’s happening?

We do both but the emphasis of this project is really the latter. The CDC has a fair amount of infrastructure to detect outbreaks through other reporting channels. They use our tool to get the types of things that people are saying, the types of recommendations people are making. They want to find messages related to prevention and best practices – if they put out statements or recommendations they want to make sure those are being picked up, and they also want to track misinformation – that’s something they were doing with Twitter.

How does automation change this sort of work?

That was a big benefit here. We've been working with the Oak Ridge Institute on this, and they used to do this all by hand. Now they can get much more timely reports that cover more sources. When there were the Sars and bird flu outbreaks a few years ago they had a similar crisis, but it was very difficult to get through all that material by hand. Since that time there are new sources that have come to bear like blogs and Twitter which have really amplified the amount of reading material that someone would need to go through. We're covering a couple of orders of magnitude more content than in the Sars outbreak.

Author: Robert Bain