FEATURE 9 January 2013
There isn’t a one-size-fits-all tool for data visualisation work. DataMarket’s Hjalmar Gislason reviews what is available to help researchers find the best solution for their needs.
There is no single correct answer to the question, ‘What is the best tool to visualise data?’ It depends on the task at hand and what you want to achieve. So here’s an attempt to categorise those tasks and point to some of the tools I’ve found useful.
The most common tool for simple charting is Microsoft Excel, where it is possible to make near-perfect charts of most types – if you know what you’re doing. But many Excel defaults are sub-optimal, and some of the chart types it offers are simply for show and have no practical application. 3D cone-shaped “bars”, anyone? And Excel makes no attempt to guide a novice user to the best chart for what they want to achieve. Here are three alternatives.
Tableau is fast becoming the number one tool for many data visualisation professionals. It’s client software (Windows only) that’s available for $999 and gives you a user-friendly way to create well-crafted visualisations on top of data that can be imported from all of the most common file formats. Common charting in Tableau is straightforward, while some of the more advanced functionality may be less so. Then again, Tableau enables you to create elaborate interactive data applications that can be published online and work on all common browser types, including tablets and mobile handsets. For the non-programmer who sees data visualisation as an important part of their job, Tableau is probably the tool to use.
DataGraph is a little-known tool that deserves a lot more attention. A very different beast, DataGraph is a Mac-only application ($90 on the App Store) originally designed to create proper charts for scientific publications. Nothing we’ve tested comes close to DataGraph for creating crystal-clear, beautiful charts that are also done right as far as most of the information visualisation literature is concerned. The workflow and interface may take a while to get to grips with, and some of the more advanced functionality may lie hidden even from an avid user, but a wide range of samples, aggressive development and an active user community make DataGraph a really interesting solution for professional charting.
R is an open-source programming environment for statistical computing and graphics. It’s a super-powerful tool that takes some programming skills to get started with. Even so, it is becoming a standard tool for any self-respecting data scientist. R does a lot more than graphics within its interpreted and command-line-controlled environment. It enables all sorts of crunching and statistical computing, even with big datasets. In fact, the graphics are a bit of a weak spot for R. Most outputs need polishing in other software such as Adobe Illustrator to be ready for publication.
If you are creating data visualisation videos or high-resolution data graphics, Processing is your tool. Processing is an open-source integrated development environment (IDE) that uses a simplified version of Java as its programming language and is especially geared towards developing visual applications.
The area where we have found that Processing really shines as a data visualisation tool is in creating videos. It comes with a video class called MovieMaker that allows you to compose videos frame-by-frame. Each frame may well require some serious crunching and take a long time to calculate before it is appended to a growing video file, but the results can be stunning.
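To make the frame-by-frame idea concrete, here is a minimal sketch of the same workflow. It is written in Python rather than Processing and does not use MovieMaker. The function names `render_frame` and `compose`, the frame size and the choice of uncompressed PPM images are all assumptions made for illustration. Each frame is computed on its own, however long that takes, and then appended to a growing stream.

```python
import io


def render_frame(t, width=64, height=16):
    """Render one frame as a binary PPM (P6) image.

    The picture is a horizontal bar filled to fraction t (0.0 to 1.0)
    of the frame width. In real work this is where the heavy
    per-frame crunching would happen.
    """
    filled = round(t * width)
    row = bytearray()
    for x in range(width):
        # Orange for the filled part of the bar, black for the rest.
        row += bytes((255, 128, 0)) if x < filled else bytes((0, 0, 0))
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(row) * height


def compose(stream, frame_count, width=64, height=16):
    """Compute frames one at a time and append each to the stream."""
    for i in range(frame_count):
        # Progress runs from 0.0 on the first frame to 1.0 on the last.
        t = i / (frame_count - 1) if frame_count > 1 else 1.0
        stream.write(render_frame(t, width, height))
    return frame_count


# Usage: an in-memory buffer stands in for the growing video file.
buffer = io.BytesIO()
compose(buffer, frame_count=5)
```

Because every frame is written as soon as it is finished, memory use stays flat no matter how many frames the animation has. Only the output keeps growing.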
There are dozens if not hundreds of programming libraries that allow you to add charts to your websites. Most of them are rubbish. We believe we have tested most of the libraries out there, and there are only two we feel comfortable recommending:
If you want full control of the look, feel and interactivity of your charts, or if you want to create a custom data visualisation for the web from scratch, the out-of-the-box libraries mentioned previously will not suffice. In fact, you’ll be surprised how soon you run into limitations that force you to compromise on design. So if you want to take it up a notch and follow the lead of some of the wonderful and engaging data journalism happening at the likes of the New York Times and The Guardian, the tool for you will probably be one of the following:
D3.js is in many ways the successor to Protovis, building on many of the same concepts. The main difference is that instead of having an intermediate representation that separates the rendering of the SVG (or HTML) from the programming interface, D3 binds the data directly to the DOM representation. If you don’t understand what that means – don’t worry, you don’t have to. But it has a couple of consequences that may or may not make D3 better suited to your needs. The first is that it makes rendering faster, so animations and smooth transitions become more feasible. The second is that it will only work in browsers that support SVG, so you will be leaving behind users of Internet Explorer versions 7 and 8.
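For readers who do want a feel for what tying data directly to SVG elements means, the sketch below may help. It is plain Python, not D3 and not D3's API. The function name `bars_to_svg`, the bar dimensions and the `data-value` attribute are hypothetical and chosen only for illustration. The point is that each data value produces exactly one `<rect>` element, whose position and size are computed straight from that value with no intermediate chart object in between.

```python
import xml.etree.ElementTree as ET


def bars_to_svg(values, bar_width=20, gap=5, height=100):
    """Build an SVG element tree with one <rect> per data value."""
    if not values or min(values) < 0 or max(values) <= 0:
        raise ValueError("values must be non-negative with a positive maximum")
    peak = max(values)
    width = len(values) * (bar_width + gap) - gap
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "width": str(width),
        "height": str(height),
    })
    for i, value in enumerate(values):
        # Scale so that the largest value fills the full height.
        bar_height = height * value / peak
        ET.SubElement(svg, "rect", {
            "x": str(i * (bar_width + gap)),
            # SVG's y axis points down, so bars start below the top edge.
            "y": f"{height - bar_height:g}",
            "width": str(bar_width),
            "height": f"{bar_height:g}",
            # Keep the source value on the element it produced.
            "data-value": str(value),
        })
    return svg


# Usage: serialise the tree to markup a browser could display.
svg = bars_to_svg([4, 8, 15, 16])
markup = ET.tostring(svg, encoding="unicode")
```

Since every element carries the value it was drawn from, changing the data only means recomputing the attributes of the matching element. That is broadly why this style of binding lends itself to transitions.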
After thorough research of the available options, we chose Protovis as the base for building out DataMarket’s visualisation capabilities – with an eye on D3 as our future solution when modern browsers finally saturate the market. We see that happening about two years from now.
Hjalmar Gislason is the founder of DataMarket.com