FEATURE 9 January 2013

There isn’t a one-size-fits-all tool for data visualisation work. DataMarket’s Hjalmar Gislason reviews what is available to help researchers find the best solution for their needs.

There is no single correct answer to the question, ‘What is the best tool to visualise data?’ It depends on the task at hand and what you want to achieve. So here’s an attempt to categorise those tasks and point to some of the tools I’ve found useful.

Simple one-off charts

The most common tool for simple charting is Microsoft Excel, where it is possible to make near-perfect charts of most types – if you know what you’re doing. But many Excel defaults are sub-optimal, and some of the chart types it offers are simply for show and have no practical application. 3D cone-shaped “bars”, anyone? Excel also makes no attempt to guide a novice user to the best chart for what they want to achieve. Here are three alternatives.

Tableau is fast becoming the number one tool for many data visualisation professionals. It’s client software (Windows only) that’s available for $999 and gives you a user-friendly way to create well-crafted visualisations on top of data that can be imported from all of the most common file formats. Common charting in Tableau is straightforward, while some of the more advanced functionality may be less so. Then again, Tableau enables you to create elaborate interactive data applications that can be published online and work on all common browser types, including tablets and mobile handsets. For the non-programmer who sees data visualisation as an important part of their job, Tableau is probably the tool to use.
Tableausoftware.com

‘Tale of 100 Entrepreneurs’, created in Tableau, shows inflation adjusted revenue growth for 100 of the largest public software companies

DataGraph is a little-known tool that deserves a lot more attention. A very different beast, DataGraph is a Mac-only application ($90 on the AppStore) originally designed to create proper charts for scientific publications. Nothing we’ve tested comes close to DataGraph when creating crystal-clear, beautiful charts that are also done right as far as most of the information visualisation literature is concerned. The workflow and interface may take a while to get to grips with, and some of the more advanced functionality may lie hidden even from an avid user, but a wide range of samples, aggressive development and an active user community make DataGraph a really interesting solution for professional charting.
visualdatatools.com/DataGraph

R is an open-source programming environment for statistical computing and graphics. It’s a super-powerful tool that takes some programming skills to get started with. However, it is becoming a standard tool for any self-respecting data scientist. R does a lot more than graphics within its interpreted, command-line-controlled environment: it enables all sorts of crunching and statistical computing, even with big datasets. In fact, the graphics are a bit of a weak spot for R. Most output needs polishing in other software, such as Adobe Illustrator, to be ready for publication.
r-project.org

Videos and custom high-resolution graphics

If you are creating data visualisation videos or high-resolution data graphics, Processing is your tool. Processing is an open source integrated development environment (IDE) which uses a simplified version of Java as its programming language and is especially geared towards developing visual applications.

Processing is great for rapid development of custom data visualisation applications that can be run directly from the IDE, compiled into standalone applications or published on the web as Java Applets. Java Applets are less than optimal for web publication, but a complementary open-source project – Processing.js – has ported Processing to JavaScript, using the Canvas element to render the visuals (Canvas is a way to draw and control bitmap graphics in modern web browsers using JavaScript). This is a far superior way to take Processing work online.

The area where we have found that Processing really shines as a data visualisation tool is in creating videos. It comes with a video class called MovieMaker that allows you to compose videos frame-by-frame. Each frame may well require some serious crunching and take a long time to calculate before it is appended to a growing video file, but the results can be stunning.
Processing.org

This image was created using Transversal Lines, an interactive digital art project by Joao Martinho Mourha (jmartinho.net) built using Processing

Charts for the web

There are dozens if not hundreds of programming libraries that allow you to add charts to your websites. Most of them are rubbish. We believe we have tested most of the libraries out there, and there are only two we feel comfortable recommending:

Highcharts is a JavaScript charting library which renders vector-based, interactive charts in SVG (or VML for older versions of Internet Explorer). It is free for non-commercial use and commercial licenses start at $80. It is a flexible and well-designed library that includes all the most common chart types with plenty of customisation and interactivity options. Interestingly, even though Highcharts is a commercial product the source code is available to developers who want to make their own modifications or additions. With plenty of examples, good documentation and active user forums, Highcharts is a great choice for most development projects that need charting.
highcharts.com

A wind rose for the South Shore Met Station in Oregon, created using Highcharts

gRaphaël is another JavaScript charting library, built on top of Raphaël (see next section). Like Highcharts, gRaphaël renders SVG graphics on modern browsers, falling back to VML for IE versions before 9. It is not a very mature library: it has limited capabilities, few chart types, even fewer examples and pretty much non-existent documentation. It is, however, available under proper open source licences and could serve as a basis for great things.
g.raphaeljs.com

Special requirements and custom visualisations

If you want full control of the look, feel and interactivity of your charts, or if you want to create a custom data visualisation for the web from scratch, the out-of-the-box libraries mentioned previously will not suffice. In fact, you’ll be surprised how soon you run into limitations that force you to compromise on design. So if you want to take it up a notch and follow the lead of some of the wonderful and engaging data journalism happening at the likes of the New York Times and The Guardian, the tool for you will probably be one of the following:

Raphaël is gRaphaël’s big brother: a powerful JavaScript library for working with vector graphics. It renders SVG graphics for modern browsers and falls back to VML for Internet Explorer 6, 7 and 8. It comes with a range of good-looking samples and decent documentation. Raphaël is open source, and any developer should be able to hit the ground running with it. We don’t recommend Raphaël for advanced charting, but for entirely custom data visualisations or small data apps it may very well be the right tool for the task.
raphaeljs.com

This polar clock was created using Raphaël. The lines (from outside in) represent seconds, minutes, hours, days and months

Protovis is an open source JavaScript visualisation toolkit. Rather than simply controlling at a low level the lines and areas that are to be drawn, Protovis allows the developer to specify how data should be encoded in the marks – such as bars, dots and lines – that represent it. This approach allows inheritance and scales that enable a developer to construct custom chart types and layouts which can easily take in new data without the need to write any additional code. Protovis natively uses SVG to render graphics, but a couple of efforts have been made to enable VML rendering, making it an option for older versions of Internet Explorer that still account for a significant proportion of traffic on the web.
mbostock.github.com/protovis
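
To make the idea of encoding data in marks more concrete, here is a minimal sketch in TypeScript. It is not the Protovis API: the names linearScale, BarSpec and renderBars are hypothetical and exist only for this illustration. It assumes a bar mark whose properties are each a function of the datum, plus a linear scale, and it writes out plain SVG markup.

```typescript
// Hypothetical sketch of "encoding data in marks"; this is NOT the Protovis API.

// A linear scale maps a data domain onto a pixel range.
function linearScale(
  domain: [number, number],
  range: [number, number]
): (v: number) => number {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return (v) => r0 + ((v - d0) * (r1 - r0)) / (d1 - d0);
}

// A bar mark: every visual property is a function of the datum and its index.
interface BarSpec<T> {
  x: (d: T, i: number) => number;
  width: (d: T, i: number) => number;
  height: (d: T, i: number) => number;
  fill: (d: T, i: number) => string;
}

// Evaluate the mark once per datum and emit one SVG <rect> for each.
function renderBars<T>(data: T[], spec: BarSpec<T>, panelHeight: number): string {
  const rects = data.map((d, i) => {
    const h = spec.height(d, i);
    // SVG's y axis points down, so each bar is anchored to the panel bottom.
    const top = panelHeight - h;
    return `<rect x="${spec.x(d, i)}" y="${top}" width="${spec.width(d, i)}" height="${h}" fill="${spec.fill(d, i)}"/>`;
  });
  return rects.join("");
}

const panelHeight = 100;
const y = linearScale([0, 10], [0, panelHeight]);

const barSpec: BarSpec<number> = {
  x: (_d, i) => i * 25,
  width: () => 20,
  height: (d) => y(d),
  fill: (d) => (d > 5 ? "steelblue" : "lightgray"),
};

// Two different datasets pass through the same spec; no drawing code changes.
const svgA = renderBars([2, 5, 10], barSpec, panelHeight);
const svgB = renderBars([7, 1], barSpec, panelHeight);
```

The point of the design is that the chart is described once, as a mapping from data to visual properties, so feeding in a new array redraws it without touching the drawing logic.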

A Protovis streamgraph, described as a generalisation of stacked area graphs where the baseline is free. See more examples at bit.ly/czvYSB

D3.js is in many ways the successor to Protovis, building on many of the same concepts. The main difference is that instead of having an intermediate representation that separates the rendering of the SVG (or HTML) from the programming interface, D3 binds the data directly to the DOM representation. If you don’t understand what that means – don’t worry, you don’t have to. But it has a couple of consequences that may or may not make D3 more attractive for your needs. The first is that it makes rendering faster, so animations and smooth transitions become more feasible. The second is that it will only work on browsers that support SVG, so you will be leaving behind users of Internet Explorer versions 7 and 8.
d3js.org
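
For readers who do want a feel for what binding data to page elements involves, here is a rough sketch in TypeScript. It is an illustration of the general idea only, not D3’s implementation or API: the function joinByKey and the Point type are hypothetical, and the page is simulated by a plain list of element keys rather than a real DOM. D3’s documentation uses the terms enter, update and exit for the three groups computed here.

```typescript
// Hypothetical sketch of a keyed data join; an illustration, NOT D3's code.

interface Join<T> {
  enter: T[]; // data with no existing element: an element must be created
  update: T[]; // data matching an existing element: modify it in place
  exit: string[]; // element keys with no matching datum: remove them
}

function joinByKey<T>(
  existingKeys: string[],
  data: T[],
  key: (d: T) => string
): Join<T> {
  const incomingKeys = data.map(key);
  return {
    enter: data.filter((d) => existingKeys.indexOf(key(d)) === -1),
    update: data.filter((d) => existingKeys.indexOf(key(d)) !== -1),
    exit: existingKeys.filter((k) => incomingKeys.indexOf(k) === -1),
  };
}

interface Point {
  id: string;
  value: number;
}

// Keys of the elements assumed to be on the page already (made up for the example).
const onPage = ["a", "b", "c"];

// The new dataset: "a" has disappeared and "d" is new.
const nextData: Point[] = [
  { id: "b", value: 4 },
  { id: "c", value: 9 },
  { id: "d", value: 1 },
];

const join = joinByKey(onPage, nextData, (d) => d.id);
```

Because only the entering and exiting elements are created or removed, and the rest are modified in place, there is a stable element to animate for each datum, which is one way to see why transitions become more feasible.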

A chord diagram, produced in D3. These show directed relationships among a group of entities

After thorough research of the available options, we chose Protovis as the base for building out DataMarket’s visualisation capabilities – with an eye on D3 as our future solution when modern browsers finally saturate the market. We see that happening about two years from now.

Hjalmar Gislason is the founder of DataMarket.com
