A dimension is a terrible thing to waste

Graphs consist of ink spread out over two spatial dimensions. Sometimes the ink is colored, sometimes it is electronic ink, but the point is the same: to use the innate two-dimensional pattern-seeking machinery of human vision to expose some pattern in the data. Because we have only two spatial dimensions, each is highly valuable, and should not be squandered.

Here is a recent graph showing salaries in various life science specializations in the years 2009 and 2010. The graph consists of various vertical bars, each for one specialization in one year, with a height proportional to salary.

The various bars are arrayed along the horizontal dimension. First question: what is the meaning of the horizontal dimension? Take your time. Think carefully. OK, give up? The answer is…nothing! The data are not sorted alphabetically by specialization, by rank in 2009, by rank in 2010, or by any other discernible criterion. This is an example of wasting a dimension. As noted above, spatial dimensions of a graph are extremely valuable, and should never be squandered. They are the levers that we use to lift your consciousness of trends or singularities in the data.

Here we could at least use the horizontal dimension to rank the various specializations, for example by 2009 income. This is illustrated in our cleaned-up example shown below. Now we can quickly see the best-compensated specializations in 2009. We have indicated the two years with connected colored lines rather than discrete bars. We have labeled the specializations, but in a muted gray so that they can be read but do not distract from the data.
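For readers who want to try this on their own data, here is a rough sketch of the idea in Python. It is not the code behind our replacement graph. The salary figures are made-up placeholders, not values from the survey, and the names `salaries`, `rank_by_year`, and `plot_ranked` are invented for illustration. The plotting step assumes matplotlib is installed; the ranking step needs only the standard library.

```python
# Placeholder figures for illustration only; these are NOT the survey's numbers.
salaries = {
    "Specialization A": {2009: 70_000, 2010: 72_000},
    "Specialization B": {2009: 95_000, 2010: 91_000},
    "Specialization C": {2009: 82_000, 2010: 85_000},
}


def rank_by_year(data, year, descending=True):
    """Return specialization names ordered by their salary in `year`."""
    return sorted(data, key=lambda name: data[name][year], reverse=descending)


def plot_ranked(data, order, years=(2009, 2010)):
    """Draw one connected line per year, with the x axis following `order`."""
    import matplotlib.pyplot as plt  # imported here so ranking works without it

    fig, ax = plt.subplots()
    positions = list(range(len(order)))
    for year in years:
        heights = [data[name][year] for name in order]
        ax.plot(positions, heights, marker="o", label=str(year))
    # Muted gray labels: readable, but they do not compete with the data.
    ax.set_xticks(positions)
    ax.set_xticklabels(order, color="gray", rotation=45, ha="right")
    ax.set_ylabel("Salary")
    ax.legend()
    return fig


# The horizontal dimension now carries meaning: rank by 2009 salary.
order = rank_by_year(salaries, 2009)
```

Calling `plot_ranked(salaries, order)` then draws the two years as lines over the ranked axis, so a rise or fall in 2010 shows up as the second line sitting above or below the first.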

The original graph also has a number of other features worthy of ridicule. What are those little blue flags flying everywhere? Tibetan prayer flags? No, they are labels that tell us the actual numerical value of each salary. Excuse me, if we wanted a table we would have asked for a table. The point of a graph is to let the story be told by shape and position, not by a bunch of numbers. Apart from being unnecessary, the blue tags add clutter, one of the great villains of graph design. Here clutter masks the vital information in the graph, and obscures the important contour: the tops of the bars.

Another odd feature of the graph is the set of arrowheads at the top or bottom of the bars for 2010. What are they for? Some study will reveal that they are telling us whether that specialization increased or decreased in 2010. Did that jump out at you? I thought not. And of course some of the arrowheads are at the top and others at the bottom, so the eye can never move smoothly across the set and perceive some pattern. In our replacement graph, whether salaries increased or decreased in 2010 is immediately obvious from the relative position of the two curves.

Other minor quibbles with this graph would be: 1) Why is 2009 to the right of 2010? By convention, dates usually increase to the right. 2) Why are the labels in all caps? Is there some reason to shout? 3) The number 2010 in yellow is barely legible. Yellow text on a white background is always a bad idea.

Reference:

The Scientist, Life Sciences Salary Survey, 2010,

http://www.the-scientist.com/fragments/salary_survey/2010/ss-charts2010.jsp
