November 16, 2013
Occasionally I am asked to give a lecture on how to draw good graphs. While I am always tempted to drone on interminably about abstract principles such as minimalism, balance, and consistency, I have discovered that it is much more fun to criticize bad graphs, and to show how they can be improved. But how to acquire a truly bad graph? Easy! Just use the defaults in Microsoft Excel!
Here is an actual example of a graph drawn with those defaults. The data are fictitious, but the ugliness is breathtakingly real. It is sometimes said (unfairly) that engineers lack all sense of graphical design, but I think they must have hired specialists to create something so painfully wrong. But what specifically is wrong, and how can we make it better? Michelangelo once said “I saw the angel in the marble and carved until I set him free.” So here too we will chip away at the obscuring excess, to reveal the beauty that Microsoft tried to hide.
First of all, what purpose is served by the heavy black rectangle that surrounds the graph? It serves two purposes: 1) to obscure useful information, and 2) to waste ink. Let’s remove it.
Better, but still bad. Next we note that the quantity being plotted is identified in three separate places: the vertical axis label, a title above the plot, and a key to the left. Is this really necessary? I think not. Let’s get rid of two of them. Of course a key can be useful when several quantities are plotted together, but not when there is only one. Likewise labels above a plot have their uses, but should be avoided when they are redundant with other information, such as the axis label. We remove the key and the title. Apart from reducing clutter, this substantially increases the area available for the useful parts of the graph.
Now we ask the question: what purpose is served by the gray background? It serves two purposes: 1) to reduce the contrast and thus visibility of the data points, and 2) to waste ink. Get rid of it!
Aaah…so much more cheerful and relaxing to look at! But a few troubling questions remain. For example, what purpose is served by those shadows behind each data point? Do they indicate some exciting three-dimensional aspect to the data? Of course not. But they do serve two purposes: 1) to render ambiguous the actual locations of the data points, and 2) to waste ink! Please people, can we all just agree to never, never, use little shadows to suggest that our data are floating above the page? Thank you. The corrected graph is below. We have removed the shadows and also changed the diamonds to discs for the very important reasons that 1) they are simpler, and 2) I like them better.
Next we note that graphs are usually employed to show a pattern or trend. This pattern is not communicated well by a set of individual points floating out there, each an island, entire of itself. Only connect! A line drawn between the points aids enormously in conveying the visual sense of the data.
Next we correct an obvious (except to the Microsoft designers) flaw: the axis number labels running through the middle of the graph. We move them where they belong: to the axis, outside the graph.
Now we are getting somewhere. It almost looks ok. But we can do better. Gridlines can serve a purpose – for example, to let the reader easily judge approximate values – but there is never a reason for them to be dark and heavy, and to mask the useful information in the figure. Lighten up! In fact, the gridlines should generally be as light as possible, and still be visible. In this example, we make one gridline a bit darker than the others, to identify the y = 0 line.
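For readers who draw their graphs in code rather than in Excel, the same idea can be sketched in Python with matplotlib. This is only an illustration under stated assumptions: matplotlib is a stand-in for the Excel settings used in this post, the data are made up, and the particular gray levels and line widths are one plausible choice, not a rule.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt

# Made-up data for illustration only; not the data plotted in this post.
x = [0, 1, 2, 3, 4, 5, 6]
y = [-2.0, -0.5, 1.0, 2.5, 1.5, 0.5, -1.0]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o", color="black")

# Gridlines in a pale gray with a thin stroke, drawn behind the data.
ax.grid(True, color="0.9", linewidth=0.6)
ax.set_axisbelow(True)

# One slightly darker horizontal line identifies y = 0.
zero_line = ax.axhline(0, color="0.6", linewidth=0.8, zorder=1)
```

Calling `fig.savefig("light_grid.png")` afterwards would write the figure to a file. The point is the ordering of visual weight: data darkest, the zero line next, the remaining gridlines barely there.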
Now we see that the data really stand out. But we can do better still. What remains to distract the eye from the data? Well, we could try removing the gridlines altogether, and then there is no need for the top and right borders of the frame.
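In matplotlib (again an assumed stand-in for Excel, with made-up data), dropping the gridlines and the top and right borders might look like the following minimal sketch.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt

# Made-up data for illustration only; not the data plotted in this post.
x = [0, 1, 2, 3, 4, 5, 6]
y = [-2.0, -0.5, 1.0, 2.5, 1.5, 0.5, -1.0]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o", color="black")

# No gridlines at all.
ax.grid(False)

# Hide the top and right borders; the left and bottom ones still carry the scales.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
```

Only the two borders that actually carry tick marks and numbers survive.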
Next we ask: what is the purpose of the bold font on the axis labels? Of course, it is to waste ink. Using a bold font for your labels is like writing your emails in all upper case. It is the digital equivalent of shouting. Don’t do it. Use your indoor voice.
And finally (yes, finally) we can reduce the line weight of the remaining axes. All we really need is enough weight to see them, and note their positions.
Thus we arrive at our final graph. It is not particularly exciting, but the data are clear, the trends are evident, and there is little to distract the eye from the essential information. Clearly, not all graphs are this simple, and there are often reasonable justifications for more elaborate presentations. But it is often a good idea to start with the simplest possible presentation, and elaborate from there.
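For those who would rather skip Excel altogether, here is a rough sketch of how a comparable final graph could be produced in Python with matplotlib. Everything in it is an assumption made for illustration: the helper name `minimal_plot`, the data, the axis labels, and the specific line weights are all invented, and matplotlib's defaults already avoid several of the sins listed above (it adds no shadows, title, or key unless asked).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt


def minimal_plot(x, y, xlabel, ylabel):
    """Hypothetical helper: draw one data series with as little ink as possible."""
    fig, ax = plt.subplots(facecolor="white")
    ax.set_facecolor("white")  # no gray background

    # Plain discs joined by a line; no title and no key.
    ax.plot(x, y, marker="o", linestyle="-", color="black", linewidth=1.0)

    # Axis labels in a regular (not bold) weight.
    ax.set_xlabel(xlabel, fontweight="normal")
    ax.set_ylabel(ylabel, fontweight="normal")

    # No gridlines, no top or right border.
    ax.grid(False)
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)

    # Remaining axes and tick marks drawn with a light line weight.
    for side in ("left", "bottom"):
        ax.spines[side].set_linewidth(0.5)
    ax.tick_params(width=0.5)
    return fig, ax


# Made-up data and labels for illustration only.
x = [0, 1, 2, 3, 4, 5, 6]
y = [-2.0, -0.5, 1.0, 2.5, 1.5, 0.5, -1.0]
fig, ax = minimal_plot(x, y, "Time (s)", "Signal (arbitrary units)")
```

Adding `fig.savefig("final_graph.png")` would write the result to disk. Starting from a helper like this and adding elements only when the data demand them follows the same "simplest first" approach described above.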
We conclude with the motto of this presentation, and indeed of this entire blog:
“Less ink, more think.”