The direction of time’s arrow

Once again, the target of our arrow of criticism is the estimable New York Times, and their estimable Charles M Blow, whose op-ed contributions are always interesting but almost equally often decorated with sadly defective graphics. In this example, we have a graph that is wrong in at least five ways. Can you spot them? Here is the graph.

The subject of the graph is the change in approval rating of President Obama following the killing of Osama bin Laden, for various selected groups. It is certainly possible to extract the information for any given group from the chart, especially because the artist kindly prints all the numbers, but in this regard it is little better than a table. And a graph should be more than a table: it should use your native perception of form to make a point.

The first error is the use of space. As is often the case with Mr Blow, the graph occupies a remarkable amount of vertical space, considering the modest data it contains. For this reason, you may have to expand the graph just to be able to read its contents. As we will show, these data can be plotted in much less space, with an increase in clarity.

The second thing that is wrong with the chart is the selection of colors. Since before and after are depicted with color, we would like a strong contrast between the two. Instead we get a weak difference in brightness and saturation of two greens. Quick, tell me whether any subgroup showed a decrease in approval! I suspect you had to scrutinize each pair of bars, carefully ensuring that the darker one was shorter.

The third thing that is wrong is that the bar depicting “after” is about twice as wide as the “before” bar. Thus the area of the “after” bar would be much larger even if there were no change in approval. This is potentially confusing, and certainly biased against the “before” figure.

The fourth thing that is wrong is that the bars are overlapping. This makes it harder to see the length of the “before” bar.

The fifth thing that is wrong is that the graph fails to exploit our native sense of how to depict an increase over time. By convention, in graphs time is always shown as proceeding from left to right. And positive quantities are always shown as increasing from bottom to top. The horizontal arrangement of the chart, and the overlapping of the before and after bars, fail to observe either of these conventions.

Another way to be absolutely sure that the viewer understands the direction of time is to actually show it as an arrow. This is especially appropriate when only two points in time are involved.

Correcting all of these errors, we produce the following chart.

While this chart should require no explanation, I will make a few comments on design. First, unlike the New York Times, I do not have an army of graphologists to tweak my product to perfection. This is a first draft, created in a couple of minutes, and could doubtless be improved. But it clearly shows that every group showed an increase, and the relative size of each increase. In each bar, time goes left to right, and approval increases from bottom to top, just as we expect. The arrows reinforce each trend with a strong graphic element, while the single green bar shows the absolute values of approval, and ties each arrow to its group name. We omit the actual numbers, but provide a 50% line for guidance.

My chart makes all the essential points, and does so in a way that is immediate and transparent. Mr. Blow’s chart has a certain graphical panache to it, and that is not a bad thing. But panache should never replace clarity.

Reference:

 New York Times
The Bin Laden Bounce
By CHARLES M. BLOW
Published: May 6, 2011
http://www.nytimes.com/2011/05/07/opinion/07blow.html

Show me the correlation!

Suppose that we have two quantities that vary over time. We want to know if their variations are correlated, that is, do they dance to the same tune, or does each march to a different drummer? Of course, we often ask this question because we want to know whether one of the two causes, or at least influences, the other. The most effective way to compare the variations of two quantities is to plot them one against the other.

But sometimes the simplest solution is not good enough for the pop graphologist. (A new term I just invented.) Take Charles M. Blow, the top pop graphologist for the New York Times. (I have commented on his work previously: http://wp.me/p19RFk-a.) Mr. Blow writes excellent columns, on important issues, but he decorates them with grievously flawed graphs. Consider the following graph, from the New York Times of April 16, 2011. Please forgive the gargantuan vertical extent of the graph (you will probably have to click on the graph to see the full extent); we will address this below.

The red line shows the variation over time of the top marginal tax rate in the US. The tiny green bars at the bottom show the % change in GDP from the previous year. The first question we always ask of any graph is: what pops out at you? I thought so: nothing. Now maybe that is Mr. Blow’s point – that there is no relation between the two quantities – but if so, this is hardly the way to show it. Partly because it is the wrong type of graph, and partly because the two quantities are so far apart on the graph, it is difficult to discern any relationship between the two quantities.

As we stated at the outset, the simplest way to graphically display a synchronicity, or in technical terms a correlation, between the two quantities is to plot them against each other. I have done this in the graph below. Using the same data as Mr. Blow, we plot the % change in GDP against the top marginal tax rate.

Now we can immediately discern whether there is a close relationship between the top rate and the changes in GDP. If there were, the points would cluster tightly together, forming some curve describing the relationship. Instead, the points are widely scattered, and so the main point of the article – that there is no strong relationship – is verified.

However! However! Plotting the data in this way actually reveals an additional surprise, completely obscured in Mr. Blow’s graph. There is actually a small POSITIVE relationship between the two quantities! Yes, you heard me right. In yet another death knell for supply-side economics, we see that HIGHER tax rates lead to LARGER increases in GDP. (Forgive my upper-case outburst, I got a bit excited). Note that the two lowest marginal rates are associated with some of the largest declines in GDP, and two of the largest increases in GDP are associated with marginal rates near 90%! The red line shows the best-fitting linear relationship between the two quantities, and it climbs slightly as you go from low tax rates to high. True, the effect is weak. The slope of the line is slight; it takes a 17% rise in the top marginal rate to get a 1% rise in GDP. And the degree to which the points cluster around this trend is also weak. We measure this by the correlation statistic (http://en.wikipedia.org/wiki/Correlation_and_dependence), which must lie between -1 and 1, and in this case is a meager 0.26 (0 would mean no relationship at all).
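For readers who want to check a claim like this themselves, here is a minimal sketch of the two numbers quoted above: the correlation statistic and the slope of the best-fitting line. The function name and the data values are made up for illustration; they are not the BEA or Tax Policy Center figures, so the printed numbers will not match the 0.26 and roughly 1/17 reported here.

```python
import math

def pearson_and_slope(xs, ys):
    """Return (Pearson correlation r, least-squares slope of y on x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)   # always lies between -1 and 1
    slope = sxy / sxx                # change in y per unit change in x
    return r, slope

# Hypothetical values for illustration only (NOT the real data).
top_rate = [28.0, 35.0, 50.0, 70.0, 91.0]     # top marginal rate, %
gdp_change = [1.0, -0.5, 3.0, 2.0, 4.0]       # % change in GDP

r, slope = pearson_and_slope(top_rate, gdp_change)
print(f"r = {r:.2f}, slope = {slope:.3f} GDP points per rate point")
```

A slope of about 0.06 would correspond to the "17 points of tax rate per point of GDP" figure quoted in the text.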

But still. Mr Blow could have made a much stronger case if he had used the right kind of graph.

Now we are going to make a few nerdy points about graphs of correlation, and those of you who are only here for the entertainment portion of the show can go back to your other amusements.

In graphs of this sort, it is traditional to put the so-called “independent” variable on the horizontal axis, and the so-called “dependent” variable on the vertical axis. When we assign variables in this way, we are making an assumption about what causes what. That may or may not be reasonable, but it is good to adhere to this convention. In this case, the question addressed in the column is whether tax rates affect growth in GDP, so that is why we assign the quantities to the two axes as we have.

Another feature of a graph like this is the aspect ratio. One failing of Mr. Blow’s graph was the large vertical distance between the separate graphs of the two quantities. The distance was so large because Mr. Blow chose to plot them on the same axis (%). This may have seemed reasonable, since a % is a %, but in fact it is a mistake. When we are exploring the relationship between variation in two quantities we should not presume that we know the ratio between them. It is better to let the data tell us what that ratio might be. The correlation graph does this by plotting the full range of one against the full range of the other. Because there is no reason to do otherwise, we make the two ranges the same size. In other words, the graph has an aspect ratio of one.
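One way to picture "plotting the full range of one against the full range of the other" is to rescale each quantity so that its own minimum maps to 0 and its own maximum maps to 1. The sketch below does just that; the function name and sample numbers are illustrative assumptions, and a plotting package would normally do the equivalent when you set the axis limits.

```python
def rescale_to_unit(values):
    """Map values linearly so min -> 0.0 and max -> 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant series")
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical values: two quantities that are both "percent"
# but span very different ranges.
top_rate = [28.0, 59.5, 91.0]
gdp_change = [-2.0, 1.0, 4.0]

# After rescaling, both span 0..1, so a square plot uses its space fully.
print(rescale_to_unit(top_rate))
print(rescale_to_unit(gdp_change))
```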

As I noted above, Mr Blow’s graph has an enormous vertical extent. It is so big that in native form it will not fit on a typical laptop screen without scrolling. OK, now, for extra credit, tell me the reason for using such a large vertical expansion of the graph? Take your time…no-one is timing you…plenty of time…all the time in the world. Not quite done thinking? Take a few more minutes. Ready? And the answer is…there is no reason whatsoever! Mr. Blow blows up his graphs to ridiculous proportions (usually in the vertical dimension) because he can. He is the big graph honcho at the New York Times!

Now I hesitate to make Mr Blow the poster child for bad graphs, since the intellectual points he makes are always good ones. But an intervention is required, for his sake, for the New York Times’s sake, and for the sake of the reading public.

Reference:

New York Times

The Pirates of Capitol Hill

By CHARLES M. BLOW

Published: April 15, 2011

Data at:

http://www.bea.gov/national/xls/gdpchg.xls
http://www.taxpolicycenter.org/taxfacts/displayafact.cfm?Docid=213

Pie is a continued fraction

The pie chart is a venerable and effective way of showing how some total, say the federal budget, is divided up into its constituent elements, each represented by a “slice” of appropriate angular size. Of course, often these slices will change over time, and it is tempting to portray that trend in a series of pie charts. To paraphrase Richard Nixon, “you could do that, but it would be wrong.”

It would be wrong because it fails to make trends in the data effortlessly and immediately evident to the viewer, because it fails to exploit the human visual system’s automatic mechanisms for perceiving trends. As we have noted with tiresome regularity, the eye is tuned to see contours, and to judge their orientation (look! the market is going up!) but not to quickly judge areas of complex shapes depicted in separate, unconnected parts of a figure.

Here is an example, taken from a New York Times article on how medical device companies bribe doctors to use their products, rather than their competitors’, irrespective of the value to the patient. (This practice would be outlawed under ObamaCare, but perfectly ok under BoehnerCare). The series of pie charts attempts to show how one company (Biotronik) rapidly achieved near-complete market dominance for its pacemaker at one Nevada hospital, after paying the hospital’s cardiologists for “consulting.”

As always, I ask you to take a quick look at the chart, and see what pops out at you. I think the sad answer is: nothing. It requires careful scrutiny, with endless searching back and forth between pies, and between labels and slices of pies, until the presumed point is made: Biotronik went up, suddenly, and everyone else went down, suddenly, to almost nothing. And in fact, everyone else is primarily one company: Boston Scientific. So the point is equally well made by just showing the two companies.

That is what we have done in the following graph.

The trends in the two companies’ fortunes are perceived immediately and effortlessly. And because the graph shows percent market share, and Biotronik is almost at 100%, it is clear they have achieved a near monopoly. There is no need to plot “Other.”

The use of filling in this chart (coloring in the areas below each line) is a judgement call. While filling uses more ink, it can convey a contour better than a line. And since this graph is showing market share, it feels appropriate.

This graph could have been shown in color, but since there are only two categories, and since their trends are so clear, there is no need. Nonetheless we show an example here. As we note below, when many categories are involved, it is helpful to use color as a linking device.

One of the problems in attempting to show trends over a series of pie charts is that the categories within the several charts must be linked. To use the current graph as an example, we need to know which slice in each pie belongs to “Biotronik.” In the Times graphic, two strategies are used to link categories: shades of gray, and text labels.

The use of shades of gray to link the categories in the four pies is particularly weak. As any vision scientist will tell you, the human eye is very bad at remembering or identifying particular shades of gray. You have to remember, because you have to move your eye from one pie to the next. Colors, such as red or blue, suffer from no such weakness. We say they are perceived “categorically.”

Text is also a poor way of identifying the categories. It is unambiguous, to be sure, but requires reading, and moving the eyes back and forth from the label to the slice, all of which disrupts what should be an immediate, effortless “grokking” of the categories.

Consider also the additional clutter introduced by all of the labels. The words “Boston Scientific” are printed out in full three times, as is the word “Biotronik,” while the word “Other” gets repeated four times. The last is particularly ironic, since the category “Other” contributes almost nothing to the discussion.

It might be argued that the timeline, and paragraphs of text, that float above the graph provide historical context. In this case, not much context is really required. In any case, the graph that we have provided can be stretched to suit, or the text, which is really supplementary material, could be attached to the graph through arrows marking significant events in the chronology.

In the present example, there are really only two categories of interest. But often there will be more, in which case another method of plotting might be considered. This is the so-called “stacked,” or as I prefer, “accumulated” graph. In this variant, we add the values of each series on top of each other, so the data for each category is represented by the vertical extent that it occupies. Here is our current example, rendered in this way.

This chart has the advantage that, like a pie chart, it slices up the total into all of its constituents. The share of each category is immediately evident by the share of the vertical height it occupies. But in contrast to the series of pie charts, the trend is immediately and effortlessly evident, because the “slices” are connected.

There are problems with this form of presentation. One must choose the order in which to place the elements, top to bottom. There is no correct order, and this introduces an opportunity for bias or inadvertent misrepresentation. For example here, it seems natural to put the “Other” category at the bottom, but what about the other two? In the example above, placing Biotronik in the middle causes the upper edge of its share to climb precipitously, which matches its growth in share. But this has the unfortunate consequence of causing the lower edge of Boston Scientific’s share to also climb, a visual cue that is contrary to its declining share. Plotting in the reverse order, as shown below, merely reverses the problem.

So use stacked graphs with caution, and only when there are more than two significant categories and the data cannot be better shown with simple line plots, as in our first figure above.
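The stacking arithmetic, and the effect of stacking order described above, can be sketched in a few lines. The share values below are invented for illustration (they are not the figures from the Times article), and `stack_bands` is a hypothetical helper, not part of any plotting library.

```python
def stack_bands(series_by_name, order):
    """For each time step, return {name: (lower_edge, upper_edge)},
    stacking the series from bottom to top in the given order."""
    n_steps = len(series_by_name[order[0]])
    bands = []
    for t in range(n_steps):
        base = 0.0
        step = {}
        for name in order:
            top = base + series_by_name[name][t]
            step[name] = (base, top)   # vertical extent = this category's share
            base = top
        bands.append(step)
    return bands

# Hypothetical market shares in percent at three points in time.
shares = {
    "Other": [10.0, 5.0, 2.0],
    "Biotronik": [10.0, 50.0, 95.0],
    "Boston Scientific": [80.0, 45.0, 3.0],
}

for order in (["Other", "Biotronik", "Boston Scientific"],
              ["Other", "Boston Scientific", "Biotronik"]):
    bands = stack_bands(shares, order)
    print("order:", order)
    for name in order[1:]:
        print("  ", name, [bands[t][name] for t in range(3)])
```

With Biotronik in the middle, the lower edge of the Boston Scientific band rises even though its share shrinks; with the order reversed, the lower edge of the Biotronik band falls even though its share grows.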

The lesson here is: don’t use a series of pie charts to show a trend. It doesn’t work. A line graph is always better.

A subsidiary lesson is to use caution when using stacked graphs. The arbitrary order of stacking can convey different impressions.

Reference:

New York Times

Tipping the Odds for a Maker of Heart Implants By BARRY MEIER Published: April 2, 2011

http://www.nytimes.com/2011/04/03/health/03implant.html

Letting it all hang out

Sometimes, when you look at a graphic, you can sense the frustration of the artist. There is, after all, rarely one best way to plot a set of multidimensional data, and sometimes the artist gives up and tries to show everything. The result is invariably a mess. Consider the graphic below, from the New York Times, in an article comparing economic growth and progress in health in various global regions. The main thesis is that there is a disconnect between the two: improvements in health may occur in the absence of significant economic progress. But the artist has chosen to present the data as a giant smörgåsbord of options, in a table in which four columns show life expectancy and GDP in four different ways. First there is a little graph showing growth in life expectancy over a decade, then the total gain is shown as a number, then (for some inexplicable reason) the growth in GDP is shown by two disks, and finally, the total GDP growth is shown as a percentage. As always, I ask the central question: “What jumps out at you?” Here, sadly, the answer is: nothing.

Consider instead the graph below. Here we have made some hard choices. We have dispensed with the first and third columns, and plotted the gain in life expectancy against the percentage gain in GDP. We show the final GDP by the area of the points. This graph makes the essential point quickly and efficiently: there is no obvious correlation between GDP and gains in life expectancy. The outliers, regions with big gains in life expectancy but little growth in GDP, such as Latin America and the Middle East, jump out from the pack, as do regions with enormous economic growth, but little change in life expectancy.
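A small caution when showing a quantity "by the area of the points": the radius must grow with the square root of the value, or large values will be exaggerated. Here is a minimal sketch of that scaling. The function name, the GDP figures, and the maximum radius are all illustrative assumptions.

```python
import math

def radius_for_area(value, max_value, max_radius):
    """Radius such that the disk's AREA is proportional to value.
    The largest value is drawn with radius max_radius."""
    if value < 0 or max_value <= 0:
        raise ValueError("values must be non-negative and max_value positive")
    return max_radius * math.sqrt(value / max_value)

# Hypothetical final GDP values (arbitrary units) for three regions.
gdp = {"Region A": 100.0, "Region B": 25.0, "Region C": 4.0}
largest = max(gdp.values())

for name, value in gdp.items():
    r = radius_for_area(value, largest, max_radius=10.0)
    area = math.pi * r * r
    print(f"{name}: radius {r:.1f}, area {area:.1f}")
```

A region with one quarter of the largest GDP gets half the radius, and therefore one quarter of the area.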

The lesson here is: make the hard choices; less is more. Show only the data you need to make your point.

Reference:

New York Times, “Hopeful Message About the World’s Poorest” By DAVID LEONHARDT

Published: March 22, 2011

Permalink: http://www.nytimes.com/2011/03/23/business/economy/23leonhardt.html

We did it! (again)

Anyone living in the current millennium who is even moderately engaged in the politics of our time receives a daily onslaught of political email. Most of this is designed to stoke our outrage at the latest unspeakable act by the other side, and beseech us to send just a few dollars to answer this assault on all that is good. And sometimes the messages exhort us to send our money so that we can show that our side is collecting the most money, and must therefore be most in tune with the public. And if we have done as we are told, at the end of the fundraising cycle, we may get a message with the exultant cry: “We won!” As if the real battle was over fundraising, rather than the deficit, health care, two wars, or financial regulation.

These messages annoy me somewhat, but nothing sends me into apoplexy like a bad graph. But the latest missive had a doozy: a graph that should shame the artist into permanent retirement from the field of graphology.

Behold Exhibit A: a graph that purports to show the fundraising results for Democratic and Republican groups in the last fundraising cycle. And as the DCCC says: “We did it!”

The numbers tell one tale: evidently $4.4M for the dems and a measly $3.0M for their opponents.

But there is something a bit weird about this picture. The blue bit seems out of proportion. As a professional graphologist (don’t try this at home) I quickly noted one obvious possible source of distortion: the width of the bars. Why is the blue bar wider? Are they using area to represent the number of dollars? That would be ok, though I have my doubts about our ability to judge relative areas. But when I took out my digital ruler, I discovered to my amazement (not!) that the areas were not in the ratio of 4.4/3 = 1.47, but instead in the ratio 2.54! Yes that’s right, the blue rectangle has 2.54 times the area of the red rectangle. If area were accurately representing dollars, then the NRCC would have raised only $1.73M, not $3.0M. The red area is off by 73%!

Ok, I thought, so the increased width of the blue rectangle is just some sort of rhetorical flourish. It must be the height that represents the two quantities. Again, amazement overcame me. The heights are wrong too! The blue is 1.72 times higher than the red, rather than 1.47 as it should be.

In other words, the only thing the artist got right was that the DCCC got more dollars than the NRCC. For this, we need a graph?
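The arithmetic above is easy to reproduce. The sketch below uses only the dollar figures and the measured ratios quoted in this post; the function names and the 100-unit reference height are illustrative choices of mine.

```python
def implied_value(known_value, drawn_ratio):
    """Value the smaller bar would represent if the drawn ratio
    (larger / smaller) were taken at face value."""
    return known_value / drawn_ratio

def correct_height(value, reference_value, reference_height):
    """Bar height proportional to value, on an axis that starts at zero."""
    return reference_height * value / reference_value

dccc, nrcc = 4.4, 3.0     # millions of dollars, as printed on the graph
area_ratio = 2.54         # measured blue/red area ratio
height_ratio = 1.72       # measured blue/red height ratio

by_area = implied_value(dccc, area_ratio)
by_height = implied_value(dccc, height_ratio)

print(f"true ratio of dollars:        {dccc / nrcc:.2f}")
print(f"NRCC total implied by area:   ${by_area:.2f}M")
print(f"NRCC total implied by height: ${by_height:.2f}M")
print(f"actual NRCC exceeds area-implied total by {nrcc / by_area - 1:.0%}")
print(f"red height if blue is 100 units: {correct_height(nrcc, dccc, 100.0):.1f}")
```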

Ok, so what would the correct graph look like? Assuming we use height to represent dollars, here it is. We put some numbers on the axis to make it clear that zero is included.

So, we come to the end of our sad tale. What is the lesson? Simple: if you are going to use a graph to represent numbers, make sure it actually represents the numbers.

Reference:

Email of March 8, 2011, from Rep. Steve Israel, DCCC Chairman <dccc@dccc.org>.

Beyond compare

Comparisons are the lifeblood of graphing. When we plot two points, the eye immediately judges their relative positions, and a conclusion is drawn. Usually, of course, we have more than two points, and more elaborate conclusions may be drawn about trends, reversals, and the like. And sometimes we want to compare two (or more) complete graphs. A reasonable desire, but one fraught with peril.

Here is a recent graph from the estimable New York Times economic columnist David Leonhardt. It compares the recent economic performance, represented by the Gross Domestic Product, of three nations: US, Germany, and the UK. The point of the article is that austerity, as practiced in Germany, has caused that nation’s recovery to lag behind the US, which has pursued a more stimulus-driven approach. The primary evidence given is the rightmost points in the graph: at the end of 2010 Germany appears not to have recovered to the same point as the US.

To allow for easier comparisons, we re-create this graph, using data obtained from http://stats.oecd.org.

Notice that the graphs are compared by indexing them to a single point: the first quarter of 2008. This might seem reasonable; after all, that was the peak, or near peak, for all three economies, and could be taken as the starting point for our recent travel through the valley of (economic) death. And the graphs have to be indexed, or somehow converted to a common unit, since the absolute sizes of the economies are very different.

But there is a serious problem with picking one point on the graph for indexing. Note that the German economy had a spike just in that quarter, while the US economy showed a dip. Using that point for indexing favors the US in subsequent comparisons.

To make this point more concrete, suppose we chose the previous quarter (the last quarter of 2007), for indexing? That is also reasonable, since it marked the point at which US growth stalled. Here is the resulting graph.

Whoa! All of a sudden, Germany is right up there with the US in terms of its recovery. Both nations are back to 100% of the value in the fourth quarter of 2007. Note also that this matching of US and German economies seems more reasonable than the previous, since the two curves, rather than a single point, are better matched.
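To see how much the choice of base quarter matters, here is a minimal sketch of single-point indexing. The two series are invented to mimic the situation described (one country with a spike in one quarter, the other with a dip); they are not the OECD data, and `index_to_base` is a hypothetical helper.

```python
def index_to_base(series, base_index):
    """Express every value as a percentage of the value at base_index."""
    base = series[base_index]
    return [100.0 * v / base for v in series]

# Hypothetical quarterly GDP levels (arbitrary units).
country_a = [98.0, 100.0, 103.0, 97.0, 100.0]   # spike at index 2
country_b = [99.0, 100.0, 99.0, 97.0, 101.0]    # dip at index 2

for base in (2, 1):
    a_final = index_to_base(country_a, base)[-1]
    b_final = index_to_base(country_b, base)[-1]
    print(f"base quarter {base}: A ends at {a_final:.1f}, B ends at {b_final:.1f}")
```

With these made-up numbers, indexing at the spike quarter leaves country A's final value below 100, while indexing one quarter earlier puts it at exactly 100. Nothing about the underlying data has changed.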

But this leads us to note that an altogether more reasonable way of matching several curves, that avoids the transient blips of one nation’s fortunes, is to match the entire curves as well as possible. This amounts to matching their average values. We have done this in our next graph. Each curve is now plotted as a percentage of the average over the interval.

Again, we see that the US and Germany are effectively tied in their recovery. In this graph, even the UK doesn’t look quite as bad. The main difference between the US and Germany, viewed in this way, is in the magnitude of their deviations: Germany has swung more wildly from boom to bust.

Even this equitable way of comparing curves has a problem: it depends on the interval over which we average. It would (probably) make no sense to allow economic patterns from a century ago to influence a comparison of the recent trajectories of the US, Germany, and the UK. So the average should be local, restricted to values in the neighborhood of the relevant data. But how local? Alas, there is no fixed answer. It depends. And that means that it is a variable that can be manipulated to make a point.
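Indexing to the average, and its dependence on the averaging interval, can be sketched the same way. The series is again invented for illustration, and the `start` and `stop` parameters are my own way of expressing the averaging window, not anything taken from the OECD data.

```python
def index_to_mean(series, start=0, stop=None):
    """Express every value as a percentage of the mean of series[start:stop]."""
    window = series[start:stop]
    mean = sum(window) / len(window)
    return [100.0 * v / mean for v in series]

# Hypothetical quarterly GDP levels (arbitrary units).
country_a = [98.0, 100.0, 103.0, 97.0, 100.0]

whole = index_to_mean(country_a)              # average over the full interval
recent = index_to_mean(country_a, start=3)    # average over the last two quarters

print("final value, full-interval average: ", round(whole[-1], 1))
print("final value, recent-window average: ", round(recent[-1], 1))
```

The same final quarter lands at a different height depending on the window, which is exactly the freedom that can be exploited.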

In summary, the student of graphs should always be vigilant for comparisons of several curves that require “indexing.”

References:

Why Budget Cuts Don’t Bring Prosperity By DAVID LEONHARDT

New York Times. Published: February 22, 2011

http://www.nytimes.com/2011/02/23/business/economy/23leonhardt.html?_r=1&scp=2&sq=leonhardt&st=cse

My data from Organisation for Economic Co-operation and Development (OECD): “VPVOBARSA: Millions of US dollars, volume estimates, fixed PPPs, OECD reference year, annual levels, seasonally adjusted.” from http://stats.oecd.org/

Turn the tables

Today’s note is about the very worst sort of graph: a table. Of course, a table is not a graph, but that is the point: it almost always should be. Here is a table that appeared recently to make a very simple and powerful point. To further my argument, however, I won’t tell you what that point is, and ask you to deduce it from the table. I will time you. Ready? Go!

Ok. Stop! Time’s up. Did you get the point? I thought not.

Now here are the same data as a graph. Now do you get the point? The graph shows changes in the Gross Federal Deficit, as a percent of Gross Domestic Product, for the last nine presidential administrations. Above the zero line are increases in the deficit (bad). Most of the colored areas above the line (bad) are red (republican) and most of those below the line (good) are blue (democrat). So much for the notion of “tax and spend” democrats and “fiscally responsible” republicans!

But of course this column is scrupulously fair and balanced, so our point here is not the political one, but the graphical one.

The main lesson here is a very elementary one. Never use a table when a graph is possible. Graphs by their nature render data into visible patterns that jump out at the reader. Tables have a purpose – to provide access to numerical values – but they rarely make anything self evident, and they rarely show trends without scrutiny, thought, and calculation.

It may be worth mentioning some of the design decisions that went into my graph.

First, note that there is no horizontal axis. We leave out the dates because they add little to the story. The rectangles are the correct width (four or eight years), and are in the correct order. Adding dates would just be clutter.

Second, we used a rectangle graph, in which the width of each rectangle represents the length of each presidential term. And the vertical axis is deficit change per year. So the area of each rectangle represents the total change during each administration. This is appropriate, since the visual impact is proportional to area, and the total deficit from each administration is the key quantity we wish to convey.
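The geometry of the rectangle graph is just width times height. As a minimal sketch, with invented numbers rather than the figures from the Atlantic table, and with a hypothetical helper name:

```python
def rectangle_for_term(total_change, years):
    """Return (width, height, area) for one administration.
    width  = length of the term in years
    height = change in deficit (% of GDP) per year
    area   = total change over the term"""
    height = total_change / years
    return years, height, years * height

# Hypothetical administrations: (total change in % of GDP, years in office).
terms = {"President X": (8.0, 4), "President Y": (-6.0, 8)}

for name, (total, years) in terms.items():
    width, height, area = rectangle_for_term(total, years)
    print(f"{name}: width {width}, height {height:+.2f}, area {area:+.1f}")
```

Because the height is a per-year rate, a long term and a short term with the same total change get rectangles of the same area, which is the quantity the eye responds to.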

We add the names of the Presidents, since that is of some interest, but put them in light gray, since their identities are not central to the point being made.

We add a small key, to remind readers of the meaning of the two colors. While these colors have become conventional for the two parties, we can’t assume the key will be obvious.

We leave out any extraneous lines, text, shading, or decoration. Just the facts, ma’am.

Reference:  The Atlantic, January 2, 2011.

http://www.theatlantic.com/politics/archive/2010/11/where-did-our-debt-come-from/66530/
