A Harvard mess 1
Jan 08, 2009
Some innocent-looking charts throw up more and more problems the more you look at them. This example comes from a magazine sent to Harvard alumni. We have all heard that the university's endowment fund suffered some horrific losses in the last few months, and so the magazine editor thought it useful to describe the potential impact on the different departments.
It's a safe bet our readers would not think to present two related data series as a combination of one bar chart and one column chart.
As the chart stands, the intended message is completely lost. It takes a bit of fishing to learn that the Radcliffe Institute has a tiny stake in the endowment fund but draws over 80% of its operating funds from the endowment.
Looking a little deeper, we find that the scales of the two charts are not standardized, so the length of a bar and the length of a column cannot be directly compared. Nor can the grid-lines, as each section accounts for 10% in the bar chart but 20% in the column chart (to make it worse, the larger section represents the smaller percentage!).
Looking further still, we find that "Other" accounts for some 15% of the endowment but apparently consists of entities that do not have operating funds, and thus goes missing in the column chart. In our versions below, we will ignore the "Other" category completely; this is equivalent to assuming that the "Other" share has been allocated to the individual schools in proportion to their own shares.
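To see why dropping "Other" and allocating it proportionally come to the same thing, here is a small Python sketch. The school names and percentages are made up for illustration and are not read off the Harvard chart; the function names are ours as well.

```python
def drop_and_renormalize(shares, drop="Other"):
    """Remove one category and rescale the rest so they sum to 100."""
    kept = {name: pct for name, pct in shares.items() if name != drop}
    total = sum(kept.values())
    return {name: 100.0 * pct / total for name, pct in kept.items()}


def allocate_proportionally(shares, drop="Other"):
    """Hand the dropped category's share to the others, pro rata."""
    kept = {name: pct for name, pct in shares.items() if name != drop}
    total = sum(kept.values())
    extra = shares[drop]
    return {name: pct + extra * pct / total for name, pct in kept.items()}


# Hypothetical shares of the endowment, in percent (they sum to 100).
shares = {"School A": 51.0, "School B": 25.5, "School C": 8.5, "Other": 15.0}

renormalized = drop_and_renormalize(shares)
allocated = allocate_proportionally(shares)

for name in renormalized:
    print(name, round(renormalized[name], 2), round(allocated[name], 2))
```

The two columns agree because each school's share is simply multiplied by the same factor, 100 divided by the total without "Other". This only holds when the original shares sum to 100.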
Not to mention that the schools are arranged in alphabetical order, which hides any pattern in the data.
Much of this mess became apparent when we put the two charts into a uniform setting, like this:
A scatter plot provides good information, especially if there is clustering, although we can debate whether it is fit for general publication.
More in our next post.
Reference: "The Endowment: Each school's stake", Harvard Magazine, Jan-Feb 2009.
PS. The initial post switched the axis labels on the two bar charts. Thanks to Jon for pointing this out.
Looks like you've swapped the data from the two original charts in your bar chart reinterpretations.
I don't see why having a common scale is important here, since the two charts show qualitatively different kinds of information: one chart measures % of the whole, while the other has a metric that is independent for each school. That this metric is also expressed as a percentage seems beside the point to me; using a common scale suggests a relationship where there isn't one.
Also, it might read better to sort by % of operations funded by endowment, since there's more interesting variation in that series. Or alternatively, the % of total endowment chart could be expressed in endowment dollars, allowing us to use a log axis to fit the wide range of data.
Posted by: Jon | Jan 09, 2009 at 10:33 AM
A very good first cut at straightening out the mess. I didn't even notice the bar chart mislabeling, because I went right to the XY chart.
Posted by: Jon Peltier | Jan 09, 2009 at 01:32 PM
Jon - thanks for noticing the mislabeling. I have fixed the problem now.
Agree with your point about the scale and the metrics. That's why I put up a second post to talk about their choice of metrics.
Sorting is always an issue when two data series are displayed. There can only be one order and you can only make half the people happy.
Posted by: junkcharts | Jan 11, 2009 at 11:29 PM