
Bad charts can happen to good people

I shouldn't be surprised by this. No sooner did I sing the praises of Significance magazine (link) than a reader pointed me to some charts that fall short of its standard.

Here is one such chart (link):

Quite a few problems crop up here. The most hurtful is that the context of the chart is relegated to the text. If you read the paragraph above, you'll learn that the data represents only a select group of institutions known as the Russell Group; in particular, Cambridge University was omitted because "it did not provide data in 2005". That omission is a curious decision: the designer weighs one missing year against one missing institution (and a mighty important one at that). This issue is easily fixed by a few choice words.

You will also learn from the text that the author's primary message is that among the elite institutions, little if any improvement has been observed in the enrollment of (disadvantaged) students from "low participation areas". The chart, by contrast, draws our attention to a tangle of up-and-down segments, giving the impression that the data is too complicated to yield a clear message.

The decision to use 21 colors for 21 schools is baffling, as surely no one can make out which line is which school. A good tip-off that you have the wrong chart type is needing more than, say, three or four colors.

The order of institutions in the legend is approximately the reverse of their order on the chart. If software can be "intelligent", I'd hope it could sort the legend entries automatically.
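To illustrate, here is a minimal sketch of how software could order legend entries automatically: sort the schools by each line's final value, so the legend reads top to bottom in the same order the lines end on the chart. The school names and percentages below are made up for illustration.

```python
# Sort legend entries to match the vertical order of the lines at the
# right edge of the chart (largest final value on top).
# School names and percentages are invented illustration data.
series = {
    "Liverpool": [6.8, 7.1, 7.3],
    "LSE": [3.0, 4.5, 2.8],
    "Oxford": [2.6, 2.5, 2.7],
}

# Key on the last data point of each series, descending.
legend_order = sorted(series, key=lambda s: series[s][-1], reverse=True)
print(legend_order)  # → ['Liverpool', 'LSE', 'Oxford']
```

In a plotting library, the same sorted list would be passed as the legend's label order.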

If the whitespace were removed (I'm talking about the space between 0% and 2.25% and between 8% and 10%), the lines could be spread out more, and labels could be placed next to the vertical axes to simplify the presentation. I'd also delete "Univ." with abandon.
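As a sketch of removing that whitespace programmatically, the axis limits can be derived from the data range plus a small pad, rather than defaulting to round numbers like 0% and 10%. The values below are invented for illustration.

```python
# Set axis limits from the data range plus 5% padding on each side,
# instead of defaulting to 0%-10%. Values are made up for illustration.
values = [2.25, 3.4, 5.1, 6.7, 8.0]  # per-school percentages

pad = 0.05 * (max(values) - min(values))
lo, hi = min(values) - pad, max(values) + pad
print(lo, hi)  # roughly 1.96 to 8.29, so the lines fill the panel
```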

The author concludes that nothing has changed among the Russell Group. Here is the untangled version of the same chart. The schools are ordered by their "inclusiveness" from left to right.


This is a case where the "average" obscures a lot of differences between institutions and even within institutions from year to year (witness LSE).

In addition, I see a negative reputation effect: the proportion of students from low-participation areas decreases as reputation increases. I'm basing this on name recognition, so perhaps UK readers can confirm whether it holds. If it does, it's a big miss in terms of interesting features in this dataset.






I wonder if a slopegraph might work even better with just two data points per school: 2005 and 2011. In the same way you applied generous smoothing here, do we really care about all the noise between 2005 and 2011?


Phillip: good suggestion. Instead of a slopegraph, I'd fit individual regressions to each school (this could be one regression with a school effect added). It would take careful analysis, though, because if you look at York, or Queen Mary, or LSE, it's hard for a straight line to do the data justice. A longer time series would help. Knowing the enrollment of each school would also help.
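A minimal sketch of the per-school regression idea, using closed-form simple least squares; the years and rates below are invented, and a real analysis would add the school effect and uncertainty bands discussed above:

```python
# Fit a least-squares trend per school (closed-form simple OLS slope),
# as an alternative to eyeballing the tangle of segments.
# Years and rates are invented for illustration.
def ols_slope(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

years = [2005, 2007, 2009, 2011]
rates = {
    "York": [6.0, 6.8, 6.2, 7.1],   # noisy but drifting up
    "LSE":  [3.0, 4.5, 2.6, 2.8],   # wild swings; a line fits poorly
}
trends = {school: ols_slope(years, ys) for school, ys in rates.items()}
```

A positive slope is a (crude) sign of improving access; the LSE-type series is exactly where a straight line misleads.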

Nick Redfern

Arranging the graph in this way also gives a really good indication of participation by geography, with the most inclusive schools on the left of the graph in the north of England and the most exclusive in the south. This is probably more relevant than reputation, because even if you haven't heard of a particular institution you will have heard of Liverpool (home of the Beatles), Manchester (home of Man. United and The Smiths), or Sheffield (where The Full Monty was set). This cultural recognition is often an important factor for students choosing a university, especially for overseas students.


In Australia we are undergoing a certain amount of deregulation, allowing increased domestic student numbers, so we are going to see similar effects.

When a top-level university increases its student numbers, it can recruit from almost anywhere in its state. Basically it takes the best students, who previously would have just missed out. These students don't come from low-participation areas; they would previously have been the top students at lower-ranked universities.

When a second-tier or lower university increases student numbers, it has to recruit largely from its local area. The only way to do that is to recruit students who otherwise wouldn't have attended university, so these students come from low-participation areas. This also means a decrease in standards, as these students are usually less capable. It does mean more money from the government, which those universities can use to subsidise their research activities.


An irrelevant comment:
page 13, 'Household saving' or cubist reindeers?

Philip Howard

I don't think the charts given here are much of an improvement either. The assertion is that access to elite schools has not improved, so let us examine that, rather than lobbing line charts of one kind or another at the screen.

We need to contrast earlier access with later access, show any common trends or outliers, and give some idea of how access and its change relate to exclusiveness. So make one axis the eliteness of the university (e.g. rank from league tables). Next we need a measure of access; since access per year varies, let's show access as a delta to the overall access for each year. But we are trying to show both level and change, so how about a slopegraph, as Phillip suggested, for each institution from 2005 to 2011, but also indicating whether the movement is significant; LSE's access varies wildly, and a simple slope would not do it justice. Perhaps a fit line through the points, with shading indicating the bounds of the points.
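The delta idea above can be sketched in a few lines: subtract each year's all-school average from every school's rate, so shifts common to the whole sector cancel out. The numbers below are invented for illustration.

```python
# Express each school's access as a delta from the all-school average
# for that year, so sector-wide year effects cancel out.
# Schools and values are invented for illustration.
rates = {
    "Liverpool": {2005: 6.2, 2011: 7.3},
    "LSE": {2005: 3.0, 2011: 2.8},
    "Oxford": {2005: 2.6, 2011: 2.7},
}
years = [2005, 2011]

year_avg = {y: sum(r[y] for r in rates.values()) / len(rates) for y in years}
deltas = {s: {y: r[y] - year_avg[y] for y in years} for s, r in rates.items()}
```

By construction the deltas sum to zero within each year; a school above zero is more accessible than the sector that year, regardless of the overall trend.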

Alternatively, get hold of the ranking data for each year, and plot rank vs access as a path for each institution, picking out Oxford, Cambridge and any outliers, and using tapering or arrows to indicate the motion over time.

The point is to pick the question to ask, and make sure you can answer it at a glance. At a glance, the new chart above tells me only that some universities are more accessible than others; the ranking puts absolute accessibility uppermost, and access has not changed significantly enough at any university to obscure the trend down and to the right, which is just that ranking. In 5 seconds, look at the chart and tell me whether the most elite universities have improved access. A clue: their ranking is not even present on the graph.
