A challenge
Sep 27, 2007
The Gelman blog has issued a challenge on how to present the following Venn diagram in a more comprehensible way. This one is pretty tough.
Antony Unwin sent in this entry:
As this report from the Department of Transportation makes clear, congestion on our roadways causes travellers to add "buffer time" to their planned journeys. So, for instance, one may have to allocate 32 minutes for a trip that would have taken 20 minutes in uncongested traffic if one would like to guarantee on-time arrival. The 12 minutes would either become time spent sitting on the road or wasted time due to arriving too early.
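The arithmetic behind that example can be sketched in a few lines of Python. The function names and the ratio computed here are illustrative assumptions of this sketch, not measures defined in the Department of Transportation report; only the 32-minute and 20-minute figures come from the example above.

```python
def buffer_time(planned_minutes, uncongested_minutes):
    """Extra minutes a traveller budgets on top of the uncongested trip time."""
    return planned_minutes - uncongested_minutes


def buffer_ratio(planned_minutes, uncongested_minutes):
    """Buffer expressed as a fraction of the uncongested trip time.

    Hypothetical helper for illustration; the report may normalize differently.
    """
    return buffer_time(planned_minutes, uncongested_minutes) / uncongested_minutes


# Figures from the example: allocate 32 minutes for a 20-minute trip.
extra = buffer_time(32, 20)    # 12 minutes of buffer
share = buffer_ratio(32, 20)   # buffer is 60% of the uncongested time
print(f"buffer: {extra} min, or {share:.0%} of the uncongested trip")
```

Expressing the buffer as a share of the uncongested time is one way to compare trips of different lengths, though whether that is the right normalization depends on the question being asked.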
Buffer time can be applied to graphs too. Some graphs require readers to spend time fishing out the information. The chart used to illustrate travel time belongs to this category. The clock analogy fails; in fact, it confuses matters as the hour hand just sits there serving no purpose. The buffer time between staring and comprehending is too much!
Only four numbers underlie this chart: travel time when uncongested and buffer time to guarantee on-time arrival, for 1982 and 2001. The following version gets to the point without fuss. It shows that the travel time increased significantly even under uncongested traffic; worse, the buffer time multiplied.
Reducing buffer time is always good but some buffer time may be inevitable. In the traffic analogy, to eliminate all buffer time would mean lots of unused capacity. In the context of graphs, more complicated charts would require more time; the key is whether the reader is rewarded for the time spent figuring out the chart.
Source: "Traffic Congestion and Reliability", Department of Transportation.
This chart from the NYT was intended to show how the EPA has moved the bar on vehicle mileage ratings: 2008 estimates were lower than 2007 estimates across the board, regardless of manufacturer, model and city/highway.
The chart was built from one basic component, repeated for each model. I like the discreet gridlines (the white ticks) which enable readers to count off the mileage ratings.
The data are rich: ratings were given along three dimensions (model, year of estimate and city/highway). Readers would benefit from stronger guidance on where to look for the most pertinent information. As it stands, the chart is merely a container for the data. It fails our self-sufficiency test: all the data were printed on the chart, and the bars add little.
In the junkart version, I use knowledge of the data to structure the chart. First, noting that sedans, hybrids and trucks/SUVs/minivans have different levels of mileage ratings, I clustered the models into three groups. Secondly, the city and highway ratings were separated into two columns as I consider the between-model comparisons more important than city-highway comparisons. The chart is a dot plot, with a vertical tick for 2007 estimates and a dot for 2008 estimates. It's easy to see that all dots sit to the left of vertical ticks.
More subtly, we can also see that the hybrids appeared to have been penalized more. Or perhaps, the higher the rating, the larger the downward adjustment...
Source: "Mileage Ratings Are Still Estimates, Though Closer to Reality", New York Times, Sept 16 2007.
Graphs are indispensable if one is to make sense of large data sets. Kraig W. pointed us to some of the "bump charts" he made of the 2007 Tour de France, and indeed they are quite powerful. (Because of the amount of data, you'd need to see the pop-up image to make sense of it.)
As someone who only has cursory knowledge of the Tour, I learnt a lot from this graph alone. The chart traced the ranking of each rider through 20 stages in the competition.
Would it have been better to plot the "lag times from the leader" rather than ranks? Hard to say. Plotting time differentials would tell us more, since ranks discard the magnitude information. However, it could make the chart look even messier.
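To see what ranks throw away, here is a minimal Python sketch that computes both ranks and time gaps to the leader. The rider names and times are made up for illustration and are not taken from the Tour data; the function name is likewise hypothetical.

```python
def ranks_and_gaps(times):
    """Return (rider, rank, gap_to_leader) tuples, fastest rider first.

    `times` maps rider name to cumulative time in seconds.
    Ties are not given special treatment in this sketch.
    """
    leader_time = min(times.values())
    ordered = sorted(times.items(), key=lambda item: item[1])
    return [
        (rider, rank, seconds - leader_time)
        for rank, (rider, seconds) in enumerate(ordered, start=1)
    ]


# Hypothetical cumulative times in seconds (invented, not real Tour results).
sample_times = {"Rider A": 3600, "Rider B": 3605, "Rider C": 4200}

for rider, rank, gap in ranks_and_gaps(sample_times):
    print(f"{rank}. {rider}: +{gap} s")
```

In this made-up example the second and third riders are one rank apart, yet one trails the leader by 5 seconds and the other by 600. A bump chart of ranks would draw them on adjacent lines; a chart of gaps would show the difference but would have to cope with a much wider scale.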
Graphs are efficient in transferring knowledge. Imagine having to stare at a large table of rankings instead!
Source: BikeTechReview.com, KDUBlog, July 30 2007.
At first, this looks like a decent chart despite the donut construct, which I cannot stand (but the Economist loves).
The accompanying text proclaimed: "Rock stars are famous for excess, and some pay the price". The rest of the paragraph points out drug- and alcohol-related deaths, plus deaths due to "unhealthy lifestyles", which apparently include cancer and cardiovascular disease.
There is a gaping hole between what's on the chart and what's in the text. They just talk past each other.
Charting is much more than just aesthetics. Some basic statistical common sense goes a long way. This was observed long ago by Huff.
Source: "Rock stars: live fast, die young", Economist, Sept 4 2007.