
Structuring a chart

This chart from the NYT was intended to show how the EPA has moved the bar on vehicle mileage ratings: 2008 estimates were lower than 2007 estimates across the board, regardless of manufacturer, model and city/highway.

The chart was built from one basic component, repeated for each model. 
I like the discreet gridlines (the white ticks) which enable readers to count off the mileage ratings.

The data is rich: ratings were given along three dimensions (model, year of estimate and city/highway). Readers can benefit from stronger guidance on where to look for the most pertinent information. As the chart stands, it is merely a container for the data. It fails our self-sufficiency test: all the data were printed on the chart, and the bars add little.

In the junkart version, I use knowledge of the data to structure the chart. First, noting that sedans, hybrids and trucks/SUVs/minivans have different levels of mileage ratings, I clustered the models into three groups. Secondly, the city and highway ratings were separated into two columns as I consider the between-model comparisons more important than city-highway comparisons.
The chart is a dot plot, with a vertical tick for 2007 estimates and a dot for 2008 estimates. It's easy to see that all dots sit to the left of vertical ticks.

More subtly, we can also see that the hybrids appear to have been penalized more. Or perhaps, the higher the rating, the larger the downward adjustment...
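For readers who want to try the same structure, here is a minimal text-only sketch of the clustered dot plot described above. The model names and mpg values are hypothetical placeholders, not the figures from the NYT chart, and the function names `dot_row` and `render` are made up for this illustration. It shows a single column; the actual redesign has one column each for city and highway.

```python
# Hypothetical placeholder ratings in mpg; these are NOT the NYT figures.
# Each entry: (group, model, 2007 estimate, 2008 estimate).
RATINGS = [
    ("Sedans", "Sedan A", 30, 27),
    ("Sedans", "Sedan B", 26, 24),
    ("Hybrids", "Hybrid A", 55, 46),
    ("Hybrids", "Hybrid B", 48, 41),
    ("Trucks/SUVs/minivans", "Truck A", 18, 16),
    ("Trucks/SUVs/minivans", "SUV A", 20, 18),
]


def dot_row(est_2007, est_2008, lo=10, hi=60):
    """Return one text row: '|' marks the 2007 estimate, 'o' the 2008 one."""
    cells = [" "] * (hi - lo + 1)
    cells[est_2007 - lo] = "|"
    cells[est_2008 - lo] = "o"  # drawn last, so it wins if both coincide
    return "".join(cells).rstrip()


def render(ratings):
    """Build the clustered chart as a list of lines, one cluster at a time."""
    groups = []
    for group, *_ in ratings:
        if group not in groups:
            groups.append(group)  # keep clusters in first-seen order
    lines = []
    for group in groups:
        lines.append(group)
        for g, model, y07, y08 in ratings:
            if g == group:
                lines.append(f"  {model:<10}{dot_row(y07, y08)}")
        lines.append("")  # blank line acts as a divider between clusters
    return lines


print("\n".join(render(RATINGS)))
```

Because every row shares the same horizontal scale, a dot to the left of its tick means the 2008 estimate is lower than the 2007 one.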

Source: "Mileage Ratings Are Still Estimates, Though Closer to Reality", New York Times, Sept 16 2007.



Hadley Wickham

Why not sort the y-axis by mpg? Then the three groups would naturally fall out, and there wouldn't be the large "jumps" between the different groups.


Hadley: I was thinking from a consumer perspective, where you're either in the market for a sedan or a truck. I'd have put clear dividers between the three groups.


It would be interesting to see this expressed and visualised as a percentage reduction. I guess this must be how they calculate the changes.


Jens, or plotted on an exponential scale, which will show up the same thing: whether there is a constant ratio between "before" and "after".

The exponential scale will have the advantage of not destroying the original values, as a percentage operation does.

(The mischievous side of me wants to find the size of the fuel tanks in gallons, plot that as a log-log scatter graph, and draw diagonal lines for the nominal range of the cars on a single full tank :-)
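The point about constant ratios can be checked with a few lines of arithmetic. This is only a sketch: the pairs below are hypothetical values chosen so that every rating drops by the same factor, and the "exponential scale" is read here as a logarithmic axis, which is an assumption about what was meant. The helper names `pct_reduction` and `log_gap` are invented for the example.

```python
import math


def pct_reduction(before, after):
    """Percentage drop from the earlier estimate to the later one."""
    return 100.0 * (before - after) / before


def log_gap(before, after):
    """Distance between the two estimates on a log10 axis."""
    return math.log10(before) - math.log10(after)


# Hypothetical (2007, 2008) pairs, each cut by the same factor of 0.9.
pairs = [(20, 18), (40, 36), (60, 54)]

for before, after in pairs:
    print(
        f"{before:>3} -> {after:>3}  "
        f"drop={before - after:>2} mpg  "
        f"pct={pct_reduction(before, after):.1f}%  "
        f"log gap={log_gap(before, after):.4f}"
    )
```

The absolute drop grows with the rating while the percentage and the log-axis gap stay the same, so equal gaps on a log axis would indicate a constant ratio without replacing the original mpg values.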



I agree with Hadley. Why?

Because the whole point of having hybrid cars is that people will choose them over cars with high (and unknown future) running costs. So they are 'sedans' really.

If you want an ordering variable I would use the product of wheelbase & distance between front & back wheels. This is a measure of usable space for people or load in the vehicle. You might sqrt that product too of course.

If you sorted by mileage you could colour by vehicle 'group'.

I do not think miles per full tank will work well because manufacturers put bigger tanks in vehicles that use more petrol. Miles per $100 might work better.
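A rough sketch of the two measures, using hypothetical vehicles and an assumed fuel price (none of these numbers come from the chart; the function names are invented for the example):

```python
def tank_range_miles(mpg, tank_gallons):
    """Nominal distance on one full tank."""
    return mpg * tank_gallons


def miles_per_100_dollars(mpg, price_per_gallon):
    """Distance that $100 of fuel buys at the given price."""
    return mpg * 100.0 / price_per_gallon


PRICE = 4.0  # assumed dollars per gallon, for illustration only

# Hypothetical vehicles: (name, mpg, tank size in gallons).
vehicles = [("Thirsty truck", 20, 25.0), ("Frugal sedan", 40, 12.5)]

for name, mpg, tank in vehicles:
    print(
        f"{name:<14} range={tank_range_miles(mpg, tank):.0f} miles  "
        f"per $100={miles_per_100_dollars(mpg, PRICE):.0f} miles"
    )
```

With these made-up numbers both vehicles have the same range per tank, which hides the difference in efficiency, while miles per $100 keeps it. At a single fixed price, miles per $100 is just mpg multiplied by a constant, so it orders the vehicles exactly as mpg does.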

Sorry, I am late to the commenting party...
Nice blog though. :-)

