## Dataviz is good at comparisons if we make the right comparisons

##### Jul 19, 2022

In an article about gas prices around the world, the Washington Post uses the following bar chart (link):

There are a few wrinkles in this one compared to the most generic bar chart one can produce:

(The numbers on my chart are not the same as Washington Post's. That's because the data vendor charges for data, except for the most recent week. So, my data is from a different week.)

Gas prices are not expressed in dollars; instead, a transformation turns them into a cost-effectiveness metric: miles per dollar, or more precisely, miles per \$40 of gas. The metric runs in the reverse direction - the higher the price, the lower the miles. The data transformation belongs to the D corner of the Trifecta Checkup framework (link). Depending on how one poses the Q(uestion) of the chart, the shift from dollars to miles can bring the Q and the D in sync.
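
The transformation is simple arithmetic, and a minimal Python sketch makes the reverse direction concrete. The function name and the example price are my assumptions for illustration: \$5.18 per gallon is merely the price implied by the 247 miles per \$40 and 32 mpg figures discussed later in this post, not a number taken from the Post's data.

```python
def miles_per_budget(price_per_gallon, mpg=32.0, budget=40.0):
    """Miles one can drive on `budget` dollars of gas.

    price_per_gallon: pump price in dollars per US gallon
    mpg: assumed fuel economy (the chart assumes 32 mpg for every country)
    budget: dollars spent on gas (the chart uses $40)
    """
    gallons = budget / price_per_gallon  # how much gas the budget buys
    return gallons * mpg                 # how far that gas takes the car


# Illustrative price only: about $5.18/gallon gives roughly 247 miles.
print(round(miles_per_budget(5.18)))

# Doubling the price halves the miles: the metric moves opposite to price.
print(round(miles_per_budget(10.36)))
```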

In the V(isual) corner, the designer embellishes the bars. A car icon is placed at the tip of each bar while the bar itself is turned into a wavy path, symbolizing a dirt path. The driving metaphor is in full play. In fact, the video makes the most out of it. There is no doubt that the embellishment has turned a mere scientific presentation into a form of entertainment.

***

Did the embellishment harm visual clarity? For the most part, no.

The worst it gets is in the comparisons of the U.S. with India and with South Africa:

The left column shows the original charts from the article. In both charts, the two cars are so close together that it is impossible to learn the scale of the difference. The amount of difference is a fraction of the width of a car icon.

The right column shows the "self-sufficiency test": imagine the data labels are not on the chart. The test reveals that, when reading the charts on the left, we learn the size of the gap between the two countries from the data labels, not from the visual elements. On the right side, if we really want to learn the gaps, we have to look through the car icons to find the tips of the bars!

This discussion does not necessarily doom the appealing chart. If the message one wants to send with the India/South Africa charts is that there is negligible difference between them, then it is not crucial to present the precise differences in prices.

***

The real problem with this dataviz is in the D corner. Comparing countries is hard.

As shown above, by the miles per \$40 spend metric, the U.S. and India are rated essentially the same. So are the average American and the average Indian suffering equally?

Far from it. The clue comes from the aggregate chart, in which countries are divided into three tiers: high income, upper middle income and lower middle income. The U.S. belongs to the high-income tier while India falls into the lower-middle-income tier.

The cost of living in India is much lower than in the US. Forty dollars is a much bigger chunk of an Indian paycheck than an American one.

To adjust for cost of living, economists use a PPP (purchasing power parity) value. The following chart shows the difference:

The right graph contains cost-of-living adjustments. It shows a completely different picture. Nominally (left chart), the price of gas is about the same in dollar terms in the U.S. and India. In terms of cost of living, gas is actually 5 times more expensive in India. Thus, the adjusted miles per \$40 gas number is much smaller for India than the unadjusted. (Because PPP is relative to U.S. prices, the U.S. numbers are not affected.)
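
Here is a rough Python sketch of the adjustment. The inputs are assumptions, not the actual values behind the chart: the price level ratio (PPP conversion factor divided by the market exchange rate) of 0.2 for India is a round number chosen to match the "5 times" figure above, and the 247 miles is the U.S. value quoted later in this post, reused for India only because the two countries are nominally about the same.

```python
def ppp_adjusted_miles(nominal_miles, price_level_ratio):
    """Scale a nominal miles-per-$40 figure by local purchasing power.

    price_level_ratio: PPP conversion factor / market exchange rate,
    measured relative to the U.S. (so the U.S. itself is 1.0).
    A ratio of 0.2 means a dollar buys five times as much locally,
    so gas priced in dollars is effectively five times as expensive,
    and the same $40 of purchasing power covers one fifth of the miles.
    """
    return nominal_miles * price_level_ratio


# Assumed inputs for illustration only.
us_adjusted = ppp_adjusted_miles(247, 1.0)     # unchanged: PPP is relative to the U.S.
india_adjusted = ppp_adjusted_miles(247, 0.2)  # one fifth of the nominal figure

print(us_adjusted, india_adjusted)
```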

PPP is not the end-all here. According to the Economic Times (India), only 22 out of 1,000 Indians own cars, compared to 980 out of 1,000 Americans. Think about the implication of using any statistic that averages the entire population!

***

Why is gas more expensive in California than the U.S. average? The talking point I keep hearing is environmental regulations. Gas prices may be higher in Europe for a similar reason. Residents in those places may be willing to pay higher prices because they get satisfaction from playing their part in preserving the planet for future generations.

The footnote discloses this non-trivial issue.

When converting from dollars per gallon/liter into miles per \$40, we need data on miles per gallon/liter. Americans notoriously drive cars (trucks, SUVs, etc.) that have much lower mileage than those driven in other countries. However, this factor is artificially removed by assuming the same car with 32 mpg in all countries. A quick hop to the BTS website tells us that the average mpg of American cars is a third of that assumption. [See note below.]

Ignoring cross-country comparisons for the time being, the true number for U.S. is not 247 miles per \$40 spent on gas as claimed. It is a third of that value: 82 miles per \$40 spent.
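
The correction is a simple rescaling, sketched below in Python (the function name is mine). Because miles per \$40 is proportional to fuel economy, swapping the assumed 32 mpg for another value multiplies the result by the ratio of the two. The second call uses the 23 mpg figure from the P.S. at the end of this post.

```python
def rescale_for_mpg(miles, assumed_mpg, actual_mpg):
    """Re-express a miles-per-$40 figure under a different fuel economy.

    miles: the published figure, computed with `assumed_mpg`
    actual_mpg: the fuel economy we would rather use
    """
    return miles * actual_mpg / assumed_mpg


# One third of the 32 mpg assumption, as in the text above.
print(round(rescale_for_mpg(247, 32, 32 / 3)))

# The roughly 23 mpg average from the P.S.
print(round(rescale_for_mpg(247, 32, 23)))
```

Either way, the published figure moves by a large margin once the single-car assumption is dropped.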

It's tough to find data on fuel economy of all passenger cars, not just new passenger cars. I found Australia's number, which is 21 mpg. So this brings the miles per \$40 number down from about 230 to 115. These are not small adjustments.

Washington Post's analysis paints a simplistic picture that presupposes that price is the only thing people care about. I call this issue xyopia. It's when the analyst frames the problem as factor x explaining outcome y, even though factor x is not the only, and frequently not even the most important, factor affecting y.

More on xyopia.

More discussion of Washington Post graphics.

[P.S. 7-25-2022. Reader Cody Curtis pointed out in the comments that the Bureau of Transportation Statistics report was using km/liter as units, not miles per gallon. The 10 km/liter number for average cars is roughly 23 mpg. I'll leave the text as is in the post as the larger point is valid: that there is variation in average fuel economy between nations - partly due to environmental regulation and consumer behavior - and thus, a proper comparison requires adjusting for this factor.]
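
For anyone who wants to check the unit conversion, here is a small Python sketch. The constants are the standard definitions of the mile, the US gallon and the imperial gallon; the function name and the optional imperial-gallon argument are my own additions for illustration.

```python
KM_PER_MILE = 1.609344
LITERS_PER_US_GALLON = 3.785411784
LITERS_PER_IMPERIAL_GALLON = 4.54609


def km_per_liter_to_mpg(km_per_liter, liters_per_gallon=LITERS_PER_US_GALLON):
    """Convert fuel economy from km/liter to miles per gallon.

    Defaults to US gallons; pass LITERS_PER_IMPERIAL_GALLON for UK-style mpg.
    """
    miles_per_liter = km_per_liter / KM_PER_MILE
    return miles_per_liter * liters_per_gallon


print(round(km_per_liter_to_mpg(10), 1))  # average cars
print(round(km_per_liter_to_mpg(16), 1))  # new passenger cars
print(round(km_per_liter_to_mpg(10, LITERS_PER_IMPERIAL_GALLON), 1))
```

The same 10 km/liter reads as a noticeably higher mpg when the gallon is imperial, which is why the choice of gallon has to be stated.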

Just wondering if the international comparisons of fuel consumption take into account that some countries (eg the UK) use Imperial Gallons (=4.546 litres) rather than US Gallons (=3.785 liters). Of course, most countries use the inverted (and more sensible) lit/100km measure of fuel consumption rather than mpg (though Australia sometimes uses km/lit...)
Nice piece as ever, though!

TC: You're right, that's another hurdle when dealing with this dataset. When I translated the data, I explicitly asked for US gallons. Most other countries, I believe, use L/100 km as you indicated.

If the takeaway from the chart is that Americans should quit their bitchin', then it's very effective and accurate. The list of quibbles is long but not of much import. This chart might also illustrate the strength in the Dollar as opposed to the price of gas. I could also point out that PPP makes the logic somewhat circular since energy prices are a big input to PPP computations. What is the price of gas adjusted for the difference in gas prices? We might also compare the cost of per capita passenger miles driven divided by median income. Etc. etc. I would just leave the Wapo chart as it is.

" A quick hop to the BTS website tells us that the average mpg of American cars is a third of that assumption."

For some reason, the Bureau of Transportation Statistics uses km/liter instead of the American standard miles/gallon. A bit of factor conversions shows that 16 km/liter (for passenger cars) is the equivalent of 38 miles/gallon.

CC: Thanks for pointing that out. It goes to show that you can't make any assumptions when looking at data! It's hard to believe that they used km/liter on that report. I'm not using 16 km/liter as that is for *new* cars. For average cars, it's about 10 km/liter which is about 23 mpg.

Why bother with the transform into "miles" if you're going to assume a spherical car at 32mpg? What benefit does that provide over just comparing the price per gallon?

One further possible improvement is to produce a table that takes median income into consideration: e.g. how long does the median worker have to work to drive the (locally) median vehicle paying the local fuel price? One could even take cost-of-living levels into consideration, and work with disposable income.

We have had this debate here in Sweden as well, and I have seen graphs that show that while fuel prices have increased, so have salaries, while fuel consumption has decreased: the net effect is that very little has happened when one takes all of these into consideration. But in an election year...
