How to describe really small chances

Reader Aleksander B. sent me to the following chart in the Daily Mail, with the note that "the usage of area/bubble chart in combination with bar alignment is not very useful." (link)

[Image: Dailymail-image-a-35_1431545452562]

One can't argue with that statement. This chart fails the self-sufficiency test: anyone reading the chart is reading the data printed in the right column, and gains nothing from the visual elements (thus, the visual representation is not self-sufficient). As a quick check, the risk for "motorcycle" should be about 30 times that of "car", and the risk for "car" should be about 100 times that of "airplane". The risk of riding motorcycles is then roughly 3,000 times that of flying in an airplane.

The chart does not appear to be sized properly as a bubble chart:

[Image: Dailymail_travelrisk_bubble]

You'll notice that the visible proportion of the "car" bubble is much larger than that of the "motorcycle" bubble, which is one part of the problem.
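For reference, a properly sized bubble chart encodes the data in the area of each bubble, so the radius should grow with the square root of the value. The following is a minimal sketch of that rule, using only the approximate ratios from the quick check earlier (airplane = 1, car = 100, motorcycle = 3,000). The function name `bubble_radii` and the unit radius for the smallest bubble are assumptions made for illustration, not anything taken from the Daily Mail chart.

```python
import math

# Approximate relative risks, taken from the ratios quoted in the post:
# car is about 100x airplane, motorcycle is about 30x car.
relative_risk = {"airplane": 1, "car": 100, "motorcycle": 3000}


def bubble_radii(values, smallest_radius=1.0):
    """Return radii such that bubble AREAS are proportional to the values.

    Hypothetical helper: the smallest value gets `smallest_radius`, and
    every other radius scales with the square root of the value ratio.
    """
    base = min(values.values())
    return {
        name: smallest_radius * math.sqrt(value / base)
        for name, value in values.items()
    }


radii = bubble_radii(relative_risk)
for mode, radius in radii.items():
    print(f"{mode}: radius {radius:.1f}")
```

Under these assumed ratios, the car bubble would be 10 times as wide as the airplane bubble, and the motorcycle bubble about 55 times as wide, which gives a sense of how hard it is to show a 3,000-fold range on one chart.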

Nor is it sized as a bar chart:

[Image: Dailymail_travelrisk_bar]

As a bar chart, both the widths and the heights of the bars vary; and the last row presents a further challenge as the bubble for the airplane does not touch the baseline.

***

Besides the Visual issues, the Data issues are also quite hard to deal with. This is how Aleksander describes it: "as a reader I don't want to calculate all my travel distances and then do more math to compare different ways of traveling."

The reader wants to make smarter decisions about travel based on the data provided here. Aleksander proposes one such problem:

In terms of probability it is also easier to understand: "I am sitting in my car in strong traffic. At the end in 1 hour I will make only 10 miles so what's the probability that I will die? Is it higher or lower than 1 hour in Amtrak train?"

The underlying choice is between driving and taking Amtrak for a particular trip. This comparison is relevant because those two modes of transport are substitutes for this trip. 

One Data issue with the chart is that riding a motorcycle and flying in a plane are rarely substitutes. 

***

A way out is to do the math on behalf of your reader. The metric of deaths per 1 billion passenger-miles is not intuitive for a casual reader. A more relevant question is: what is my chance of dying given the amount of driving (or flying) I do in a year? Because that chance is tiny, it is easier to express the risk as the number of years of travel before I would expect to see one death.

Let's assume someone drives 300 days per year and 100 miles per day, so that each year this driver contributes 30,000 passenger-miles to the U.S. total (which is 3.2 trillion). We convert 7.3 deaths per 1 billion passenger-miles to 1 death per 137 million passenger-miles. Since this driver does 30,000 passenger-miles per year, it will take (137 million / 30,000) = about 4,500 years to see one death on average. This calculation assumes that the driver drives alone. It's straightforward to adjust the estimate if the average occupancy is higher than 1.
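The driving arithmetic, including the occupancy adjustment, can be sketched in a few lines. This is only an illustration of the calculation described above: the function name and the `occupancy` parameter are made up for this sketch, and the occupancy of 2 in the second call is a hypothetical value, not a figure from the chart.

```python
def years_until_one_expected_death(deaths_per_billion_pm, miles_per_year, occupancy=1.0):
    """Years of travel before one death is expected, on average.

    deaths_per_billion_pm: deaths per 1 billion passenger-miles
    miles_per_year: miles the vehicle travels in a year
    occupancy: average number of people in the vehicle (assumed parameter)
    """
    # Each mile driven counts once for every person in the vehicle.
    passenger_miles_per_year = miles_per_year * occupancy
    # Invert the rate: passenger-miles per one expected death.
    passenger_miles_per_death = 1e9 / deaths_per_billion_pm
    return passenger_miles_per_death / passenger_miles_per_year


miles_per_year = 300 * 100  # 300 days of driving at 100 miles per day

solo_years = years_until_one_expected_death(7.3, miles_per_year)
print(f"driving alone: about {solo_years:,.0f} years")

# Hypothetical occupancy of 2, to show the direction of the adjustment.
pair_years = years_until_one_expected_death(7.3, miles_per_year, occupancy=2.0)
print(f"two occupants: about {pair_years:,.0f} years")
```

Doubling the occupancy doubles the passenger-miles contributed each year, so the expected wait until one death among the occupants is cut in half.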

Now, let's consider someone who takes one round trip by air each month (an outbound flight plus a return flight). We assume that each plane carries on average 100 passengers (including our protagonist), and each flight covers on average 1,000 miles. Then each of these flights contributes 100,000 passenger-miles. In a year, the 24 flights contribute 2.4 million passenger-miles. The risk of flying is listed at 0.07 deaths per 1 billion passenger-miles, which we convert to 1 death per 14 billion passenger-miles. On this flight schedule, it will take (14 billion / 2.4 million) = almost 6,000 years to see one death on average.
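The same conversion for the flying schedule, as a rough sketch. The variable names are invented for illustration, and the inputs are the assumptions stated above (24 flights a year, 100 passengers per plane, 1,000 miles per flight) together with the listed rate of 0.07 deaths per 1 billion passenger-miles.

```python
# Assumptions from the text, not measured data.
flights_per_year = 12 * 2        # one round trip a month = 24 flights
passengers_per_flight = 100      # includes our protagonist
miles_per_flight = 1_000
deaths_per_billion_pm = 0.07     # rate listed on the chart for airplanes

# Passenger-miles contributed by these flights in one year.
passenger_miles_per_year = flights_per_year * passengers_per_flight * miles_per_flight

# Invert the rate: passenger-miles per one expected death.
passenger_miles_per_death = 1e9 / deaths_per_billion_pm

flying_years = passenger_miles_per_death / passenger_miles_per_year

print(f"passenger-miles per year: {passenger_miles_per_year:,}")
print(f"years until one expected death: about {flying_years:,.0f}")
```

Without rounding 1/0.07 to 14 billion first, the result comes out slightly above the hand calculation but still a little under 6,000 years.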

For the average person on those travel schedules, there is nothing to worry about. 

***

Comparing driving and flying is only valid for those trips in which you have a choice. So a proper comparison requires breaking down the average risks into components (e.g. focusing on shorter trips). 

The above calculation also suggests that the risk is not evenly spread throughout the population, despite the use of an overall average. A trucker who is on the road every work day is clearly subject to higher risk than an occasional driver who makes a few trips in rental cars each year.

There is a further important point to note about flight risk, due to MIT professor Arnold Barnett. He has long criticized the use of deaths per billion passenger-miles as a risk metric for flights. (In Chapter 5 of Numbers Rule Your World (link), I explain some of Arnie's research on flight risk.) The problem is that almost all fatal crashes involving planes happen soon after take-off or not long before landing, so the risk depends much more on the number of flights taken than on the miles flown.

 
