Climate change and duelling charts

Jul 11, 2013

Abhinav asks me to check out his blog post about a chart on global warming (I prefer the term climate change) featured on Wonkblog. The chart is sourced to a report by the World Meteorological Organization (link to PDF).

Hello, start the axis at zero whenever you are plotting columns. That's as fundamental as plotting only proportions on a pie chart.

There is a reason why the designer didn't want to start the axis at zero. It is this (Abhinav helpfully made all these charts):

The trouble is that for this data set (on global average temperature), the area below 13 is completely useless. It's like plotting body temperature on a scale of 0 - 100 Celsius when all feasible values fall into a tight range, maybe 35-38 Celsius. I recount a similar situation that led to a college president saying something stupid in Chapter 1 of my new book, Numbersense. (Information on the book is here.)

So we understand the desire to get rid of the irrelevant white space. This is accomplished by using a line chart. (I'd prefer to omit the data values, and rely on the axis.)

Abhinav then created various versions of this by compressing and expanding the vertical scales. I don't think there is anything wrong with the above scale. As I mentioned, the scale should focus on the range of values that are feasible.

Nice work, Abhinav.

A separate issue is that the zero on temperature scales is arbitrary (unless you're using Kelvin). So there's really no utility in zero-indexing.

Ryan, there is a hidden connection between the two issues, though, which is that temperature has an arbitrary zero because, like time, it naturally falls into a role as an interval quantity. No one setting out to create a scale of length or mass would have given them a zero at any other place than actual zero.

Strange how no one plotting a graph of years feels the need to show every year since 1 AD, unless the timescale really is that long.

I think one of the easiest ways to determine a valid range for a dataset is to use history as context.

What happens if the range over the last 1,000 years were used as the temperature range for the last 100 years? It should get you to a good magnitude.

In addition (or if history isn't available), use a range marked with points relevant to resulting or correlated events or conditions. To use the body temperature example: mark the points of sickness, unconsciousness, and death at varying body temperatures. Those points will steer you to a good range.
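To make the suggestion concrete, here is a minimal sketch of deriving axis limits from a set of reference values (a historical record, or landmark points such as the 35-38 Celsius body temperature range mentioned in the post). The function name axis_limits and the 5% padding are assumptions made for illustration, not anything the commenter specified.

```python
def axis_limits(reference_values, pad_fraction=0.05):
    """Return (low, high) axis limits spanning the reference values.

    reference_values: historical data or landmark points that define
    the feasible range. pad_fraction is an assumed margin so the
    extremes do not sit exactly on the edge of the plot.
    """
    lo, hi = min(reference_values), max(reference_values)
    if hi == lo:
        raise ValueError("reference values must span a non-zero range")
    pad = (hi - lo) * pad_fraction
    return lo - pad, hi + pad


# Body temperature example from the post: feasible values of roughly
# 35 to 38 Celsius, rather than a 0 to 100 scale.
body_low, body_high = axis_limits([35.0, 38.0])
print(body_low, body_high)
```

The resulting limits can be handed to whatever plotting tool is in use; the point is only that the range comes from context, not from zero.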

As derek points out, if you are on an interval scale (rather than a ratio scale) you have no true zero, so starting at zero doesn't make sense.

Consider, for example, IQ (normed to a mean of 100, standard deviation of 15). [or SAT scores, which have similar norming] In these cases, starting the y axis at 0 is deceptive rather than a "best practice".

Meant to include this: that's why bar graphs should only be used for ratio scales, not interval scales.
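A small worked example may help show why ratios are not meaningful on an interval scale. The readings below are hypothetical, chosen only for easy arithmetic: the "twice as warm" ratio depends entirely on the unit, while ratios of differences do not.

```python
def c_to_k(c):
    """Convert Celsius to Kelvin (a ratio scale with a true zero)."""
    return c + 273.15


def c_to_f(c):
    """Convert Celsius to Fahrenheit (another interval scale)."""
    return c * 9 / 5 + 32


# Hypothetical readings in Celsius.
a_c, b_c, c_c = 10.0, 20.0, 30.0

# The ratio of two readings changes with the unit.
ratio_c = b_c / a_c                    # 2.0 in Celsius
ratio_f = c_to_f(b_c) / c_to_f(a_c)    # 68 / 50 = 1.36 in Fahrenheit
ratio_k = c_to_k(b_c) / c_to_k(a_c)    # about 1.035 in Kelvin

# Ratios of differences survive the change of unit, which is what
# makes the scale an interval scale.
diff_ratio_c = (c_c - b_c) / (b_c - a_c)
diff_ratio_f = (c_to_f(c_c) - c_to_f(b_c)) / (c_to_f(b_c) - c_to_f(a_c))

print(ratio_c, ratio_f, ratio_k)
print(diff_ratio_c, diff_ratio_f)
```

A bar's length invites the reader to compare ratios, so only the Kelvin version would support a bar chart in the sense described here.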

I know the convention but it seems pretty arbitrary to force bar charts to have a zero point but not also line charts.

@Mike -
It may seem it at first glance, but it is not arbitrary at all.

The entire point is that a bar chart displays data specifically by its length, which represents the entirety of the data that makes up that point, and therefore requires a 0 base to accurately show the measure it represents.

Showing two bars side by side, where the first bar is twice as long as the other, tells the viewer that the first measure is twice as much as the other...

If you aren't starting from 0, then this impression is wrong, and the chart misleading.

So, if you are making a chart where a 0 base is not appropriate, then neither is a bar chart.
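The argument above can be checked with a little arithmetic. This is a sketch with hypothetical values (not the figures from the WMO chart), and apparent_ratio is a name invented for the illustration: it compares the ratio of the drawn bar lengths when the axis starts at zero versus at 13.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of drawn bar lengths for values a and b when the
    axis starts at `baseline` instead of necessarily at zero."""
    if min(a, b) <= baseline:
        raise ValueError("both values must lie above the baseline")
    return (a - baseline) / (b - baseline)


# Hypothetical values for two bars.
cool, warm = 14.0, 14.5

ratio_from_zero = apparent_ratio(warm, cool)               # 14.5 / 14.0
ratio_truncated = apparent_ratio(warm, cool, baseline=13)  # 1.5 / 1.0

print(round(ratio_from_zero, 3))  # 1.036
print(ratio_truncated)            # 1.5
```

With the axis cut at 13, the second bar is drawn 50% longer even though the value is under 4% larger, which is the misleading impression being described.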

Because I grew up with the Celsius scale, I do associate significance with zero degrees. It's the freezing point. But this discussion is illuminating. For many quantities, the relevant reference range is the historical variance, which furthers the point that a bar chart is inappropriate whether it starts at zero or not.

The reason it does not start at zero is simply to exaggerate.
When most people look at the graph, they see the large distance between the tops of the bars and use that as their scale, which is why it is so misleading. Go to the following link if you want to see the actual history of Earth's temperature over 800,000 years: