The importance of explaining your chart: the case of the red 118

Reader Jim S. was rightfully mystified by the following map that appeared on the Ars Technica blog (link), and purported to demonstrate that high temperatures of March 2012 across most of the U.S. were of historical significance.


I must say the production values of this map, produced by the people at NOAA, are superb. I love, love, love the caption that the Ars Technica editors added to the map. I wish they had blown it up to 20-point font, and made it shiny :) Besides that, the colors are well-chosen, and it doesn't feel cluttered despite having 48 numbers printed on it.

Like Jim, I'm hypnotized by the drumbeat of 118, 118, 118, ... all over the red area. What could the numbers mean? They could be temperatures in Fahrenheit (although 118 degrees in March surely would have been newsworthy). The legend does lend support to this interpretation (see right), what with the extra-large font announcing "Temperature". Jim commented: "But it seems odd that such a large area would have precisely the same high."


Not so soon, Jim. The NOAA also made the chart shown on the right (link). So indeed, the entire country could be given one value of 118.

If not Fahrenheit, what could the numbers mean? They could be some kind of index in which case the average value would seem to be 50 (the white patch). That would be one strange index.

Too bad this map is produced by specialists for specialists, leaving us commoners guessing. The only clue we got is in the title, "Statewide Ranks".

But this isn't very helpful either. The 118s are still ringing in my ears. If the numbers are ranks, then 118 would likely be the maximum rank, given that there are so many 118s. But I can't figure out which metric has 118 levels.

I finally found my way to this page, which explains what NOAA calls "climatological ranking". The page also has a chart (below), which can serve as a sort of legend for the maps, but is almost as difficult to read.

Apparently there are 118 years' worth of recorded temperatures, going back to 1895. And within each state, the annual temperatures for the past 118 years were ranked from lowest to highest, meaning that 118 is the hottest on record.
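For readers who want the ranking spelled out, here is a minimal sketch in Python. The function name `climatological_ranks`, the tie-breaking rule, and the toy temperatures are all assumptions made for illustration; none of this is NOAA's code or data.

```python
def climatological_ranks(temps_by_year):
    """Rank one state's temperatures from coldest (1) to hottest (N).

    temps_by_year maps year -> temperature. Ties are broken by year
    order; the post does not say how NOAA treats ties, so this is an
    assumption.
    """
    ordered = sorted(temps_by_year, key=lambda y: (temps_by_year[y], y))
    return {year: rank for rank, year in enumerate(ordered, start=1)}


# 1895 through 2012 inclusive gives the 118 ranks seen on the map.
n_years = len(range(1895, 2012 + 1))

# Made-up temperatures for illustration only (not NOAA data).
toy = {2008: 41.0, 2009: 43.5, 2010: 42.2, 2011: 44.1, 2012: 50.3}
ranks = climatological_ranks(toy)
```

In the toy series the hottest year gets the highest rank, just as a record-hot state gets 118 on the map.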

Given that there is lopsided attention to hotter temperatures (global warming), it would be much better to reverse the ranking so that 1 is the hottest year!

The chart also explains that the years are grouped into three equal buckets to indicate "below normal", "near normal" and "above normal".
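The three-bucket grouping can be sketched as follows. Since 118 does not divide evenly by three, the exact cut points are an assumption here (the helper `tercile_label` is hypothetical, not something NOAA publishes); this version puts ranks 1 to 39 in the bottom bucket, 40 to 78 in the middle, and 79 to 118 in the top.

```python
def tercile_label(rank, n=118):
    """Place a rank into one of three roughly equal buckets.

    Integer arithmetic avoids floating-point edge cases. Where the
    boundaries fall is an assumption, not NOAA's documented rule.
    """
    if not 1 <= rank <= n:
        raise ValueError("rank must be between 1 and n")
    if 3 * rank <= n:
        return "below normal"
    if 3 * rank <= 2 * n:
        return "near normal"
    return "above normal"


# Count how many of the 118 ranks land in each bucket.
counts = {}
for r in range(1, 119):
    label = tercile_label(r)
    counts[label] = counts.get(label, 0) + 1
```

Printing a table like this next to the map would answer the question of which ranks belong to which label.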

Too bad this chart gives us three or five levels of ranking while in the map they use seven colors (levels).

They really ought to include on the map (a) the definition of the ranking and (b) the range of ranks corresponding to each color.


While researching this post, I found this wonderful page of NOAA maps (link). This is a beautiful illustration of the process of statistical aggregation. Notice the trade-off between simplicity and loss of information. The art in statistics is to figure out the right balance between the two.



I always like to explore doing away with the unofficial rule that says spatial data must be plotted on maps. Conceptually I'd like to see the following heatmap, where a concentration of red cells at the top of the chart would indicate extraordinarily hot temperatures across the states.


I couldn't make this chart because the NOAA website has this insane interface where I can only grab the rank for one state and one year at a time. But you get the gist of the concept.
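For anyone who does manage to collect the ranks, the grid behind such a heatmap could be assembled roughly like this. The layout (one row per year with the latest year on top, one column per state) is an assumption about the chart, and the helper `rank_grid`, the state names, and the numbers are invented for illustration.

```python
def rank_grid(ranks, states, years):
    """Build heatmap rows: one row per year (latest first), one column per state.

    ranks maps state -> {year: rank}. Each cell holds the rank that a
    plotting tool would then translate into a color.
    """
    return [
        [ranks[state][year] for state in states]
        for year in sorted(years, reverse=True)
    ]


# Hypothetical ranks for illustration only (not NOAA data).
toy_ranks = {
    "State A": {2010: 60, 2011: 75, 2012: 118},
    "State B": {2010: 40, 2011: 90, 2012: 118},
}
grid = rank_grid(toy_ranks, ["State A", "State B"], [2010, 2011, 2012])
```

With this layout, a top row full of 118s is the "concentration of red cells at the top" described above.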


Did I tell you I love, love, love the caption? Go right ahead, and make a slogan for your chart today!


 [PS: Reader Mark Bulling (see his comment below) contributes a realization of my heatmap suggestion above. One of the benefits of this chart is its economy, as a small version of it shows:





I think the 118 is for the number of years that records have been kept, so 118 reflects the high for all of the years, and 1 would be the lowest for all the years, I am guessing. But of course you shouldn't have to guess.

Also good point about aggregation and the loss of information.


Thanks for the comment. It made me realize I said 1 is the hottest "month", by which I meant hottest "year" or year with the hottest March. I've fixed that now.

Mark Bulling

Nice post, I took your heatmap suggestion and scraped the data to do it. Full post and heatmap here:


Mark: Thanks so much for completing the redo. I've included it in my post above. For your second chart, I suggest that you standardize the temperature data by state first. Then you'd find a lot of red on the right.

John Holcomb

NOAA creates some of the worst graphics in science. It's rather depressing as they also have some of the richest datasets. Have you seen this abomination?

