An overused chart, why it fails, and how to fix it
Apr 17, 2014
Reader and tipster Chris P. found this "death spiral" chart dizzying (link).
It's one of those charts that has conceptual appeal but does not do the data justice. As the name implies, the designer has a strong message: Arctic sea ice volume has declined dramatically over time. This message is there in the chart, but the reader has to work hard to find it.
Why doesn't this spider chart work? We can be more precise.
- A big problem is the lack of scalability. This chart looks different every year. If you add an extra year to the chart, you either have to increase the density of the years or you have to drop the earliest year.
- Years are not circular or periodic so the metaphor doesn't quite work.
- This chart type requires way too many gridlines.
- Axis labeling is also awkward. Because of the polar coordinates, the axes are radiating so the numbers run up toward the top but run down toward the bottom.
- This specific instance of a spider chart benefits from well-behaved data: the between-year variability is much lower than the within-year variability. As a result, the lines don't cross each other much. If the values fluctuated a lot from year to year, we would see a bunch of noodles.
This is a pity because the designer did very well in aligning two corners of the Trifecta Checkup, namely what is the question and what does the data show? It is a great idea to control for month of year, and look at year to year changes. (A more typical view would be to look at month to month changes and plot one line per year.)
This is an example of a chart that does well on one side of the checkup (the question and the data) but fails because the graph isn't in tune with the data or the question being addressed.
Whenever I see a spider chart, I want to unroll the spiral and see if a line chart is better. Thus:
The dramatic decrease in Arctic ice volume (no matter the month) is clear as day. You can actually read off the magnitude of the drop. (Try doing that in the spider chart, say between 1978 and 1995.)
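As a rough sketch of the "unrolling" step, assuming the data arrives as (year, month, volume) records: each month becomes its own line, with year on the horizontal axis. The records below are made-up placeholder values, not the actual ice volume data, and `unroll` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy (year, month, volume) records; placeholder numbers, not the real data.
records = [
    (1980, 4, 30.0), (1980, 9, 15.0),
    (1995, 4, 25.0), (1995, 9, 10.0),
    (2010, 4, 20.0), (2010, 9, 5.0),
]

def unroll(records):
    """Group records into one series per month, each sorted by year."""
    series = defaultdict(list)
    for year, month, volume in records:
        series[month].append((year, volume))
    return {month: sorted(points) for month, points in series.items()}

lines = unroll(records)
for month, points in sorted(lines.items()):
    first, last = points[0], points[-1]
    # The size of the drop is simply the last value minus the first.
    print(f"month {month}: {first[0]} -> {last[0]}, change {last[1] - first[1]:+.1f}")
```

Each entry of `lines` can be handed to any plotting tool as one line of the chart.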
This chart still has issues, namely too many colors. One can color the lines by season of the year, like this:
Or one can switch to a small-multiples setup with three lines per chart and one chart per season.
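A minimal sketch of the season grouping, assuming the quarter-based scheme the author describes in the comments (months 1-3 = Spring, 4-6 = Summer, and so on). The labels for months 7-9 and 10-12 are filled in here as an assumption, and `season_of` is a hypothetical helper name.

```python
def season_of(month):
    """Map a month number (1-12) to a quarter-based season label."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be 1-12, got {month}")
    # "Fall" and "Winter" are assumed labels for the last two quarters.
    labels = ["Spring", "Summer", "Fall", "Winter"]
    return labels[(month - 1) // 3]

# Collect the months that share a color (or a panel in small multiples).
by_season = {}
for m in range(1, 13):
    by_season.setdefault(season_of(m), []).append(m)
print(by_season)
```

Changing the grouping only means changing the mapping inside `season_of`; the rest of the chart logic stays the same.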
The seasonal arrangement is not arbitrary. You can see the effect of season by looking at side-by-side boxplots:
The pattern is UP-DOWN-DOWN-UP.
In fact, a side-by-side boxplot of the data provides a very informative look:
The monthly series is obscured in this view, built into the vertical variability, which we can see is quite stable. The idea of controlling for month is to make it irrelevant. This view emphasizes the year on year decline of the entire distribution.
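To make the boxplot view concrete, here is a small sketch that computes the numbers a basic boxplot draws for each year. The monthly values are toy placeholders (the second year is just the first shifted down by 10), chosen so that the spread stays the same while the whole distribution moves down.

```python
from statistics import quantiles

# Toy monthly volumes per year; placeholder numbers, not the real data.
monthly = {
    1980: [30, 32, 33, 33, 31, 27, 21, 17, 16, 18, 22, 26],
    2010: [20, 22, 23, 23, 21, 17, 11, 7, 6, 8, 12, 16],
}

def box_stats(values):
    """Return (min, Q1, median, Q3, max) for one year's monthly values."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

for year in sorted(monthly):
    lo, q1, med, q3, hi = box_stats(monthly[year])
    # Month order is discarded: only the distribution within the year remains.
    print(year, "median", med, "IQR", q3 - q1)
```

The month labels never enter `box_stats`, which is the sense in which this view makes month irrelevant.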
If you're worried about dropping too much information, the data can be grouped by season as before in a small-multiples setup like this:
Regardless of season, the trend is down.
PS. Alberto reminds me of his post about one example of a spider chart (radar chart) that works. Here's the link. It works because the graphical element is more in tune with the data. While the ice cap data has a linear trend over time, the voting data is all about differences in distribution. Also, the designer is expecting readers to care about the high-level pattern, not about the specifics.
At first glance, I thought I was looking at an overhead view of the North (or South) Pole, which I think actually could be an interesting way to drop a variation of this data on a map. If each of the lines represented a year, you could see how much the ice cap was receding over the years. Although, I don't know how many feet/miles that would represent and if the difference would be noticeable on an actual map, but if so - much more interesting/valuable.
Sticking to the data actually used/represented by the spider chart, any of your re-renderings are vast improvements!
Posted by: Andy Chandarana | Apr 17, 2014 at 12:36 PM
Of course a regular line plot is much, much better. I would choose to plot only April and September, the months with largest and smallest ice volume. The fluctuation within a year is not as relevant, and it seems quite consistent from year to year.
One criticism: why are there no units on your vertical axis? That's graphing 101! ;)
It seems also you mixed up the months when grouping them by season. A minor detail, but could cause confusion if someone reposts your graphs.
Posted by: Cris | Apr 17, 2014 at 01:59 PM
Cris: For season, I just did months 1-3 = Spring, 4-6 = Summer, etc. Can you be more specific? What is being mixed up?
Speaking of which, the raw data came with days numbered from 1 to 365. Does anyone know how leap years are handled?
Posted by: Kaiser | Apr 17, 2014 at 03:30 PM
Would you recommend using a seasonally adjusted series or not? They're not terribly complicated to create; do they add too much of a sophisticated statistical feel that turns off more casual readers (in the same way that some people object to seasonally adjusted employment numbers)?
I've seen the multi series style used many a time, and it always seems cluttered to me. And while I'm a fan of boxplots, they don't reveal the underlying pattern within the year - all the lowest points in the second boxplot chart correspond to summer and all the highest points to winter, but in that single display, that gets lost.
Posted by: Adam Schwartz | Apr 21, 2014 at 09:29 PM
Adam: In a way, the charts above perform a similar function to seasonal adjustment. Applying a formula makes it more precise. I totally support seasonal adjustment (I have a whole chapter on employment statistics in Numbersense!) The objection to seasonal adjustment stems from not understanding it.
As for boxplot versus line chart, Andrew Gelman, if he sees this, will say: have both, and make it clickable between the two.
In this dataset, it's clear to me the signal is in the year-to-year changes, and the seasonal month-to-month pattern is just noise. I like to keep readers focused on the key message. If you insist, I'd create a second chart that has the average month-to-month seasonality to convey that information (average the years so as to focus the second chart on seasonality).
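As a sketch of what "average the years" could look like, the snippet below computes an average seasonal profile and then subtracts it from each year. This is a simple additive adjustment chosen for illustration, not the procedure behind official seasonally adjusted statistics; the data values and function names are placeholders.

```python
from statistics import mean

# Toy data: volumes[year][i] is the value for month i + 1 (placeholders only).
volumes = {
    1980: [30, 32, 33, 33, 31, 27, 21, 17, 16, 18, 22, 26],
    2010: [20, 22, 23, 23, 21, 17, 11, 7, 6, 8, 12, 16],
}

def seasonal_profile(volumes):
    """Average each calendar month across years, centered on the overall mean."""
    years = list(volumes.values())
    monthly_mean = [mean(y[m] for y in years) for m in range(12)]
    overall = mean(monthly_mean)
    return [v - overall for v in monthly_mean]

def seasonally_adjust(volumes):
    """Subtract the average seasonal profile from every year's monthly values."""
    profile = seasonal_profile(volumes)
    return {yr: [v - s for v, s in zip(vals, profile)]
            for yr, vals in volumes.items()}

profile = seasonal_profile(volumes)    # what the second chart would plot
adjusted = seasonally_adjust(volumes)  # what is left once seasonality is removed
print([round(p, 1) for p in profile])
```

Plotting `profile` alone gives the seasonality chart; `adjusted` carries only the year-to-year signal.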
Posted by: Kaiser | Apr 22, 2014 at 12:00 AM
Posting this comment from Robert Simmon who had trouble with the commenting system:
Mapping Arctic sea ice over time doesn't work to show the trend because the shape of the ice cap changes, as well as the size: http://earthobservatory.nasa.gov/Features/WorldOfChange/sea_ice.php
In addition to a seasonal cycle plot, the Earth Observatory (the NASA site I design for) has been comparing the most recent few years' data with the long-term mean plus standard deviations:
This gives both a sense of the normal seasonal changes, and how unusual the last few years were.
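A minimal sketch of that kind of comparison, assuming a band of the long-term mean plus or minus two standard deviations (the multiplier `k=2` is an assumption; the comment does not say how many standard deviations are drawn). The values are toy placeholders for a single time of year.

```python
from statistics import mean, stdev

# Toy values for one time of year across baseline years (placeholders only).
baseline = [10, 11, 9, 10, 12, 8]
recent_value = 5  # a hypothetical recent year

def band(values, k=2.0):
    """Return (mean - k*sd, mean, mean + k*sd) using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return m - k * s, m, m + k * s

low, mid, high = band(baseline)
# A recent value outside the band stands out against the normal range.
print(f"band {low:.2f} to {high:.2f}, recent {recent_value}, unusual: {recent_value < low}")
```

Repeating this for every day or month of the year gives the shaded band, with the recent years drawn as lines on top.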
Typically leap years run from 1 to 366 in a day of year system. So March 1 is usually 060, but it's 061 on leap years. However, the sea ice extent data from NSIDC—which should be the official data—uses YYYY MM DD i.e.:
Year, Month, Day, Extent, Missing, Source Data
YYYY, MM, DD, 10^6 sq km, 10^6 sq km
1978, 10, 26, 10.19591, 0.00000
1978, 10, 28, 10.34363, 0.00000
1978, 10, 30, 10.46621, 0.00000
1978, 11, 01, 10.65538, 0.00000,
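For illustration, Python's standard `datetime` module shows the day-of-year shift in leap years and can parse rows in the YYYY, MM, DD layout above. `day_of_year` and `parse_row` are hypothetical helper names, and the parser assumes only the five columns visible in the sample rows.

```python
from datetime import date

def day_of_year(year, month, day):
    """Day number within the year: 1-365, or 1-366 in leap years."""
    return date(year, month, day).timetuple().tm_yday

def parse_row(line):
    """Parse 'YYYY, MM, DD, extent, missing' into (date, extent, missing)."""
    # Empty fields (for example from a trailing comma) are skipped.
    fields = [f.strip() for f in line.split(",") if f.strip()]
    year, month, day = (int(f) for f in fields[:3])
    return date(year, month, day), float(fields[3]), float(fields[4])

print(day_of_year(2013, 3, 1), day_of_year(2012, 3, 1))  # 60 61
when, extent, missing = parse_row("1978, 10, 26, 10.19591, 0.00000")
print(when.isoformat(), extent, when.timetuple().tm_yday)
```

With real calendar dates in hand, the leap-year question goes away: the day number can always be derived from the date when needed.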
Posted by: Robert Simmon (posted by Kaiser) | Apr 22, 2014 at 12:10 AM
Posting this comment for Xan Gregg who also had trouble with the commenting system:
Very nice. I think radar charts are worse than most realize. For instance, a dip in a cartesian chart can turn into a straight line in a radar chart.
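A small worked example of that last point, assuming a radar chart with 12 evenly spaced spokes (30 degrees apart) and values plotted as distance from the center. If the middle of three neighboring values dips to r·cos(30°), about 13% below its neighbors, the three plotted points fall on one straight line. The helper names are hypothetical.

```python
import math

def to_xy(radius, angle_deg):
    """Convert a radar-chart point (value on a spoke) to screen coordinates."""
    a = math.radians(angle_deg)
    return radius * math.cos(a), radius * math.sin(a)

def is_collinear(p, q, r, tol=1e-9):
    """True when the cross product of (q - p) and (r - p) is close to zero."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) < tol

step = 30  # degrees between neighboring spokes when there are 12 spokes
values = [10.0, 10.0 * math.cos(math.radians(step)), 10.0]  # middle value dips
points = [to_xy(v, i * step) for i, v in enumerate(values)]

print(round(values[1], 2), is_collinear(*points))
```

Conversely, three equal values do not form a straight line on the radar chart, so flat data looks bent and a dip can look flat.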
As I commented on Alberto's post that you linked to, the paper Graphical Tests for Power Comparison of Competing Designs (http://users.soe.ucsc.edu/~pang/visweek/2012/infovis/papers/hofmann.pdf) by Heike Hofmann et al. may be of interest. One of their test cases shows that circular charts underperform even when the data is naturally circular (wind direction).
Also: Rob Simmon's twitter poll on similar data: https://twitter.com/rsimmon/status/380347039382388736
And a try I made then: https://twitter.com/xanjmp/status/380708097590697984
Posted by: Xan Gregg (posted by Kaiser) | Apr 22, 2014 at 12:12 AM