Further views of unemployment
Aug 24, 2010
Rather than looking at unemployment rates across the 50 states plus D.C., we can look at patterns in the ranking of the states. Such rankings are most effectively visualized as multi-period bumps charts.
Funny thing... the variations in rankings over time are very severe! So much so that if all 50 states are plotted on the same chart, we get a completely entangled mess.
Here, I restricted the time period to 2000 onwards, and only the January unemployment rate for each year is plotted. Otherwise, the mess gets messier still.
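For anyone who wants to try the ranking step, here is a minimal sketch in Python. It is not the code behind the charts in this post, and the rates are made-up numbers for four hypothetical states, used only to show the mechanics. Rank 1 is assigned to the lowest unemployment rate, and the alphabetical tie-break is an arbitrary assumption.

```python
# Made-up January unemployment rates (percent) for four hypothetical states.
# These are NOT real BLS figures; they only illustrate the ranking step.
rates = {
    2008: {"A": 4.9, "B": 3.1, "C": 7.2, "D": 5.5},
    2009: {"A": 8.0, "B": 4.2, "C": 11.4, "D": 7.6},
    2010: {"A": 9.7, "B": 4.1, "C": 14.0, "D": 10.2},
}


def rank_states(year_rates):
    """Rank states within one year; rank 1 = lowest unemployment rate.

    Ties are broken alphabetically by state name (an arbitrary choice).
    """
    ordered = sorted(year_rates, key=lambda s: (year_rates[s], s))
    return {state: pos for pos, state in enumerate(ordered, start=1)}


# One ranking per year; each state's sequence of ranks is one line on a bumps chart.
ranks = {year: rank_states(r) for year, r in rates.items()}

for year in sorted(ranks):
    print(year, ranks[year])
```

Because the ranking is redone separately for each year, a state's line on the chart can move even when its own rate barely changes, simply because other states moved around it.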
But don't give up! The value of such a chart instantly appreciates just by adding a color, as in the following chart, which sets the Western states apart from the rest of the union:
In these charts, the worst ranks (higher unemployment) are placed higher. We see that Utah has climbed down the rankings during the recession, indicating that its employment situation has improved relative to other states. On the other hand, California has been a laggard pretty much the entire decade -- while its current rank is bad, it isn't that much worse than earlier in the decade.
It doesn't really matter which chart type one uses; it is a certainty that the designer must make choices as to which data to expose. Instead of plotting every state, here is a manageable chart that takes 10 randomly chosen states, comparing the trajectories of their unemployment rankings in the last decade:
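The post does not say how the 10 states were drawn, so the following is only one way to do it: a short Python sketch that takes a reproducible random sample from the 50 states plus D.C. The function name `pick_states` and the seed value are assumptions made for illustration.

```python
import random

# Postal abbreviations for the 50 states plus D.C.
STATES = [
    "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FL", "GA",
    "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD", "MA",
    "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY",
    "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC", "SD", "TN", "TX",
    "UT", "VT", "VA", "WA", "WV", "WI", "WY",
]


def pick_states(n=10, seed=2010):
    """Return n distinct states chosen at random, sorted for readability.

    The seed is a hypothetical value; fixing it makes the draw repeatable.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(STATES, n))


chosen = pick_states()
print(chosen)
```

Fixing the seed means the same 10 states come back every time, which helps when the chart has to be regenerated after a data update.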
What do we see here? Little North Dakota has been a star throughout most of this decade. Michigan has rapidly declined and has lingered at the back of the pack for three straight years. Florida experienced big ups and downs, with Alabama following a very similar trajectory. Poor Mississippi has been behind throughout the decade.
I love it when I write a post, and the chart designer pops in and provides his/her point of view. That's one of the things that keep me going. I appreciate the very substantive comments on my last post, and will respond soon with further comments. Thanks for reading!
I'm curious: what tool did you use to create these graphs?
Posted by: Tom Hopper | Aug 24, 2010 at 11:10 AM
Since ranks are so variable, unlike the underlying data, why use them?
Posted by: Hadley Wickham | Aug 24, 2010 at 12:03 PM
Completely agree with Hadley; ranks may show differences and/or changes where actually almost nothing has happened, whereas the actual values would emphasize the "significant" changes.
My standard question applies: Can we get the data somewhere easily?
Posted by: Martin | Aug 24, 2010 at 01:14 PM
I second Tom Hopper: What tool did you use?
Posted by: kerokan | Aug 24, 2010 at 05:15 PM
There's no scale!
Posted by: Rosie Redfield | Aug 24, 2010 at 05:30 PM
Agreed that ranks are often not useful here as they distort a continuous measure, enforcing a constant distance on the axis between states even when the actual difference in the value may be negligible.
e.g. this figure seems to show a precipitate drop in infant birth survival in the US:
But what actually happened was that mortality rates improved slightly (from an already very low value) in the US but much more elsewhere over the same period (e.g. Singapore). The rankings on infant mortality in developed countries are pretty meaningless, as all the actual values are close to zero. I showed this NY Times figure in a seminar once, along with a GapMinder chart showing the actual data. Rather than some Western states getting worse, as implied by the NY Times chart, what actually happened was that many nations converged over time to their similar, very low, mortality levels.
So rankings can introduce substantial apparent variability to a dataset when none actually exists. Similarly, they can hide significant differences, as the gap between 1st and 2nd may be huge, while the gap between 21st and 22nd, say, may be negligible.
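A small Python sketch with made-up numbers may make both effects concrete. The five entities P to T and their values are hypothetical, not taken from the infant mortality or unemployment data. Four of them sit within a few hundredths of each other, so tiny changes reshuffle their ranks, while the one large gap shows up as a single rank step.

```python
def ranks(values):
    """Rank names by value, 1 = lowest; ties broken alphabetically."""
    ordered = sorted(values, key=lambda k: (values[k], k))
    return {name: i for i, name in enumerate(ordered, start=1)}


# Hypothetical values in two periods (not real data).
before = {"P": 0.50, "Q": 0.52, "R": 0.54, "S": 0.56, "T": 3.00}
after = {"P": 0.55, "Q": 0.51, "R": 0.49, "S": 0.53, "T": 2.90}

ranks_before = ranks(before)
ranks_after = ranks(after)

# Change in value next to change in rank for each entity.
for name in sorted(before):
    change = round(after[name] - before[name], 2)
    print(name, change, ranks_before[name], "->", ranks_after[name])

# Gaps in value between neighbours in the later ranking.
sorted_after = sorted(after.values())
gaps = [round(hi - lo, 2) for lo, hi in zip(sorted_after, sorted_after[1:])]
print(gaps)
```

In this toy example P slips from 1st to 4th on a change of 0.05, while the step from 4th to 5th covers a gap more than a hundred times larger than the steps above it.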
Posted by: Michael MacAskill | Aug 24, 2010 at 09:23 PM
Tom and Kerokan: The magic of R, which allows you to control pretty much every aspect of the chart.
Martin: the data is publicly available from the BLS website (Bureau of Labor Statistics)
Posted by: Kaiser | Aug 24, 2010 at 10:49 PM
Hadley, Michael, others: I'd encourage you to take a look at the underlying data before throwing out ranks as a measure. Ranks are particularly interesting in the context of the 50 states since the data is structured for such analysis. It is very natural for one state to compare itself against some other state. To me, the relevant question is not "why use ranks?" but "why hasn't anyone looked at ranks?"
Ranks and rates measure fundamentally different things. Ranks measure a state's performance relative to other states while rates measure the absolute performance for each state.
For instance, I have plotted here the rank and rate changes over time for North Dakota. In 2000, ND experienced its lowest unemployment in the decade and yet relative to other states, ND had its worst rank. By the late 2000s, ND experienced the worst unemployment the state has seen in a decade but relative to other states, it has weathered the recession well and is now ranked first. If we only looked at rates, this information would be well hidden.
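The same pattern can be reproduced with a toy example. The Python sketch below uses three hypothetical states and made-up rates, not the North Dakota figures: state X's rate gets worse between the two periods, yet its rank improves because the other states deteriorate by more.

```python
# Made-up unemployment rates (percent) for three hypothetical states.
early = {"X": 3.0, "Y": 2.5, "Z": 2.8}
late = {"X": 4.3, "Y": 9.0, "Z": 7.5}


def rank_of(state, rates):
    """Rank of one state: 1 + the number of states with a strictly lower rate."""
    return 1 + sum(1 for other in rates.values() if other < rates[state])


rate_change = round(late["X"] - early["X"], 1)
print("X rate:", early["X"], "->", late["X"], "change", rate_change)
print("X rank:", rank_of("X", early), "->", rank_of("X", late))
```

The rate says X is worse off than it was; the rank says X is doing better than its peers. Both statements hold at once, which is why the two measures answer different questions.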
That said, Michael's point is well taken, and thanks for the thoughtful comment. Ranks often obscure information, especially if the differences are immaterial. But there is a large literature on using ranks, and that's because in some cases, they are illuminating.
Posted by: Kaiser | Aug 24, 2010 at 11:40 PM
Hi Kaiser, perhaps we can converge on a consensus...
As you say, ranks can indeed be a valuable way of assessing performance. But the impression they give can be distorting, so a good guideline would be to generally accompany such a figure with one showing the actual values as well. This depends a little bit on the data being analysed: in child mortality, countries converge on a very low value, so ranks become meaningless. In unemployment, values rise and fall over time for each state and so rankings can be more meaningful. But one can't be sure without seeing the data from both perspectives, e.g. the difference between 40th and 50th ranking could be just 0.1%, but that between 1st and 11th could be 2%.
The North Dakota example nicely shows that the picture each gives can be wildly different. What I think would be ideal is to produce a figure containing all 50 states, with actual unemployment values. With both forms of the figure available to compare and contrast, then the conversation we'd be having would be a more interesting one about the actual data : )
Posted by: Michael MacAskill | Aug 25, 2010 at 05:03 PM
Small multiples are another approach, somewhat different from the maps of the previous post or the rankings here.
In this example, each small multiple shows one state's series and the national series, with all other states drawn in gray in the background.
States are sorted from lowest unemployment to highest, so relative ranking is also apparent.
I think the problem with such small multiples is the cognitive effort required for a casual reader to decode the information, though anything that presents 50 states over time is going to have to cope with that as well.
Posted by: Charles Franklin | Aug 29, 2010 at 07:22 PM