Further views of unemployment

Instead of looking at unemployment rates across the 50 states plus D.C., we can look at how the states rank against one another over time. Such rankings are most effectively visualized as multi-period bumps charts.
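As a minimal sketch of the ranking step (written in Python for illustration, not the code behind the charts in this post), each period's rates can be converted into ranks, with rank 1 going to the lowest unemployment rate. The function name `rank_by_period`, the tie-breaking rule, and the sample numbers are all assumptions made for the example; the figures are made up and are not BLS data.

```python
def rank_by_period(rates):
    """Convert {period: {state: rate}} into {period: {state: rank}}.

    Rank 1 is the lowest unemployment rate. Ties are broken by state
    name so the result is deterministic (a simplifying assumption).
    """
    ranks = {}
    for period, by_state in rates.items():
        ordered = sorted(by_state, key=lambda s: (by_state[s], s))
        ranks[period] = {state: i + 1 for i, state in enumerate(ordered)}
    return ranks


# Hypothetical rates (percent), for illustration only; not BLS data.
sample_rates = {
    2000: {"ND": 3.0, "CA": 5.0, "MI": 3.5},
    2009: {"ND": 4.0, "CA": 10.0, "MI": 12.0},
}
sample_ranks = rank_by_period(sample_rates)
print(sample_ranks[2000])
print(sample_ranks[2009])
```

A bumps chart then draws one line per state through its rank in each period.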

Funny thing... the variations in rankings over time are very severe! So much so that if all the states are plotted on the same chart, we get a completely entangled mess.

[Chart: bumps chart of unemployment rankings, all states]

Here, I restricted the time period to 2000 onwards, and only the January unemployment rate for each year is plotted. Otherwise, the mess gets messier still.
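The thinning described here (one January reading per year, starting in 2000) could be sketched as follows. This is an illustrative Python sketch, not the code used for the chart; the function name `january_snapshots`, the input layout of (date, rate) pairs, and the sample values are assumptions.

```python
from datetime import date


def january_snapshots(series, start_year=2000):
    """Keep only January observations from start_year onwards.

    `series` is a list of (date, rate) pairs for one state.
    Returns {year: rate}.
    """
    return {
        d.year: rate
        for d, rate in series
        if d.month == 1 and d.year >= start_year
    }


# Hypothetical monthly observations, for illustration only; not BLS data.
monthly = [
    (date(1999, 1, 1), 4.1),
    (date(2000, 1, 1), 3.9),
    (date(2000, 7, 1), 4.0),
    (date(2001, 1, 1), 4.3),
]
snapshots = january_snapshots(monthly)
print(snapshots)
```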

But don't give up! The value of such a chart instantly appreciates just by adding a color, as in the following chart, which sets the Western states apart from the rest of the union:

[Chart: bumps chart of unemployment rankings, Western states highlighted in color]

In these charts, worse ranks (higher unemployment) are placed higher. We see that Utah has climbed down the rankings during the recession, indicating that its employment situation has improved relative to other states. On the other hand, California has been a laggard pretty much the entire decade -- while its current rank is bad, it isn't that much worse than earlier in the decade.

It doesn't really matter which chart type one uses; the designer must always make choices as to which data to expose. Instead of plotting every state, here is a manageable chart that takes 10 randomly chosen states and compares the trajectories of their unemployment rankings over the last decade:

[Chart: bumps chart of unemployment rankings, 10 randomly chosen states]
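One way to draw such a subset reproducibly is sketched below in Python. The post does not say how the 10 states were picked beyond "randomly", so the function name `pick_states` and the seed value are purely illustrative assumptions; the list covers the 50 states plus D.C. by postal code.

```python
import random

# Postal codes for the 50 states plus D.C.
STATE_CODES = [
    "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FL", "GA",
    "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD", "MA",
    "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY",
    "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC", "SD", "TN", "TX",
    "UT", "VT", "VA", "WA", "WV", "WI", "WY",
]


def pick_states(codes, k=10, seed=None):
    """Return k distinct codes drawn at random, sorted for readability.

    Passing a seed makes the draw repeatable, so the same subset can
    be plotted again later.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(codes, k))


# The seed is an arbitrary, hypothetical choice.
chosen = pick_states(STATE_CODES, k=10, seed=2009)
print(chosen)
```

Fixing the seed matters for a chart like this: a different draw would tell a different story, so it helps to be able to say exactly which draw was shown.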

What do we see here? Little North Dakota has been a star throughout most of this decade. Michigan has rapidly declined and has been lingering at the back of the pack for three straight years. Florida experienced big ups and downs, with Alabama following a very similar trajectory. Poor Mississippi has been behind throughout the decade.

***

I love it when I write a post, and the chart designer pops in and provides his/her point of view. That's one of the things that keep me going. Appreciate the very substantive comments from my last post, and will respond soon with further comments. Thanks for reading!

Comments

Tom Hopper

I'm curious: what tool did you use to create these graphs?

Hadley Wickham

Since ranks are so variable, unlike the underlying data, why use them?

Martin

Completely agree with Hadley; ranks may show differences and/or changes where actually almost nothing has happened, whereas the actual values would emphasize the "significant" changes.

My standard question applies: Can we get the data somewhere easily?

kerokan

I second Tom Hopper: What tool did you use?

Rosie Redfield

There's no scale!

Michael MacAskill

Agreed that ranks are often not useful here as they distort a continuous measure, enforcing a constant distance on the axis between states even when the actual difference in the value may be negligible.

e.g. this figure seems to show a precipitate drop in infant birth survival in the US:
http://www.nytimes.com/imagepages/2008/10/16/health/20081016_INFANT_GRAPHIC.html
But what actually happened was that mortality rates improved slightly (from an already very low value) in the US but much more elsewhere over the same period (e.g. Singapore). The rankings on infant mortality in developed countries are pretty meaningless, as all the actual values are close to zero. I showed this NY Times figure in a seminar once, along with a GapMinder chart showing the actual data. Rather than some Western states getting worse, as implied by the NY Times chart, what actually happened was that many nations converged over time to their similar, very low, mortality levels.

So rankings can introduce substantial apparent variability to a dataset when none actually exists. Similarly, they can hide significant differences, as the gap between 1st and 2nd may be huge, while the gap between 21st and 22nd, say, may be negligible.
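To make the point about uneven gaps concrete, here is a small Python sketch that lists the difference in the underlying value between each rank and the next. The function name `adjacent_gaps` and the four values are hypothetical, chosen only to show a large gap at the top and negligible gaps below it.

```python
def adjacent_gaps(values):
    """Return [(rank, name, gap_to_next_rank)], with rank 1 = lowest value.

    Gaps are rounded to two decimals to hide floating-point noise.
    """
    ordered = sorted(values.items(), key=lambda kv: (kv[1], kv[0]))
    return [
        (i + 1, name, round(ordered[i + 1][1] - value, 2))
        for i, (name, value) in enumerate(ordered[:-1])
    ]


# Hypothetical values, for illustration only.
hypothetical = {"A": 2.0, "B": 6.0, "C": 6.1, "D": 6.2}
gaps = adjacent_gaps(hypothetical)
for rank, name, gap in gaps:
    print(f"rank {rank} ({name}) to rank {rank + 1}: gap {gap}")
```

On a rank axis all three steps look identical, even though the first covers a far larger difference than the other two.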

Kaiser

Tom and Kerokan: The magic of R, which allows you to control pretty much every aspect of the chart.

Martin: the data is publicly available from the BLS website (Bureau of Labor Statistics)

Kaiser

Hadley, Michael, others: I'd encourage you to take a look at the underlying data before throwing out ranks as a measure. Ranks are particularly interesting in the context of the 50 states since the data is structured for such analysis. It is very natural for one state to compare itself against some other state. To me, the relevant question is not "why use ranks?" but "why hasn't one looked at ranks?"

Ranks and rates measure fundamentally different things. Ranks measure a state's performance relative to other states while rates measure the absolute performance for each state.

For instance, I have plotted here the rank and rate changes over time for North Dakota. In 2000, ND experienced its lowest unemployment in the decade and yet relative to other states, ND had its worst rank. By late 2000s, ND experienced the worst unemployment the state has seen in a decade but relative to other states, it has weathered the recession well and is now ranked first. If we only looked at rates, this information is well hidden.

That said, Michael's point is well taken, and thanks for the thoughtful comment. Ranks often obscure information, especially if the differences are immaterial. But there is a large literature on using ranks, and that's because in some cases, they are illuminating.

Michael MacAskill

Hi Kaiser, perhaps we can converge on a consensus...
As you say, ranks can indeed be a valuable way of assessing performance. But the impression they give can be distorting, so a good guideline would be to generally accompany such a figure with one showing the actual values as well. This depends a little bit on the data being analysed: in child mortality, countries converge on a very low value, so ranks become meaningless. In unemployment, values rise and fall over time for each state and so rankings can be more meaningful. But one can't be sure without seeing the data from both perspectives, e.g. the difference between 40th and 50th ranking could be just 0.1%, but that between 1st and 11th could be 2%.

The North Dakota example nicely shows that the picture each gives can be wildly different. What I think would be ideal is to produce a figure containing all 50 states, with actual unemployment values. With both forms of the figure available to compare and contrast, then the conversation we'd be having would be a more interesting one about the actual data : )

Charles Franklin

Small multiples are another approach, somewhat different from the maps of the previous post or the rankings here.

http://pollsandvotes.com/PaV/?p=95

This example shows each state series, national series, and as gray background all other states, in each small multiple.

States are sorted from lowest unemployment to highest, so relative ranking is also apparent.

I think the problem with such small multiples is the cognitive effort required for a casual reader to decode the information, though anything that presents 50 states over time is going to have to cope with that as well.


Charles

The comments to this entry are closed.