


I like your "trifecta checkup" as a quick way to see if a chart meets some basic standards, but I'm not sure if it goes far enough. A designer of data graphics needs to consider how the different elements of the chart blend together to create a “big picture.”

As you point out, a chart designer must choose which elements to highlight, which elements to keep in the background, and which elements to exclude altogether. In some cases “background” information can scream into the foreground of the chart, obscuring the more important information the data (and your chart) are trying to convey. In your third graph I think you create a lot of noise that masks what the data are saying and hides the practical question.

First, you can increase the actual information available to the chart reader. Instead of an asterisk or a line or a mark, it looks like there is enough space for the whole number. If your focus is on the median and current unemployment rate, could you actually plot those numbers on the graph as if they were the markers?

Second, I think some background elements overshadow the more important information. The box-and-whisker plot can get muddy very quickly: lots of heavy black lines and dots distract from the real focus of the chart. But it’s pretty easy to fix this (and, as Tufte would say, “maximize the data-ink”) while still providing the background information. Start by getting rid of the box’s outline: a gray bar is a sufficient visual. Similarly, the black dotted lines for Q1 and Q4 are heavy. Even though each represents only a single quartile, they overpower the more important data point: the current unemployment rate. A thin gray line accomplishes the same task without being so loud. Likewise, the little o’s for outlier observations almost overpower the whole thing; use a very light single dot for outliers instead of the heavy black “o.”

Finally, it seems you're saying the practical question is, “how much higher is the current unemployment rate above the median unemployment rate?” So why did you choose to sort on the median only? It might make more sense to sort on the difference between median and current unemployment rate?

I hope you don’t mind me making a few comments. I’ve only just come across your blog, but I think I’ll be adding it to my regulars. Thanks!
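The sorting suggestion in this comment can be sketched in a few lines of Python. This is only an illustration: the country names and numbers are made up, and `sort_by_gap` is a hypothetical helper, not anything from the original chart.

```python
from statistics import median

# Hypothetical unemployment rates (%), invented purely for illustration.
history = {
    "Country A": [4.1, 4.5, 5.0, 5.2, 5.8, 6.1, 6.4],
    "Country B": [7.0, 7.4, 7.9, 8.3, 8.8, 9.0, 9.4],
    "Country C": [3.0, 3.2, 3.5, 3.9, 4.0, 4.4, 4.6],
}
current = {"Country A": 9.5, "Country B": 9.2, "Country C": 4.5}


def sort_by_gap(history, current):
    """Rank entries by (current rate - historical median), largest gap first."""
    gaps = {name: current[name] - median(history[name]) for name in history}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


ranked = sort_by_gap(history, current)
for name, gap in ranked:
    print(f"{name}: {gap:+.1f} points relative to its median")
```

With these made-up numbers, sorting on the median alone would put Country B first, while sorting on the gap puts Country A first, which is the ordering that speaks to the practical question.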

Chris Wilson

Hi Kaiser, Chris Wilson here--I designed the Slate map you discuss in this thoughtful post. Naturally, I disagree on a few points:

-While charts must first and foremost be clear, I believe they must also be visually arresting if one hopes to engage a viewer. While the two examples you include may fit your criteria better than my map, they are also boring as sin. This is particularly important when the chart is in the service of journalism and published in a general interest magazine.

-Charts must convey the scale of the data they represent. Since humans are not trained to think logarithmically, it is very difficult to grasp the magnitude of a problem. Was it 100,000 barrels spilled in the Gulf or 100 million or 100 billion? This map clearly depicts the extraordinary scope and breadth of the recession in a way that the other two do not.

-I agree that visualizing raw numbers of people geographically often mimics a population density map, and if time were of no object I would include numbers proportional to the population as well. But in this case, the raw numbers are quite telling. First, the pure number of people without jobs in a location is significant and important. Second, you do not in fact see the red affect the country uniformly. Poor Detroit is losing jobs from the first slide while the rest of the country is still flush. As early as March of 2008 you see lost jobs along the western coast of Florida, which has a lot to do with the second-home industry.

-While more information is by no means always better, this map has over 100,000 data points compared to a far more meager fare in yours. This means any user can get information for his or her own county. (I'll be the first to admit I could use a search and zoom function.) This shows both local and national trends in a compelling way.

I love Junk Charts and agree with most of what you write on the blog, but I'll stack my chart up against your examples any day.


jeff weir

Hi Chris. I agree that a chart must be visually arresting if one hopes to engage a viewer. It should also strive to tell the best story that the data can.

I think your map does a better job of engagement. But I think the Calculated Risk blog map does better on the story front...purely because it used the unemployment rate. This means we get a meaningful answer to the question "compared to what?"

Perhaps a good option would be a button that lets you toggle between unemployment, and unemployment rate.

I'm not a fan of maps...I wrote up a guest post on such maps at http://chandoo.org/wp/2009/07/24/medicare-chart-critique/ if you're interested. But I do prefer your map over Kaiser's boxplot given the audience isn't statisticians. Although your map does take up a lot of space. So I'd be inclined to print a smaller map, and include a 2nd chart that ranks the entire US values from smallest to largest, and highlights where the currently selected state falls on that map. I'd keep the headline numbers of people who have lost their jobs nationwide. And I'd color code your legend so that it matches the colors on the map.


Interesting discussion here. I would tend to side with the map faction, given the audience, but I also agree that the unemployment rate is more interesting than the absolute numbers. I think a heatmap of the unemployment rate or the percentage change in the employment rate would do the best job of accommodating both.


Chris: Thanks for the generous remarks. As always, I am a great admirer of all the work that is being done out there. I know how much time and effort goes into each of these creations. That's one of the reasons the re-made charts never look very good: I can't spend that amount of time on each post!

Some of the posts here, like this one, are presented to provoke thinking about charts. Sometimes it's clear that a different presentation would be clearer or more engaging but sometimes I just want to present alternative ideas. I certainly do not intend to suggest that you should have published the third chart.

The issue of visual engagement versus message clarity has often been raised on this blog, and I even got a question about this at the Ed Lab seminar recently. A summary of my thoughts is here.

All of us would agree that engagement plus clarity is the holy grail, but too often it seems like they clash. In fact, the more data one stuffs into a chart, the less clear it tends to become, but the more engaging it tends to be.

I believe there is a way out of this knot, and the key is a more relaxed interpretation of Tufte's data ink ratio:

In a box plot, the numbers being plotted are medians, maxima, minima, etc. These statistics are computed from thousands of data points--the chart could not be created without having processed all that data. So even though only a small number of statistics show up in the plot, a huge amount of data lies just beneath the surface. With fewer things on the chart, it has a good chance to be clear, so now if one can make it also engaging, one will hit the bull's eye.
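To make the point about statistics standing in for thousands of data points concrete, here is a minimal sketch. It assumes simulated data (10,000 pseudo-random values, not real unemployment figures), and `five_number_summary` is a hypothetical helper name. Note that many box plots draw whiskers at 1.5 times the interquartile range rather than at the minimum and maximum; this sketch uses the plain five-number summary.

```python
import random
from statistics import quantiles


def five_number_summary(values):
    """Reduce a data set to (min, Q1, median, Q3, max)."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, med, q3, max(values))


# Simulated data for illustration only: 10,000 made-up "rates".
rng = random.Random(0)
rates = [rng.gauss(6.0, 1.5) for _ in range(10_000)]

summary = five_number_summary(rates)
print(f"{len(rates)} data points reduced to {len(summary)} plotted statistics")
```

All 10,000 values have to be processed to get these five numbers, even though only the five ever reach the page.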

dan l

I have a question about it though. I am, after all, a chart n00b.

To me, you'd use this chart if you're trying to show regional trends/patterns in the data. That is, after all, the only reason you'd ever use a map.

Those patterns/trends could just as easily be illustrated with the data summed by state. So the county data becomes fog - unnecessary detail - that just muddles what you can take from the chart.

Manoel Galdino

I just found this blog today. I loved it.
Now, one question: What software did you use to produce the box-plot chart? R? Any tips on how I can do it myself? Packages?

Thanks again for the post and blog. It is already in my Google Reader.


Very practical, but it is all in English, so I don't understand it very well.


Manoel: I used R to generate that boxplot. Unfortunately many software packages don't do boxplots (e.g. Excel). Overlaying the blue dots is a separate step. But everything else is pretty standard. Well, except ordering the countries requires a bit of work.
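The original chart was made in R and that code is not shown here. For readers who want to try the same three steps (ordering, box plot, overlaid dots), the following is a rough Python analogue using matplotlib, with the lighter styling suggested earlier in the thread. Everything in it is an assumption for illustration: the data are made up, and `draw_boxplot` is a hypothetical helper.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
from statistics import median

# Hypothetical unemployment rates (%), invented purely for illustration.
history = {
    "Country A": [4.1, 4.5, 5.0, 5.2, 5.8, 6.1, 9.5],
    "Country B": [7.0, 7.4, 7.9, 8.3, 8.8, 9.0, 9.4],
    "Country C": [3.0, 3.2, 3.5, 3.9, 4.0, 4.4, 4.6],
}
current = {"Country A": 6.0, "Country B": 9.2, "Country C": 4.5}


def draw_boxplot(history, current):
    # Step 1: order the countries by their historical median.
    order = sorted(history, key=lambda name: median(history[name]))
    positions = list(range(1, len(order) + 1))

    fig, ax = plt.subplots()
    # Step 2: the box plot itself, with quiet gray styling.
    parts = ax.boxplot(
        [history[name] for name in order],
        patch_artist=True,
        boxprops=dict(facecolor="lightgray", edgecolor="none"),
        whiskerprops=dict(color="gray", linewidth=0.8),
        capprops=dict(color="gray", linewidth=0.8),
        medianprops=dict(color="dimgray"),
        flierprops=dict(marker=".", markersize=3,
                        markerfacecolor="lightgray",
                        markeredgecolor="lightgray"),
    )
    # Step 3: overlay the current values as blue dots, a separate call.
    ax.scatter(positions, [current[name] for name in order],
               color="tab:blue", zorder=3)
    ax.set_xticks(positions)
    ax.set_xticklabels(order)
    ax.set_ylabel("Unemployment rate (%)")
    return order, parts, fig


order, parts, fig = draw_boxplot(history, current)
fig.savefig("boxplot_sketch.png")
```

The ordering is done on the data before plotting, which is why it is the step that takes "a bit of work": the plotting call simply draws the groups in whatever order it receives them.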

Wong: If you email me, I can try to explain more to you. Too bad I don't know how to type in Chinese; otherwise I could translate the parts that are causing problems.

Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.