Because the visual medium is powerful, it is a favorite of advocates. Creating a chart for advocacy is tricky. One must strike the proper balance between education and messaging. The chart needs to present the policy position strongly and also enlighten the unconverted with useful information.
In my interview with MathBabe Cathy O'Neil (link), she points to this graphic by Pew that illustrates where death-penalty executions have been administered in the past two decades in the U.S. (link) Here is a screenshot of the geographic distribution for 2006:
The chart is a variant of the CDC map of obesity, which I discussed years ago. At one level, the structure of the data is the same. Each state is evaluated on a particular metric (proportion obese in one case, number of executions in the other) once a year. Both designers choose to roll through a sequence of small-multiple maps.
The key distinction is that the obesity map encodes the data in color while the executions map encodes data in the density of semi-transparent, overlapping dots, each dot representing a single execution.
Perhaps the idea is to combat one of the weaknesses of color encoding: humans don't have an instinctive sense of the mapping between a numerical scale and a color scale. If the color transitions from yellow to orange, how many more executions would that map to? By contrast, if you see 200 dots instead of 160, you know the difference is 40.
The switch to the dots aesthetic introduces a host of problems.
Density, as you recall from geometry class, is the count divided by the area. High density can be due to a lot of executions or a very small area. Look at Delaware (DE) versus Georgia (GA). The density of red appears similar but there have been far fewer executions in Delaware.
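To make the arithmetic concrete, here is a minimal sketch in Python. The two states and their numbers are hypothetical, chosen only so that the densities come out equal; they are not the actual execution counts or state areas.

```python
# Hypothetical figures for illustration only; not real counts or areas.
states = {
    "small state": {"executions": 10, "area_sq_mi": 2_000},
    "large state": {"executions": 300, "area_sq_mi": 60_000},
}


def dot_density(executions, area_sq_mi):
    """Dots per 1,000 square miles: roughly what the eye reads off the map."""
    return executions * 1_000 / area_sq_mi


densities = {name: dot_density(**s) for name, s in states.items()}

for name, density in densities.items():
    count = states[name]["executions"]
    print(f"{name}: {count} executions, {density:.1f} dots per 1,000 sq mi")
```

Both states come out at the same density even though one has thirty times as many executions, which is the trap the dot encoding sets for the reader.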
This is a serious mistake. By using dot density, the designer encourages readers to think in terms of the area of each state, but why should the number of executions be related to area? As Cathy pointed out, a more relevant reference point is the population of each state. An even cleverer reference point might be the number of criminals or convictions in each state.
Another design issue relates to the note at the bottom of the chart (shown on the right). Here, the designer is fighting against what the reader already knows. It is natural for a dot on a map to represent location, and yet the spatial distribution of the dots here provides no information. Credit the designer for clarifying this in a footnote; but also let this be a warning that there are other visual representations that do not require such disclaimers.
I am confused by why dots appear but never disappear. It seems that the chart is plotting cumulative counts of executions from 1977, rather than the number of executions in each year, as the chart title suggests. (If you go to the Pew website, you find a version with "cumulative" in the title; when they produced the animated gif, they decided to simplify the title, which is a poor decision.)
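The difference between the two readings is easy to see in a small sketch. The annual counts below are hypothetical and stand in for a single state; the point is only that a running total never goes down, and that differencing it recovers the per-year numbers.

```python
from itertools import accumulate

# Hypothetical annual execution counts for one state (illustrative only).
annual = [0, 2, 1, 0, 3]

# Running total: what a map shows if dots are added and never removed.
cumulative = list(accumulate(annual))

# Differencing consecutive running totals recovers the per-year counts.
recovered = [cumulative[0]] + [
    later - earlier for earlier, later in zip(cumulative, cumulative[1:])
]

print(cumulative)  # [0, 2, 3, 3, 6]
print(recovered)   # [0, 2, 1, 0, 3]
```

A title that promises executions "in each year" describes the second list, while the dots on the map are drawing the first.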
It requires a quick visit to Wikipedia to learn that there was a break in executions in the 70s. This is a missed opportunity to educate readers about the context of this data. Similarly, a good chart presenting this data should distinguish between states that have banned the death penalty and states that have zero or low numbers of executions.
A great way to visualize this data is via a heatmap. Here, I whipped up a quick sketch (pardon the sideways text on the legend):
I forgot to add the footnote listing the states where the death penalty is banned. I could also add axis labels to the side histogram showing counts.
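For readers who want to try something similar, here is one way to lay out the grid behind such a heatmap. It assumes rows are states and columns are years. The state names, records and the "banned" set are all hypothetical placeholders, not real data. The grid keeps "death penalty banned" separate from "zero executions that year".

```python
from collections import Counter

# Hypothetical records: one (state, year) pair per execution.
records = [("A", 2005), ("A", 2005), ("A", 2006), ("B", 2005), ("C", 2006)]

# Hypothetical set of states where the death penalty is banned.
banned = {"D"}

states = ["A", "B", "C", "D"]
years = [2005, 2006]

counts = Counter(records)

# One row per state, one column per year. None marks a state with no death
# penalty, kept distinct from 0 (legal, but no executions that year).
grid = {
    state: [None if state in banned else counts[(state, year)] for year in years]
    for state in states
}

for state, row in grid.items():
    cells = " ".join("-" if cell is None else str(cell) for cell in row)
    print(f"{state}: {cells}")
```

Each cell would then be mapped to a color, with the `None` cells drawn in a neutral shade or hatched so they do not read as zeros.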