Gelman pointed to this Brendan Nyhan post dissecting David Sirota's chart purportedly showing a "race chasm" in the Democratic primaries. The left chart is Sirota's original and the right is Nyhan's revision.
Please see Nyhan for the political interpretation. Here, I want to note a number of improvements Brendan made to the chart:
- Sirota plotted the rank of each state's percent black population rather than the percentage itself, which is misleading. Nyhan plotted the actual percentages on his horizontal axis.
- Sirota connected the dots, which highlighted the noise (ups and downs) in the data. Nyhan fitted a linear model (he also tried non-linear versions).
- Sirota plotted Obama's overall margin of win/loss. Nyhan plotted his margin among white voters only, which more directly addressed the issue.
- Nyhan exposed the excluded states in a footnote. Sirota didn't. For this chart, this piece of information is very important since so many states were excluded.
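To make the first two points concrete, here is a minimal sketch in Python. The numbers are made up purely for illustration (they are not the primary results), and the helper names `ranks` and `fit_line` are my own. It shows how ranking flattens unequal gaps between states, and how a least-squares line summarizes the trend instead of tracing every up and down.

```python
def ranks(values):
    """Return 1-based ranks (smallest value gets rank 1)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data, invented for this example only.
pct_black = [1.0, 2.0, 3.0, 15.0, 30.0]        # percent black population
margin_white = [10.0, 8.0, 6.0, -18.0, -48.0]  # margin among white voters

# Ranks space the states evenly even though the real gaps are 1, 1, 12, 15.
rank_axis = ranks(pct_black)

# Fit against the actual percentages, and against the ranks for comparison.
slope, intercept = fit_line(pct_black, margin_white)
rank_slope, rank_intercept = fit_line(rank_axis, margin_white)

print("ranks:", rank_axis)
print("slope per percentage point:", slope)
print("slope per rank step:", rank_slope)
```

The slope on the percentage axis reads as "margin change per percentage point", whereas the slope on the rank axis has no such interpretation, because one rank step can stand for a gap of one point or fifteen.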
Nyhan walked us through multiple charts he used to explore the data. Much of the time was spent picking and choosing states to include or exclude. We learnt that Sirota excluded states with large Hispanic populations, which Nyhan disagreed with. Nyhan wanted to exclude Florida, which Sirota decided against, even though Sirota excluded Michigan, to which Nyhan consented. Nyhan also wanted to exclude the caucus states, and so on...
Judging from the charts, this picking and choosing appears not to have changed the outcome in this case. In general, one should exercise great care in such decisions because one might end up seeing what one wants to see.
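One way to check whether such choices matter is to refit the line under each exclusion rule and compare the slopes. The sketch below assumes made-up data with placeholder state labels "A" to "E"; the function names and exclusion sets are hypothetical, and this is not how Nyhan did his analysis.

```python
def fit_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def slopes_after_exclusion(data, exclusion_sets):
    """Refit the slope once per exclusion rule.

    data: dict mapping a label to an (x, y) pair.
    exclusion_sets: dict mapping a rule name to the set of labels to drop.
    """
    results = {}
    for rule, dropped in exclusion_sets.items():
        kept = [xy for label, xy in data.items() if label not in dropped]
        results[rule] = fit_slope(kept)
    return results


# Hypothetical data: (percent black population, margin among white voters).
data = {
    "A": (2.0, 9.0),
    "B": (5.0, 4.0),
    "C": (10.0, -3.0),
    "D": (20.0, -20.0),
    "E": (30.0, -41.0),
}

# Hypothetical exclusion rules to compare.
exclusion_sets = {
    "keep all": set(),
    "drop A": {"A"},
    "drop D and E": {"D", "E"},
}

slopes = slopes_after_exclusion(data, exclusion_sets)
for rule, value in slopes.items():
    print(rule, round(value, 3))
```

If the sign and rough size of the slope hold up across every reasonable rule, the conclusion does not hinge on the picking and choosing. If they swing around, the exclusions are doing the work.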
The following chart is missing from the post. I think it points out something more telling than the negative correlation between Obama's margin with white voters and the proportion of black population.