Andrew Gelman has put up a great post discussing how he collaborated with the New York Times editors to transform his chart into a publishable form. Some time ago, there was a discussion here about the publish-worthiness of my charts (actually, the lack thereof), and I explained that taking my charts to that level would require quite a bit of extra work. Gelman speaks here from first-hand experience.

Here are the two versions side by side. Readers on the Gelman blog are debating which version is "better", with Gelman himself voting for the revised NYT version:


Let's compile a list of the changes:

  1. Removed 2000 data and relied on only 2004 data
  2. Reduced number of groups of senators from 7 to 5 (with a special calling out of Senator Lieberman)
  3. Re-ordered senators to facilitate classifying into Democrats, Independent, Republicans
  4. Added annotation explaining the grouping of senators; in particular, qualified why the 49 Democrats and 32 Republicans were set aside
  5. Removed scale labels on the top of the chart, retaining only the bottom labels
  6. Vastly increased the area devoted to text labels, which now covered half of the chart
  7. Added white vertical gridlines
  8. Instead of coloring the cross marks, colored the background of the chart; introduced the yellow color for Lieberman (which Gelman originally colored blue)
  9. Removed reference to the national average but explained clearly in the legend that the percentages are relative to the national average.

Of these, 2, 3 and 4 are definite improvements. 1 is acceptable since the 2000 and 2004 data are quite highly correlated (note, however, Iowa, Virginia and Ohio). 7 looks brilliant here, although we typically don't recommend adding gridlines and such. 5 is also acceptable because of 7 -- with the gridlines, one doesn't need two sets of labels. 8 can be debated: I'd like to see a version with large colored squares instead of cross marks, in lieu of the colored background; the current look is not bad. 9 can also be mulled over: would it be so bad to insert "(73%)" after the words "national average" in the legend? I suspect some feel that this may cause confusion because readers would then need to understand relative versus absolute values. 6 is the most debatable because the data-ink ratio has been vastly reduced.
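To make the relative-versus-absolute point in item 9 concrete, here is a minimal sketch of the conversion a reader would have to do in their head. It assumes that "relative to the national average" means a simple difference in percentage points from the 73% figure; the state names and values are hypothetical and are not taken from either chart.

```python
# Assumption: "relative" means percentage-point difference from the
# national average (73%, the figure suggested for the legend).
NATIONAL_AVERAGE = 73.0


def to_relative(absolute_pct, baseline=NATIONAL_AVERAGE):
    """Convert an absolute percentage to points above/below the baseline."""
    return absolute_pct - baseline


def to_absolute(relative_pts, baseline=NATIONAL_AVERAGE):
    """Convert points above/below the baseline back to an absolute percentage."""
    return baseline + relative_pts


# Hypothetical values for illustration only, not data from the charts.
hypothetical_states = {"State A": 79.0, "State B": 73.0, "State C": 65.5}

relative = {name: to_relative(pct) for name, pct in hypothetical_states.items()}

for name, points in relative.items():
    # The "+" format flag prints an explicit sign, as a chart label would.
    print(f"{name}: {points:+.1f} points vs. national average")
```

With the 73% printed in the legend, a reader could run the second function mentally: a mark at +6% would correspond to roughly 79% in absolute terms under this assumption.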

There are several good features of the original chart that they left unaltered, which should also be duly noted:

  • Allowed the positive side of the scale to extend to 10% even though the largest data point reaches only about 6%. While this creates empty space, it helps readers judge the magnitude of negative numbers relative to positive ones.
  • Subdued horizontal gridlines
  • Retained the title of the chart (I would actually prefer that they use the title to summarize the finding, as opposed to the current one, which takes a neutral stance.)
  • Retained the horizontal scale labels at 5% apart

One final comment on something omitted from both charts: I'd prefer that they plot all the dots for the 49 + 32 senators, on the top and bottom lines respectively, instead of just plotting the group averages. It would be really interesting to see what the spread is around the 0% point.
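As a rough illustration of why the individual dots matter, the sketch below summarizes one group both ways: as a single average and as a spread. The values are made up for the example (they are not the senators' data), and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, which is just one of several conventions.

```python
import statistics

# Hypothetical percentage-point differences from the national average
# for one group; invented for illustration, not data from the charts.
hypothetical_group = [-4.0, -1.5, 0.5, 2.0, 3.0]


def summarize(values):
    """Return the group average alongside a five-number summary."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


summary = summarize(hypothetical_group)
for key, value in summary.items():
    print(f"{key:>6}: {value:+.1f}")
```

In this made-up group the average sits exactly at 0, yet the individual values range from 4 points below to 3 points above. A single averaged mark would hide that range entirely.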


Andrew Gelman

Good point on plotting all the dots. If we'd thought of that, we would've done it. I was pretty proud of my original idea which was to combine many Dems as one point and many Reps as another, thus showing 100 senators without having to make a graph with 100 lines (that's how the original version looked).


Andrew: I suspect that your original idea would work better than my suggestion for the mass audience. Statisticians would probably want to see all the data - and dare I say it, how about replacing the averages with boxplots?

