Today's post is inspired by the following chart (I am showing only the top of it - click here to see the entire chart):
The chart plots each state as a separate row, so like most such charts, it is tall. The data analysis behind the chart is fascinating and unusual, although I find the chart harder to grasp than expected. The analyst starts with precinct-level data, and determines which precincts were "lop-sided," defined as having a winning margin of over 50 percent for the winner (either Trump or Clinton). The analyst then sums the voters in those lop-sided precincts, and expresses this as a percent of all voters in the state.
For example, in Alabama, the long red bar indicates that about 48% of the state's voters live in lop-sided precincts that went for Trump. It's important to realize that not all such people voted for Trump - they happened to live in precincts that went heavily for Trump. Interestingly, about 12% of the states voters reside in precincts that went heavily for Clinton. Thus, overall, 60% of Alabama's voters live in lop-sided precincts.
This is more sophisticated than the usual analysis that shows up in journalism.
The bar chart may confuse readers for several reasons:
- The horizontal axis is labeled "50-point plus margin for Trump/Clinton" and has values from 0% to 40-60% range. This description seemingly infers the values being plotted as winning margins. However, the sub-header tells readers that the data values are percentages of total voters in the state.
- The shades of colors are not explained. I believe the dark shade indicates the winning party in each state, so Trump won Alabama and Clinton, California. The addition of this information allows the analysis to become multi-dimensional. It also reveals that the designer wants to address how lop-sided precincts affect the outcome of the election. However, adding shade in this manner effectively turns a two-color composition into a four-color composition, adding to the processing load.
- The chart adopts what Howard Wainer calls the "Alabama first" ordering. This always messes up the designer's message because the alphabetical order typically does not yield a meaningful correlation.
The bars are facing out from the middle, which is the 0% line. This arrangement is most often used in a population pyramid, and used when the designer feels it important to let readers compare the magnitudes of two segments of a population. I do not feel that the Democrat versus Republican comparison within each state is crucial to this chart, given that most states were not competitive.
What is more interesting to me is the total proportion of voters who live in these lop-sided precincts. The designer agrees on this point, and employs bar stacking to make this point. This yields some amazing insights here: several Democratic strongholds such as Massachusetts surprisingly have few lop-sided precincts.
Here then is a remake of the chart according to my priorities. Click here for the full chart.
The emphasis is on the total proportion of voters in lop-sided precincts. The states are ordered by that metric from most lop-sided to least. This draws out an unexpected insight: most red states have a relatively high proportion of votesr in lop-sided precincts (~ 30 to 40%) while most blue states - except for the quartet of Maryland, New York, California and Illinois - do not exhibit such demographic concentration.
The gray/grey area offers a counterpoint, that most voters do not live in lop-sided districts.
P.S. I should add that this is one of those chart designs that frustrate standard - I mean, point-and-click - charting software because I am placing the longest bar segments on the left, regardless of color.