Lop-sided precincts, a visual exploration
Oct 17, 2017
In the last post, I discussed one of the charts in the very nice Washington Post feature on the polarization of American voters. See the post here. (Thanks again Daniel L.)
Today's post is inspired by the following chart (I am showing only the top of it - click here to see the entire chart):
The chart plots each state as a separate row, so like most such charts, it is tall. The data analysis behind the chart is fascinating and unusual, although I find the chart harder to grasp than expected. The analyst starts with precinct-level data, and determines which precincts were "lop-sided," defined as having a winning margin of over 50 percentage points for the winner (either Trump or Clinton). The analyst then sums the voters in those lop-sided precincts, and expresses this as a percent of all voters in the state.
For example, in Alabama, the long red bar indicates that about 48% of the state's voters live in lop-sided precincts that went for Trump. It's important to realize that not all such people voted for Trump - they happened to live in precincts that went heavily for Trump. Interestingly, about 12% of the state's voters reside in precincts that went heavily for Clinton. Thus, overall, 60% of Alabama's voters live in lop-sided precincts.
This is more sophisticated than the usual analysis that shows up in journalism.
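For readers who want to try the calculation on their own data, here is a minimal sketch of the steps described above. It is not the Post's actual code. The record layout (`state`, `trump`, `clinton`, `other`) and the function name `lopsided_shares` are assumptions made for illustration, and so is the choice to measure the margin in percentage points of all votes cast in the precinct.

```python
from collections import defaultdict

def lopsided_shares(precincts, threshold=50.0):
    """Percent of each state's voters living in lop-sided precincts.

    `precincts` is a list of dicts with hypothetical keys
    "state", "trump", "clinton" and (optionally) "other".
    A precinct counts as lop-sided when the winning margin is
    strictly greater than `threshold` percentage points.
    """
    state_votes = defaultdict(float)
    lopsided = defaultdict(lambda: {"trump": 0.0, "clinton": 0.0})

    for p in precincts:
        votes = p["trump"] + p["clinton"] + p.get("other", 0)
        if votes == 0:
            continue  # skip empty precincts
        state_votes[p["state"]] += votes
        # Margin in percentage points; positive means Trump led.
        margin = 100.0 * (p["trump"] - p["clinton"]) / votes
        if margin > threshold:
            lopsided[p["state"]]["trump"] += votes
        elif margin < -threshold:
            lopsided[p["state"]]["clinton"] += votes

    result = {}
    for state, total in state_votes.items():
        trump_pct = 100.0 * lopsided[state]["trump"] / total
        clinton_pct = 100.0 * lopsided[state]["clinton"] / total
        result[state] = {
            "trump": trump_pct,
            "clinton": clinton_pct,
            "total": trump_pct + clinton_pct,
        }
    return result
```

Note that every voter in a lop-sided precinct is counted, whichever candidate they backed, which is why the result describes where voters live rather than how they voted.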
The bar chart may confuse readers for several reasons:
- The horizontal axis is labeled "50-point plus margin for Trump/Clinton" and runs from 0% to somewhere in the 40-60% range. This label seemingly implies that the values being plotted are winning margins. However, the sub-header tells readers that the data values are percentages of total voters in the state.
- The shades of colors are not explained. I believe the dark shade indicates the winning party in each state, so Trump won Alabama and Clinton, California. The addition of this information allows the analysis to become multi-dimensional. It also reveals that the designer wants to address how lop-sided precincts affect the outcome of the election. However, adding shade in this manner effectively turns a two-color composition into a four-color composition, adding to the processing load.
- The chart adopts what Howard Wainer calls the "Alabama first" ordering. This always messes up the designer's message because the alphabetical order typically does not yield a meaningful correlation.
The bars are facing out from the middle, which is the 0% line. This arrangement is most often used in a population pyramid, and used when the designer feels it important to let readers compare the magnitudes of two segments of a population. I do not feel that the Democrat versus Republican comparison within each state is crucial to this chart, given that most states were not competitive.
What is more interesting to me is the total proportion of voters who live in these lop-sided precincts. The designer agrees, and employs bar stacking to make this point. This yields some amazing insights: several Democratic strongholds such as Massachusetts surprisingly have few lop-sided precincts.
Here then is a remake of the chart according to my priorities. Click here for the full chart.
The emphasis is on the total proportion of voters in lop-sided precincts. The states are ordered by that metric from most lop-sided to least. This draws out an unexpected insight: most red states have a relatively high proportion of voters in lop-sided precincts (~ 30 to 40%) while most blue states - except for the quartet of Maryland, New York, California and Illinois - do not exhibit such demographic concentration.
The gray/grey area offers a counterpoint: most voters do not live in lop-sided precincts.
P.S. I should add that this is one of those chart designs that frustrate standard - I mean, point-and-click - charting software because I am placing the longest bar segments on the left, regardless of color.
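For anyone scripting the remake instead of pointing and clicking, the arrangement can be worked out before any drawing happens. The sketch below is one possible way to do it, not the method used for the chart above. The function name `layout_rows` and the input format are assumptions. It orders states by total lop-sided share and, within each row, places the longer segment first regardless of party.

```python
def layout_rows(shares):
    """Compute bar segments for a stacked chart, longest segment on the left.

    `shares` maps a state name to (trump_pct, clinton_pct), the percent of
    the state's voters in lop-sided precincts for each side (hypothetical
    input format). Returns a list of (state, total_pct, segments) sorted by
    total_pct, largest first. Each segment is (label, left_edge, width).
    """
    rows = []
    for state, (trump, clinton) in shares.items():
        # Put the longer segment first, whatever its color.
        ordered = sorted([("R", trump), ("D", clinton)],
                         key=lambda seg: seg[1], reverse=True)
        segments, left = [], 0.0
        for label, width in ordered:
            segments.append((label, left, width))
            left += width
        # The remainder is the gray area: voters not in lop-sided precincts.
        segments.append(("rest", left, 100.0 - left))
        rows.append((state, left, segments))

    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Alabama's figures come from the reading of the chart above;
# "Example State" is made up purely to show a blue-first row.
for state, total, segments in layout_rows({"Alabama": (48, 12),
                                           "Example State": (5, 20)}):
    print(state, total, segments)
```

Each `(label, left_edge, width)` triple can then be handed to whatever horizontal-bar call a plotting library provides, one call per segment.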
Ya know I totally overlooked this chart on my first read through.
Now that you point it out, I wonder if the bar chart is even the right medium for the point the creator was trying to make. Still, I think your take is much clearer, especially because it takes the winner value and states it explicitly using a letter on the axis rather than trying to awkwardly encode it in the border of the bar.
Posted by: daniel l | Oct 20, 2017 at 11:13 PM
I wonder what could be done by adding a third variable: the geographic position or the demographic size (assuming you judge that adding a third variable does not overload the chart!).
The latter case is simple: you can modify your final chart by setting the vertical height of each horizontal bar proportional to the state's population (voters) size.
The former case opens several alternatives: chart the R lop-sided precinct rate vs. geographic position, chart the D lop-sided precinct rate vs. geographic position, or chart the R+D lop-sided precinct rate vs. geographic position.
Could a chart of the R+D lop-sided precinct rate (the variable you consider the most important) vs. geographic position integrate a third variable (R vs D)? I have to think about it.
About your P.S., you are right. It could be noted that the default settings can be "tricked" by adding a third series, defined by a simple =IF function.
Posted by: Antonio | Oct 30, 2017 at 05:31 AM