Getting to first before going to second

Happy holidays to all my readers! A special shout-out to those who've been around for over 15 years.


The following enhanced data table appeared in Significance magazine (August 2021) in an article titled "Winning an election, not a popularity contest" (link, paywalled).

[Image: Sig_electoralcollege-sm]

It's surprisingly hard to read, and there are many reasons contributing to this.

First is the antiquated style guide of academic journals, in which they turn legends into text, and insert the text into a caption. This is one of the worst journalistic practices that continue to be followed.

The table shows the 50 states plus the District of Columbia. The authors are interested in the extreme case in which a hypothetical U.S. presidential candidate wins the electoral college with the lowest possible popular vote margin. If you've been following U.S. presidential politics, you'd know that the electoral college effectively deflates the value of big-city votes so that the electoral vote margin can be a lot larger than the popular vote margin.

The two sub-tables show two different scenarios: Scenario A is a configuration computed by NPR in one of their reports. Scenario B is a configuration created by the authors (Leinwand et al.).

The table cells are given one of four colors: green = needed in the winning configuration; white = not needed; yellow = state needed in Scenario B but not in Scenario A; grey = state needed in Scenario A but not in Scenario B.


The second problem is that the above description of the color legend is not quite correct. Green, it turns out, is only correctly explained for Scenario A. Green for Scenario B encodes those states that are needed for the candidate to win the electoral college in Scenario B minus those states that are needed in Scenario B but not in Scenario A (shown in yellow). There is a similar problem with interpreting the white color in the table for Scenario B.
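The mismatch is easier to see when the color rules are written out as set logic. The following is a minimal sketch of my reading of the table, not the authors' method: the state names S1 to S4 and the two configurations are made-up placeholders, not the data from the article, and the function names are hypothetical.

```python
# Hypothetical configurations with placeholder state names (not the article's data).
needed_a = {"S1", "S2"}   # states needed to win in Scenario A
needed_b = {"S2", "S3"}   # states needed to win in Scenario B
all_states = ["S1", "S2", "S3", "S4"]


def color_in_a(state):
    """Scenario A sub-table: green means exactly 'needed in A'."""
    return "green" if state in needed_a else "white"


def color_in_b(state):
    """Scenario B sub-table: green and white each cover only part of their legend meaning."""
    in_a, in_b = state in needed_a, state in needed_b
    if in_a and in_b:
        return "green"   # needed in B, excluding the yellow states
    if in_b:
        return "yellow"  # needed in B but not in A
    if in_a:
        return "grey"    # needed in A but not in B
    return "white"       # needed in neither scenario


for s in all_states:
    print(s, color_in_a(s), color_in_b(s))

# The full Scenario B configuration is green plus yellow, not green alone.
b_green_or_yellow = {s for s in all_states if color_in_b(s) in ("green", "yellow")}
```

In this sketch, a reader who wants the full winning configuration for Scenario B has to combine two colors, while the same question for Scenario A is answered by green alone.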

To fix this problem, start with the Q corner of the Trifecta Checkup.


The designer wants to convey an interlocking pair of insights: the winning configuration of states for each of the two scenarios; and the difference between those two configurations.

The problem with the current design is that it elevates the second insight over the first. However, the second insight is a derivative of the first, so it's hard to get to the second spot without reaching the first.

The following revision addresses this problem:


[12/30/2021: Replaced chart and corrected the blue arrow for NJ.]






I don't like the Alabama-first sorting. If the story is about the way low-population states get an electoral college boost, maybe they could have sorted by state population, or number of Congressional districts per state? Or just grouped into four categories:

Needed to win in A only
Needed to win in both
Needed to win in B only
Not needed

Obviously my hope is that population-based sorting will visually expose some such grouping naturally anyway.

I also think the scenarios should be directly named "NPR scenario" and "Leinwand et al. scenario," in the column headings, instead of enciphered to "A" and "B" then given a decoding table off to the side.

(shouldn't New Jersey have a blue arrow?)


Derek: Yes about NJ. I found I was missing the gray square on the final check, and then I forgot about the arrow. Will fix!

Agree about naming the scenarios directly.

The problem of using 4 disjoint categories is that it obscures the first question again.

If they are using a table, then in this case, I choose Alabama first. It's just easier to find your state in such an arrangement.

However, I suspect someone will find a non-tabular way of visualizing this dataset, which makes that issue moot.
