##### May 11, 2015

The Wall Street Journal uses this paired bar chart to show the favorable/unfavorable ratings of potential GOP candidates for the 2016 presidential election. (link to original)

This chart form is fine. From this chart, we can easily see which candidates have the strongest favorable ratings. This is precisely how the candidates were sorted (green bars).

But this chart form has one weakness. It's trying to compress three dimensions into one. The dimension of detractors is harder to understand. The gray bars are not sorted, implying that the unfavorable ratings are not well correlated with favorable ratings. There is also a third category (unknowns) that is lurking.

A scatter plot would bring out the correlation between favorable and unfavorable more clearly. In the following version, I coded the unknowns in a green color. The lighter the color, the more unknowns.
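For readers who want to try this encoding, here is a minimal sketch, not the code behind my chart. It assumes the three shares sum to 100, so the unknowns are whatever is left after favorable and unfavorable. The function names, the two endpoint colors, and the ratings are all made up for illustration; they are not the WSJ numbers.

```python
def unknown_share(favorable, unfavorable):
    """Percent with no opinion, assuming the three shares sum to 100."""
    return 100.0 - favorable - unfavorable


def green_shade(unknown, max_unknown=100.0):
    """Return a hex green that gets lighter as the unknown share grows.

    The two endpoint colors are arbitrary choices for this sketch.
    """
    t = max(0.0, min(1.0, unknown / max_unknown))  # clamp to [0, 1]
    dark, light = (0, 100, 0), (200, 255, 200)
    rgb = [round(d + t * (l - d)) for d, l in zip(dark, light)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Made-up ratings for illustration only; these are NOT the WSJ poll numbers.
ratings = {"Candidate A": (50, 40), "Candidate B": (20, 15)}

for name, (fav, unfav) in ratings.items():
    unk = unknown_share(fav, unfav)
    # x = favorable, y = unfavorable, color = shade of green for the unknowns
    print(name, fav, unfav, unk, green_shade(unk))
```

Each printed row gives the x position, the y position, and the fill color for one dot; any plotting tool that accepts hex colors can take it from there.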

Most candidates have somewhat more supporters than detractors. Trump and Christie are clearly in trouble, with more detractors than supporters, and few unknowns (dark green). Fiorina, who just entered the race, is also weak though she could recover by winning over the substantial number of unknowns.

The scatter plot takes more effort to understand but, or because, it conveys more information.

It would be safe to ignore anyone, like Fiorina, who has not won an election for something or even served as part of the government and therefore has zero chance of becoming a candidate. Presumably this is part of a longer term plan.

Did you mean detractors rather than distractors?

Steve: thanks!

Note that the "unknown" factor [i.e. 100%-(couldSupport+couldNotSupport)] is just the distance from the hypotenuse. So, do you still need to color it?
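The geometry behind this comment can be checked with a short sketch. It assumes shares are expressed as fractions between 0 and 1, with support on the x-axis and not-support on the y-axis, so the hypotenuse is the line x + y = 1. The function names are made up for illustration. Strictly, the perpendicular distance is the unknown share divided by √2, so the two are proportional rather than equal.

```python
import math


def unknown_share(support, not_support):
    """Unknown fraction: 1 - (couldSupport + couldNotSupport)."""
    return 1.0 - (support + not_support)


def distance_to_hypotenuse(support, not_support):
    """Perpendicular distance from (support, not_support) to the line x + y = 1."""
    return abs(support + not_support - 1.0) / math.sqrt(2.0)


# An arbitrary example point, not taken from the poll.
s, n = 0.3, 0.5
print(unknown_share(s, n))
print(distance_to_hypotenuse(s, n) * math.sqrt(2.0))  # same value, up to rounding
```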

A ternary plot would be an interesting variation on this. Same idea but everything scaled into an equilateral triangle, which might be more intuitive, if unfamiliar.
Benefit #1: it avoids the question of the "missing" half of the square in the plot above. An equilateral triangle isn't "missing" anything, and suggests more clearly that everything must add to 1.
Benefit #2: distance from the origin represents "familiarity" more linearly. In either version, their current dots imply their trajectories toward an intersection with that line. But in the current version, a perfect candidate moves 41% farther as they go from 0% to 100% known than does a polarizing candidate. The distortion is only 15% in the equilateral triangle, or could be eliminated entirely if we used a "pizza slice plot." (trademark pending)

(You could also enhance this by plotting and connecting a few points in time, suggesting a narrative, since candidates rarely become less well known in the near term. Would be cleanest with just a few candidates, or if past dots/lines were a subtle gray.)
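The 41% and 15% figures in the comment above can be reproduced with a small sketch. It assumes a "perfect" candidate is one whose known share is all support, and a "polarizing" candidate is one whose known share splits evenly. The ternary layout used here (unknown corner at the origin, unit side length) is one conventional choice, and all function names are made up for illustration.

```python
import math


def half_square_point(support, not_support):
    """Position in the half-square plot: support on x, not-support on y."""
    return (support, not_support)


def ternary_point(support, not_support):
    """Position in an equilateral triangle with unit sides.

    Assumed corners: unknown at (0, 0), support at (1, 0),
    not-support at (0.5, sqrt(3) / 2).
    """
    x = support + 0.5 * not_support
    y = (math.sqrt(3.0) / 2.0) * not_support
    return (x, y)


def travel(point_fn, support_frac):
    """Distance moved from 0% known to 100% known, when the known
    share splits support_frac : (1 - support_frac)."""
    x, y = point_fn(support_frac, 1.0 - support_frac)
    return math.hypot(x, y)


def distortion(point_fn):
    """How much farther a perfect candidate travels than a polarizing one."""
    return travel(point_fn, 1.0) / travel(point_fn, 0.5) - 1.0


print(round(100 * distortion(half_square_point)))  # 41
print(round(100 * distortion(ternary_point)))      # 15
```

In the half square the ratio is 1 to √2/2, and in the equilateral triangle it is 1 to √3/2, which is where the two percentages come from.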

This is great, thanks!

One reason I suspect they went with the bar chart is that it's easier to make all those eye-catching names have a much bigger font. With the scatter plot, you don't want the names to overwhelm the dots.

Bob H: Not everyone remembers their geometry lesson :)

Jon: I have a love/hate relationship with ternary charts, which I might talk about here in the future. The reason I did the half square is to signal that of the three dimensions, I consider the support/not support as primary and the unknowns as secondary. Also, great idea about extending this plot over time.

PGI: Good idea about font size. I could add another dimension and have the "more important" candidates shown in larger font size. That would probably make Ken happy!

One thing I would do is change the direction of the axes. Since westerners tend to read increasing values as positive, I'd make the not-support values run from 0% down to 100% on the Y-axis and put the support values on the X-axis. This way you can say that it's "worse" as you go down and "better" as you go across.

While I do think this is an interesting way to look at the data, I feel strongly that for most viewers, this display adds unnecessary complication.

It's all discernible, but it takes a lot more reading and a lot more thought to process the ranking of each candidate.

I would much prefer to see a chart like the original, with a second plot showing the 'unknown' category for reference.

Example: http://imgur.com/bPpa9AK

[geometry mode off :) ] My previous comment was just that color doesn't provide any extra information. The information encoded in the color can be determined by looking at the distance from the diagonal line in the chart. (I.e. Trump has the smallest "unknown" factor and is closest to that diagonal.)

Given that, my question is "Is it still valuable to use color to duplicate information already there?"

jlbriggs: "I feel strongly that for most viewers, this display adds unnecessary complication.... it takes a lot more reading and a lot more thought to process the ranking of each candidate."

What if the chart added a big arrow pointing from the lower right to the upper left, labeled "More supporters and fewer detractors"? I think that would make it accessible enough.

jlbriggs: The challenge with bar charts, including multiple bar charts, is that sorting can happen on only one dimension.
