But or because more information
May 11, 2015
Wall Street Journal uses this paired bar chart to show the favorable/unfavorable ratings of potential GOP candidates for the 2016 presidential elections. (link to original)
This chart form is fine. From this chart, we can easily see which candidates have the strongest favorable ratings. This is precisely how the candidates were sorted (green bars).
But this chart form has one weakness. It's trying to compress three dimensions into one. The dimension of detractors is harder to understand. The gray bars are not sorted, implying that the unfavorable ratings are not well correlated with favorable ratings. There is also a third category (unknowns) that is lurking.
A scatter plot would bring out the correlation between favorable and unfavorable more clearly. In the following version, I coded the unknowns in a green color. The lighter the color, the more unknowns.
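The color coding described above can be sketched roughly as follows. This is only an illustration: the two endpoint colors, the linear ramp between them, and the function names are assumptions, not the palette or code used for the actual chart.

```python
def unknown_share(favorable: float, unfavorable: float) -> float:
    """Percent of respondents in neither group, with both ratings given in percent."""
    return 100.0 - favorable - unfavorable


def green_shade(unknown: float, max_unknown: float = 100.0) -> str:
    """Map an unknown share to a hex color: dark green for few unknowns, light for many."""
    # Clamp to [0, 1] so out-of-range inputs still give a valid color.
    t = min(max(unknown / max_unknown, 0.0), 1.0)
    dark = (0, 90, 50)        # assumed color for 0% unknown
    light = (200, 235, 200)   # assumed color for 100% unknown
    r, g, b = (round(d + t * (l - d)) for d, l in zip(dark, light))
    return f"#{r:02x}{g:02x}{b:02x}"


# Made-up ratings, for illustration only (not the poll's numbers).
shade = green_shade(unknown_share(40, 35))
```

A candidate with few unknowns gets a shade near the dark end, and a newcomer with many unknowns gets a shade near the light end.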
Most candidates have somewhat more supporters than detractors. Trump and Christie are clearly in trouble, with more detractors than supporters, and few unknowns (dark green). Fiorina, who just entered the race, is also weak though she could recover by winning over the substantial number of unknowns.
The scatter plot takes more effort to understand but, or because, it conveys more information.
It would be safe to ignore anyone, like Fiorina, who has not won an election for something or even served as part of the government, and who therefore has zero chance of becoming a candidate. Presumably this is part of a longer-term plan.
Posted by: Ken | May 11, 2015 at 09:16 AM
Did you mean detractors rather than distractors?
Posted by: Steve | May 11, 2015 at 09:32 AM
Steve: thanks!
Posted by: junkcharts | May 12, 2015 at 12:50 AM
Note that the "unknown" factor [i.e. 100%-(couldSupport+couldNotSupport)] is just the distance from the hypotenuse. So, do you still need to color it?
Posted by: Bob H | May 12, 2015 at 10:00 AM
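Bob H's point about the hypotenuse can be checked with a few lines of Python. This is a minimal sketch, assuming both axes are in percent and drawn at the same scale; the function name is made up for illustration. Strictly, the perpendicular distance to the line where support plus not-support equals 100 is the unknown share divided by the square root of 2, so it is proportional to the unknowns rather than equal to them.

```python
import math


def distance_to_diagonal(support: float, not_support: float) -> float:
    """Perpendicular distance from (support, not_support) to the line x + y = 100."""
    unknown = 100.0 - support - not_support
    return abs(unknown) / math.sqrt(2)


# A point on the diagonal has no unknowns and zero distance.
on_line = distance_to_diagonal(50, 50)

# The origin (everyone unknown) is the farthest point from the diagonal.
at_origin = distance_to_diagonal(0, 0)
```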
A ternary plot would be an interesting variation on this. Same idea but everything scaled into an equilateral triangle, which might be more intuitive, if unfamiliar.
Benefit #1: it avoids the question of the "missing" half of the square in the plot above. An equilateral triangle isn't "missing" anything, and suggests more clearly that everything must add to 1.
Benefit #2: distance from the origin represents "familiarity" more linearly. In either version, their current dots imply their trajectories toward an intersection with that line. But in the current version, a perfect candidate moves 41% farther as they go from 0% to 100% known than does a polarizing candidate. The distortion is only 15% in the equilateral triangle, or could be eliminated entirely if we used a "pizza slice plot." (trademark pending)
(You could also enhance this by plotting and connecting a few points in time, suggesting a narrative, since candidates rarely become less well known in the near term. Would be cleanest with just a few candidates, or if past dots/lines were a subtle gray.)
Posted by: Jon | May 12, 2015 at 04:58 PM
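Jon's 41% and 15% figures can be reproduced with a short calculation. The sketch below assumes the "all unknown" corner is the apex of a triangle with two equal sides: a perfect candidate travels along one side, while a polarizing candidate travels along the bisector to the midpoint of the opposite side. The function name is hypothetical.

```python
import math


def edge_to_bisector_ratio(apex_angle_degrees: float) -> float:
    """How much farther the edge path is than the bisector path from the apex.

    With equal sides of length s, the bisector path to the midpoint of the
    opposite side has length s * cos(angle / 2), so the ratio is 1 / cos(angle / 2).
    """
    half_angle = math.radians(apex_angle_degrees) / 2
    return 1.0 / math.cos(half_angle)


# Half square (right angle at the origin) versus equilateral triangle.
half_square_extra = (edge_to_bisector_ratio(90) - 1) * 100   # about 41 percent
equilateral_extra = (edge_to_bisector_ratio(60) - 1) * 100   # about 15 percent
```

In a circular sector (the "pizza slice"), every fully known candidate sits on the arc at the same distance from the apex, which is why the distortion disappears there.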
This is great, thanks!
One reason I suspect they went with the bar chart is that it's easier to make all those eye-catching names have a much bigger font. With the scatter plot, you don't want the names to overwhelm the dots.
Posted by: PerfctlyGoodInk | May 13, 2015 at 02:41 PM
Bob H: Not everyone remembers their geometry lesson :)
Jon: I have a love/hate relationship with ternary charts, which I might talk about here in the future. The reason I did the half square is to signal that of the three dimensions, I consider the support/not support as primary and the unknowns as secondary. Also, great idea about extending this plot over time.
PGI: Good idea about font size. I could add another dimension and have the "more important" candidates shown in larger font size. That would probably make Ken happy!
Posted by: Kaiser | May 13, 2015 at 02:56 PM
One thing I would do is change the direction of the axes. Since westerners tend to see increasing values as positive, I'd make the not support bar run from 0% down to 100% on the Y-axis and put the support bar on the X-axis. This way you can say that it's "worse" as you go down and "better" as you go across.
Posted by: Nate | May 14, 2015 at 12:26 PM
While I do think this is an interesting way to look at the data, I feel strongly that for most viewers, this display adds unnecessary complication.
It's all discernible, but it takes a lot more reading and a lot more thought to process the ranking of each candidate.
I would much prefer to see a chart like the original, with a second plot showing the 'unknown' category for reference.
Example: http://imgur.com/bPpa9AK
Posted by: jlbriggs | May 14, 2015 at 01:50 PM
[geometry mode off :) ] My previous comment was just that color doesn't provide any extra information. The information encoded in the color can be determined by looking at the distance from the diagonal line in the chart. (I.e. Trump has the smallest "unknown" factor and is closest to that diagonal.)
Given that, my question is "Is it still valuable to use color to duplicate information already there?"
Posted by: Bob H | May 15, 2015 at 10:56 AM
jlbriggs: "I feel strongly that for most viewers, this display adds unnecessary complication.... it takes a lot more reading and a lot more thought to process the ranking of each candidate."
What if the chart added a big arrow pointing from the lower right to the upper left, labeled "More supporters and fewer detractors"? I think that would make it accessible enough.
Posted by: perfectlyGoodInk | May 18, 2015 at 05:05 PM
jlbriggs: The challenge with bar charts, including multiple bar charts, is that sorting can happen on only one dimension.
Posted by: Kaiser | May 20, 2015 at 11:32 AM