The richness of nothingness
Dec 15, 2010
Statisticians investigate data, and it may seem like missing data should be ignored since no data means no analysis, right? Well, in practice, it turns out that the knowledge that data is missing is very powerful, and statisticians are, in fact, always wary of missingness.
A reader pointed me to Daily Kos for another chart which I'll eventually talk about -- but I got waylaid by this one (shown on the right), depicting the relative proportions of favorable and unfavorable ratings for a set of political players.
The data is simple, and the chart is sufficient although I'd avoid the blue/red coloring which connotes party affiliation in American politics. The graph also fails our self-sufficiency test.
But really, the big problem with this chart is not on the page. Alert readers might realize that surprisingly few people (in fact, only just over half) have an opinion of John Boehner.
In the following version, the proportion of missing/no opinion/don't know is plotted right beside favorables and unfavorables, revealing that this proportion ranges wildly from only 4% for Obama and Bush to 48% for Boehner.
This is one data set which makes stacked bar charts look better than they typically are. The two main categories of favorable and unfavorable can be stacked to the sides so that they can individually be compared. The middle part, which represents missing data, will usually not provide much information but in this dataset, the gaping blank space makes us think about how we should treat the missing data.
In this chart, we give equal weight to those who have an opinion and those who don't.
Alternatively, we could ignore the people with no opinion, and look at the proportion of favorables and unfavorables among those who have an opinion. There is a danger in doing this because, as seen above, the large proportion of don't knows would be hidden from view, and in the case of Boehner, and even for Pelosi and the Tea Party, the amount of missing data raises interesting questions: Have people not heard of these players? Are they afraid of providing an opinion? Are they conflicted? And so on.
Here is the alternative view, in which I have added a couple of comments to highlight things that otherwise would have been missed. One notable feature is that the respondents in this survey essentially view most of these players in a similar light (40-50% favorable), as I don't see the differences in the center of the chart as meaningful.
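For readers who want to see the arithmetic behind the two views, here is a minimal Python sketch. The function names and the favorable/unfavorable splits are made up for illustration; only the 4% and 48% no-opinion shares echo the figures quoted above, and none of this is the actual survey data.

```python
def no_opinion_share(favorable, unfavorable):
    """Percent of all respondents who gave neither rating."""
    return 100 - favorable - unfavorable


def shares_among_opinion_holders(favorable, unfavorable):
    """Rescale the two ratings so they sum to 100% among people with an opinion."""
    with_opinion = favorable + unfavorable
    if with_opinion <= 0:
        raise ValueError("no respondents expressed an opinion")
    return (100 * favorable / with_opinion, 100 * unfavorable / with_opinion)


# HYPOTHETICAL ratings, as percent of all respondents: (favorable, unfavorable).
# The splits are invented; only the implied no-opinion shares (4% and 48%)
# match numbers mentioned in the post.
ratings = {
    "Player A": (46, 50),  # 4% no opinion
    "Player B": (24, 28),  # 48% no opinion
}

for name, (fav, unfav) in ratings.items():
    missing = no_opinion_share(fav, unfav)
    fav_adj, unfav_adj = shares_among_opinion_holders(fav, unfav)
    print(
        f"{name}: {fav}% / {unfav}% of everyone, {missing}% no opinion; "
        f"{fav_adj:.0f}% / {unfav_adj:.0f}% among those with an opinion"
    )
```

With these made-up numbers, the two players look very different when everyone is counted (46% versus 24% favorable) but land within a couple of points of each other once the no-opinion group is dropped (roughly 48% versus 46%). That is exactly why the size of the missing group should be reported alongside the rescaled figures.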
"I'd avoid the blue/red coloring which connotes party affiliation in American politics."
I actually have been struggling with this for some time. Not specifically the political colors, but just in general: dealing with (or not) colors with coded meanings.
On numerous occasions I've come across circumstances where I've been tempted to use red and green. Red is, of course, bad while green is good. Right?
Trouble is, I feel like that kind of editorializes the data, and unless there is a very clear definition of what is good and what is bad, I'm not totally comfortable with using it.
This gets even crazier when people get hyped up on their 'stop light' concept, and want red/green/orange, whereas wtf does orange actually mean?
Is it better just to use non-meaning driven colors in general?
Posted by: dan l | Dec 15, 2010 at 12:42 PM
Self-sufficiency test hyperlink appears to be broken.
I'm guessing it should link to http://junkcharts.typepad.com/junk_charts/2005/10/the_selfsuffici.html
Posted by: Chris Pudney | Dec 15, 2010 at 08:53 PM
@dan l: A far simpler reason not to use red/green combinations is that people with colour blindness cannot tell the difference.
Also, this has application to people with normal vision... I recall a study where users were shown two colour swatches, one red and one not, and asked to click on the red one (which could be on either side). Their response times were measured, and then plotted against the hue.
What came out is that people spent a longer time with green than they did with other colours (except those shades very close to red). So, the human brain finds red/green a harder combination to differentiate than red/blue or even red/pink.
Posted by: Tom West | Dec 16, 2010 at 12:03 PM
Different cultures also have different color meanings. Be aware of the audience...
Posted by: MJ Schettler | Jun 15, 2011 at 10:26 AM