Statisticians investigate data, and it may seem like missing data should be ignored since no data means no analysis, right? Well, in practice, it turns out that the knowledge that data is missing is very powerful, and statisticians are, in fact, always wary of missingness.
A reader pointed me to Daily Kos for another chart which I'll eventually talk about -- but I got waylaid by this one (shown on the right), depicting the relative proportions of favorable and unfavorable ratings for a set of political players.
But really, the big problem with this chart is not on the page. Alert readers might realize that surprisingly few people (in fact, only just over half) have an opinion of John Boehner.
In the following version, the proportion of missing/no opinion/don't know is plotted right beside favorables and unfavorables, revealing that this proportion ranges wildly from only 4% for Obama and Bush to 48% for Boehner.
This is one dataset that makes stacked bar charts look better than they typically are. The two main categories, favorable and unfavorable, can be stacked to the sides so that each can be compared on its own. The middle part, which represents missing data, will usually not provide much information, but in this dataset the gaping blank space makes us think about how we should treat the missing data.
In this chart, we give equal weight to those who have an opinion and those who don't.
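For readers who want to play with the layout, here is a rough text-only sketch of the idea: favorables pushed to the left edge, unfavorables to the right edge, and the no-opinion share left as blank space in between. The function name `stacked_bar`, the bar width, and the numbers fed into it are all my own illustrative assumptions; only the 48% and 4% no-opinion shares echo the range mentioned above, and the favorable/unfavorable splits are made up.

```python
def stacked_bar(favorable, unfavorable, no_opinion, width=50):
    """Render one survey row as a text bar of `width` characters.

    Favorable is drawn from the left edge ('F'), unfavorable from the
    right edge ('U'), and the no-opinion share is left blank in the middle.
    """
    total = favorable + unfavorable + no_opinion
    if total <= 0:
        raise ValueError("the three shares must sum to a positive number")
    # Cumulative boundaries keep every segment non-negative after rounding.
    left = round(width * favorable / total)
    right = round(width * (favorable + no_opinion) / total)
    return "F" * left + " " * (right - left) + "U" * (width - right)


# Hypothetical splits, NOT the survey's actual figures.
high_missing_row = stacked_bar(favorable=24, unfavorable=28, no_opinion=48)
low_missing_row = stacked_bar(favorable=46, unfavorable=50, no_opinion=4)

print("|" + high_missing_row + "|")
print("|" + low_missing_row + "|")
```

Because both outer segments are anchored to an edge, each can be read off against a common baseline, and the width of the blank middle is the visual cue for how much data is missing.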
Alternatively, we could ignore the people with no opinion, and look at the proportion of favorables and unfavorables among those who have an opinion. There is a danger in doing this: as seen above, the large proportion of don't knows would be hidden from view. In the case of Boehner, and even for Pelosi and the Tea Party, the amount of missing data raises interesting questions. Have people not heard of these players? Are they afraid of providing an opinion? Are they conflicted?
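The arithmetic behind this alternative is just a change of denominator, which a small sketch can make concrete. The function name `shares_among_opinion_holders` and the input numbers are hypothetical (they are not the survey's figures, apart from reusing the 48% and 4% no-opinion shares as the two extremes); the point is only to show how two rows with very different amounts of missing data can end up looking alike.

```python
def shares_among_opinion_holders(favorable, unfavorable, no_opinion):
    """Re-express favorable/unfavorable as percentages of those with an opinion.

    The no-opinion share (out of all respondents) is returned alongside,
    so that it does not silently drop out of view.
    """
    with_opinion = favorable + unfavorable
    if with_opinion <= 0:
        raise ValueError("no respondents expressed an opinion")
    total = with_opinion + no_opinion
    return {
        "favorable": 100 * favorable / with_opinion,
        "unfavorable": 100 * unfavorable / with_opinion,
        "no_opinion_share": 100 * no_opinion / total,
    }


# Hypothetical splits, NOT the survey's actual figures.
high_missing = shares_among_opinion_holders(favorable=24, unfavorable=28, no_opinion=48)
low_missing = shares_among_opinion_holders(favorable=46, unfavorable=50, no_opinion=4)

for label, row in [("48% no opinion", high_missing), ("4% no opinion", low_missing)]:
    print(f"{label}: {row['favorable']:.1f}% favorable among those with an opinion")
```

With these made-up inputs the two favorable shares land within a couple of points of each other (roughly 46% and 48%), even though one is based on barely half of the respondents. That is exactly the information the rescaled view throws away unless it is annotated.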
Here is the alternative view, in which I have added a couple of comments to highlight things that otherwise would have been missed. One notable feature is that the respondents in this survey essentially view most of these players in a similar light (40-50% favorable), as I don't see the differences in the center of the chart as meaningful.