Egregious chart brings back bad memories

My friend Alberto Cairo said it best: if you see bullshit, say "bullshit!"

He was very incensed by this egregious "infographic": (link to his post)


Emily Schuch provided a re-visualization:


The new version provides a much richer story of how Planned Parenthood has shifted priorities over the last few years.

It also exposed how the AUL (Americans United for Life) organization distorted the story.

The designer extracted only two of the lines, so readers do not see that the category of services that has really made up for the loss of cancer screening was STI/STD testing and treatment. This is a bit ironic given the other story that has circulated this week - the big jump in STDs among Americans (link).

Then, the designer placed the two lines on dual axes, which is a dead giveaway that something awful lies beneath.

Further, this designer dumped the data from intervening years, and drew a straight line from the first to the last year. The straight arrow misleads by pretending that there has been a linear trend, and that it would go on forever.
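To see how much a two-point line can hide, here is a minimal Python sketch. The yearly values are made up for illustration (they are not the Planned Parenthood figures), and `endpoint_line` is a hypothetical helper name.

```python
def endpoint_line(values):
    """Return the straight line joining the first and last values."""
    n = len(values)
    first, last = values[0], values[-1]
    step = (last - first) / (n - 1)
    return [first + step * i for i in range(n)]


# Made-up yearly series: it rises, peaks, then falls below where it started.
actual = [100, 130, 150, 140, 110, 90]
straight = endpoint_line(actual)

# Largest gap between the real series and the two-point line.
max_gap = max(abs(a - s) for a, s in zip(actual, straight))
print(straight)
print(max_gap)
```

With these invented numbers, the straight line shows a gentle decline, while the series it replaced rose by half before falling. Nothing on the two-point chart would let a reader tell the two apart.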

But the masterstroke is in the treatment of the axes. Let's look at the axes, one at a time:

The horizontal axis: Let me recap. The designer dumped all but the starting and ending years, and drew a straight line between the endpoints. While the data are no longer there, the axis labels are retained. So, our attention is drawn to an area of the chart that is void of data.

The vertical axes: Let me recap. The designer has two series of data with the same units (number of people served) and decided to plot each series on a different scale with dual axes. But readers are not supposed to notice the scales, so they do not show up on the chart.

To summarize, where there are no data, we have a set of functionless labels; where labels are needed to differentiate the scales, we have no axes.
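Here is a rough Python sketch of why dual axes can manufacture a crossing. It assumes each axis is fitted tightly to its own series, which is a guess about the designer's choice since the scales are not printed. The counts are made up, and `to_own_axis` is a hypothetical helper.

```python
def to_own_axis(values):
    """Rescale a series to 0..1 using its own min and max.

    This mimics giving the series a private axis fitted to its range.
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Made-up counts in the same unit (thousands of people), very different sizes.
big_falling = [2000, 1000]
small_rising = [300, 330]

# On a shared axis, the small series never comes near the big one.
shared_axis_cross = small_rising[-1] > big_falling[-1]

# On private axes, both series span the full height, so they form an X.
a = to_own_axis(big_falling)
b = to_own_axis(small_rising)
own_axis_cross = a[0] > b[0] and a[-1] < b[-1]

print(shared_axis_cross, own_axis_cross)
```

Under that assumption, any falling series and any rising series will appear to cross, no matter how far apart the actual numbers are.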


This is a tried-and-true tactic employed by propagandists. The egregious chart brings back some bad memories.

Here is a long-ago post on dual axes.

Here is Thomas Friedman's use of the same trick.

More chart drama, and data aggregation

Robert Kosara posted a response to my previous post.

He raises an important issue in data visualization - the need to aggregate data, and not plot raw data. I have no objection to that point.

What was shown in my original post are two extremes. The bubble chart is high drama at the expense of data integrity. Readers cannot learn any of the following from that chart:

  • the shape of the growth and subsequent decline of the flu epidemic
  • the beginning and ending date of the epidemic
  • the peak of the epidemic*

* The peak can be inferred from the data label, although there appears to be at least one other circle of approximately equal size, which isn't labeled.

The column chart is low drama but high data integrity. To retain some dramatic element, I encoded the data redundantly in the color scale. I also emulated the original chart in labeling specific spikes.

The designer then simply has to choose a position along these two extremes. This will involve some smoothing or aggregation of the data. Robert showed a column chart that has weekly aggregates, and in his view, his version is closer to the bubble chart.
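For readers who want to try the aggregation themselves, here is a minimal Python sketch of rolling daily counts up into weekly totals. The daily numbers are made up (they are not the avian flu data), and `weekly_totals` is a hypothetical helper. Weeks are assumed to start on Monday.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(daily):
    """Sum (date, count) pairs into totals keyed by the Monday of each week."""
    totals = defaultdict(int)
    for day, count in daily:
        monday = day - timedelta(days=day.weekday())
        totals[monday] += count
    return dict(sorted(totals.items()))


# Made-up daily counts over two weeks starting Monday 2015-05-04.
start = date(2015, 5, 4)
counts = [0, 3, 0, 0, 5, 0, 1, 0, 0, 9, 0, 2, 0, 0]
daily = [(start + timedelta(days=i), c) for i, c in enumerate(counts)]

weekly = weekly_totals(daily)
print(weekly)
```

The choice of bin width is the designer's dial: daily bins sit at the data-integrity end, and wider bins trade detail for a cleaner shape.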

Robert's version indeed strikes a balance between drama and data integrity, and I am in favor of it. Here is the idea (I am responsible for the added color).



Where I depart from Robert is how one reads a column chart such as the one I posted:


Robert thinks that readers will perceive each individual line separately, and in so doing, "details hide the story". When I look at a chart like this, I am drawn to the envelope of the columns. The lighter colors are chosen for the smaller spikes to push them into the background. What might be the problem are those data labels identifying specific spikes; they are a holdover from the original chart--I actually don't know why those specific dates are labeled.


In summary, the key takeaway is, as Robert puts it:

the point of this [dataset] is really not about individual days, it’s about the grand totals and the speed with which the outbreak happened.

We both agree that the weekly version is the best among these. I don't see how the reader can figure out grand totals and speed with which the outbreak happened by staring at those dramatic but overlapping bubbles.

Is it worth the drama?

Quite the eye-catching chart this:


The original accompanied this article in the Wall Street Journal about avian flu outbreaks in the U.S.

The point of the chart appears to be the peak in the flu season around May. The overlapping bubbles were probably used for drama.

A column chart, with appropriate colors, attains much of the drama but retains the ability to read the data.



Tricky boy William

Last week, I was quite bothered by this chart I produced using the Baby Name Voyager tool.


According to this chart, William has drastically declined in popularity over time. The name was 7 times more popular back in the 1880s compared to the 2010s. And yet, when I hovered over the chart, the rank of William in 2013 was 3. Apparently, William was the 3rd most popular boy name in 2013.

I wrote the nice people at the website and asked if there might be a data quality issue, and their response was:

The data in our Name Voyager tool is correct. While it may be puzzling, there are definitely less Williams in the recent years than there were in the past (1880s). Although the name is still widely popular, there are plenty of other baby names that parents are using. In the past, there were a limited amount of names that parents would choose, therefore more children had the same name.

What bothered me was that the rate has declined drastically while the number of births was increasing. So, I was expecting William to drop in rank as well. But their explanation makes a lot of sense: if there is a much wider spread of names in recent times, the rank could indeed remain top. It was very nice of them to respond.


There are three ways to present this data series, as shown below. One can show the raw counts of William babies (orange line). One can show the popularity against total births (what Baby Name Wizard shows, blue line). One can show the rank of William relative to all other male baby names (green line). Consider how different these three lines look!


The rate metric (per million births) adjusts for growth in total births. But the blue line is difficult to interpret in the face of the orange line. In the period 1900 to 1950, the actual number of William babies went up but the blue line came down. The rank is also tough to interpret, especially in the 1970-2000 period when it took a dive, a trend not visible in either the raw counts or the adjusted counts.

Adding to the difficulty is the use of the per-million metric. In the following chart, I show three different scales for popularity: per million, per 100,000, and per 100 (i.e. proportion). The raw count is shown up top.


All three blue lines are essentially the same but how readers interpret the scales is quite another matter. The per-million births metric is the worst of the lot. The chart shows values in the 20,000-25,000 range in the 1910s but the actual number of William babies was below 20,000 for a number of years. Switching to per-100K helps but in this case, using the standard proportion (the bottom chart) is more natural.
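The arithmetic behind the three scales is simple enough to sketch in Python. The counts below are made up for illustration, and `popularity_scales` is a hypothetical helper, not anything from the Name Voyager tool.

```python
def popularity_scales(name_births, total_births):
    """Express one name's share of births on three common scales."""
    return {
        "per_million": name_births * 1_000_000 / total_births,
        "per_100k": name_births * 100_000 / total_births,
        "percent": name_births * 100 / total_births,
    }


# Made-up year: 18,000 boys named William out of 800,000 recorded male births.
scales = popularity_scales(18_000, 800_000)
print(scales)
```

In this invented year, the per-million value is 22,500 even though only 18,000 Williams were born. Whenever the total is below one million, the per-million rate exceeds the raw count, which is presumably what happened in the 1910s data.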


The following scatter plot shows the strange relationship between the rate of births and the rank over time for William babies.


Up to the 1990s, there is an intuitive relationship: as the proportion of Williams among male babies declined, so did the rank of William. Then in the 1990s and beyond, the relationship flipped. The proportion of Williams among male babies continued to drop but the rank of William actually recovered!
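A falling share alongside a rising rank is easy to reproduce with toy numbers. In this Python sketch, all counts and the placeholder names are made up, and `rank_and_share` is a hypothetical helper. The later year simply spreads births across many more names.

```python
def rank_and_share(counts, name):
    """Return (rank, share) of `name`, where rank 1 is the most common."""
    total = sum(counts.values())
    ordered = sorted(counts, key=counts.get, reverse=True)
    return ordered.index(name) + 1, counts[name] / total


# Made-up earlier year: few names, each with a large share.
earlier = {"John": 300, "William": 250, "James": 200,
           "George": 150, "Charles": 100}

# Made-up later year: many names, each with a small share.
later = {f"Name{i:02d}": 50 for i in range(1, 20)}
later["William"] = 60

rank_earlier, share_earlier = rank_and_share(earlier, "William")
rank_later, share_later = rank_and_share(later, "William")
print(rank_earlier, round(share_earlier, 3))
print(rank_later, round(share_later, 3))
```

Here William's share falls from a quarter to about 6 percent while his rank improves from second to first. Rank depends on the competition, not on the share itself.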



There are no easy charts

Every chart, even if the dataset is small, deserves care. Long-time reader zbicyclist submits the following, which illustrates this point well.


The following comments are by zbicyclist:

This is from the National Institute of Diabetes and Kidney Diseases, part of the U.S. National Institutes of Health.
The pie chart is terrible in a pedestrian way – a bar chart could be so much clearer, or even a table. You have to do too much work to match up the colors, numbers and labels on the pie chart.

To the right of the pie is a bar chart, but a bar chart in which the categories are nested – extreme obesity is part of obesity, extreme obesity and obesity are part of overweight or obesity.  If we want to do something like this, there should be 3 charts (e.g. space on the x axis indicating a break). The normal expectation for a bar graph is that the categories are mutually exclusive.  This problem is repeated in the Race/Ethnicity graph just below these.


Now, some comments by me.

Another issue with the design is inconsistency. The same color scheme is used in both charts but to denote different concepts.


Put yourself in the moment when you have just understood the chart on the left side. You have figured out that obesity is deep green while extreme obesity is light green. Now you shift your attention to the column chart. You expect the light green columns to indicate extreme obesity, and the deep green, obesity. And yet, the light and dark greens represent a male-female split.

Here is a stacked column chart showing that females are more likely than males to be either extremely obese or not overweight. In other words, the female distribution has "fatter tails".


I learned the most upsetting thing about this chart when re-making it: the listed percentages on the pie chart added up to 106 percent.
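A quick sanity check would have caught this before publication. Here is a minimal Python sketch. The slice values are made up (chosen only so that they total 106), and `percent_total_ok` and its one-point rounding tolerance are my own assumptions.

```python
def percent_total_ok(percentages, tolerance=1.0):
    """True if the slices add up to 100 within a rounding tolerance."""
    return abs(sum(percentages) - 100) <= tolerance


# Made-up slices that total 106, the same total I found on the pie.
slices = [40, 30, 20, 16]
print(sum(slices), percent_total_ok(slices))
```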


Losing sleep over schedules

Fan of the blog, John H., made a JunkCharts-style post about a chart that has been picked as a "Best of" for 2014 by Fast Company (link). I agree with him. It seems more fit to be on the "Worst of" list. Here it is:


As John pointed out, the outside yellow arc (Beethoven) and the inside green arc (Simenon) present, shockingly, the same exact sleep schedule (10 pm to 6 am).

John unrolled the arcs and used R to make this version:


Go here to read John's entire post.


Another improvement is to add a "control". One way to understand how unusual these sleep patterns are is to compare them to the average person.

I'm also a little dubious as to the reliability of this data. How do we know their sleep schedules? And how variable were their schedules?

If I rate this via the Trifecta Checkup, I'd classify this as Type DV.



Relevance, to you or me: a response to Cairo

Alberto Cairo discussed a graphic by the New York Times on the slowing growth of Medicare spending (link).

The chart on the top is the published one, depicting the quite dramatic flattening of the growth in average spending over the last few years--average being the total spend divided by the number of Medicare recipients. The other point of the story is that the decline is unexpected, in the literal sense that the Congressional Budget Office planners did not project its magnitude. (The planners did take the projections down over time so they did project the direction correctly.)

Meanwhile, Cairo asked for a chart of total spend, and Kevin Quealy obliged with the chart shown at the bottom. It shows almost straight line growth.

Cairo's point is that the average does not give the full picture, and we should aim to "show all the relevant data".


I want to follow that line of thinking further.

My first reaction is that Cairo did not say "show all the data"; he said "show the relevant data". That is a crucial difference. For complex social problems like Medicare, and in general, for "Big Data", it is not wise to show all the data. Pick out the data of interest, and focus on those.

A second reaction. How can "relevance" be defined? Doesn't it depend on what the question is? Doesn't it depend on the interests and persuasion of the chart designer (or reader)? One of the key messages I wish to impart in my book Numbersense (link) is that reasonable people using uncontroversial statistical methods to analyze the same dataset can come to different, even opposite, conclusions. 

Statistical analysis is concerned with figuring out what is relevant and what isn't. This is no different from Nate Silver's choice of signal versus noise. Noise is not just what is bad but also what is irrelevant.

In practice, you present what is relevant to your story. Someone else will do the same. The particular parts of the data that support each story may be different. The two sides have to engage each other, and debate which story has a greater chance of being close to the truth. If the "truth" can be verified in the future, the debate is more easily settled.

Unfortunately, there is no universal standard of relevance.


Going back to the NYT story. The chart on total Medicare spending is not as useful as it may seem. This is because an aggregate metric like this for a social phenomenon is influenced by a multitude of factors. Clearly, population growth is a notable factor here. When they use the word "real", I don't know if this means actualized (as opposed to projected), or "in real terms" (that is, inflation adjusted). If not the latter, the value of money would be another factor affecting our interpretation of the lines.

Without some reference levels for population and value of money, it is hard to interpret whether the straight-line growth implies higher or lower spending intensity. For the second chart, I suggest plotting the growth in the number of Medicare recipients. I believe one of the goals of the Affordable Care Act is to reduce the ranks of the uninsured so a direct depiction of this result is interesting.

The average spend can be thought of as population-adjusted. It is a more interpretable number -- but as Cairo pointed out, it is also narrow in scope. This is a tradeoff inherent in all of statistics. To grow understanding, we narrow the scope; but as we focus, we lose the big picture. So, we compile a set of focal points to paint a fuller picture.
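The link between the two charts is just a division, which a short Python sketch makes concrete. All numbers are made up (total spend in billions, recipients in millions) and are not the Medicare figures. `average_spend` is a hypothetical helper.

```python
def average_spend(total_spend, recipients):
    """Total spend divided by the number of recipients."""
    return total_spend / recipients


# Made-up series: the total climbs in a straight line, and so does enrollment.
years = [2010, 2011, 2012, 2013]
total = [500, 520, 540, 560]      # billions of dollars
recipients = [50, 52, 54, 56]     # millions of people

averages = [average_spend(t, r) for t, r in zip(total, recipients)]
print(dict(zip(years, averages)))
```

In this toy case, the total grows every year while the average stays flat at 10 (thousand dollars per recipient). The same straight-line total could just as well hide a rising or falling average, which is why the recipient count is needed as a reference.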



A small step for interactivity

Alberto links to a nice ProPublica chart on average annual spend per dialysis patient on ambulances by state. (link to chart and article)


It's a nice small-multiples setup with two tabs, one showing the states in order of descending spend and the other, alphabetical.

In the article itself, they excerpt the top of the chart containing the states that have suspiciously high per-patient spend.

Several types of comparisons are facilitated: comparison over time within each state, comparison of each state against the national average, comparison of trend across states, and comparison of state to state given the year.

The first comparison is simple as it happens inside each chart component.

The second type of comparison is enabled by the orange line being replicated on every component. (I'd have removed the columns from the first component as they are both redundant and potentially confusing, although I suspect that the designer may need them for technical reasons.)

The third type of comparison is also relatively easy. Just look at the shape of the columns from one component to the next.

The fourth type of comparison is where the challenge lies for any small-multiples construction. This is also a secret of this chart. If you mouse over any year on any component, every component now highlights that particular year's data so that one can easily make state by state comparisons. Like this for 2008:


You see that every chart now shows 2008 on the horizontal axis and the data label is the amount for 2008. The respective columns are given a different color. Of course, if this is the most important comparison, then the dimensions should be switched around so that this particular set of comparisons occurs within a chart component--but obviously, this is a minor comparison so it gets minor billing.


I love to see this type of thoughtfulness! This is an example of using interactivity in a smart way, to enhance the user experience.

The Boston subway charts I featured before also introduce interactivity in a smart way. Make sure you read that post.

Also, I have a few comments about the data analysis on the sister blog.

Light entertainment: famous people, sleep, publication bias

Bernard L. tipped us off about this "infographic":


The chart is missing a title. The arcs present "sleep schedules" for the named people. The "data" comes from a book. I wonder about the accuracy of such data.

Also note the inherent "publication bias". People who do not follow a rigid schedule will not be able to describe a sleep schedule, thus taking themselves out of the chart.