Every chart, even if the dataset is small, deserves care. Long-time reader zbicyclist submits the following, which illustrates this point well.
The following comments are by zbicyclist:
This is from http://win.niddk.nih.gov/statistics/ -- from the National Institute of Diabetes and Digestive and Kidney Diseases, part of the U.S. National Institutes of Health.
The pie chart is terrible in a pedestrian way – a bar chart could be so much clearer, or even a table. You have to do too much work to match up the colors, numbers and labels on the pie chart.
To the right of the pie is a bar chart, but a bar chart in which the categories are nested – extreme obesity is part of obesity, extreme obesity and obesity are part of overweight or obesity. If we want to do something like this, there should be 3 charts (e.g. space on the x axis indicating a break). The normal expectation for a bar graph is that the categories are mutually exclusive. This problem is repeated in the Race/Ethnicity graph just below these.
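The nesting described above can be undone by subtraction: each broad category minus the narrower category it contains. Here is a minimal Python sketch of that arithmetic. The helper name `to_exclusive` is hypothetical, and the percentages are placeholders chosen for illustration, not the NIDDK figures.

```python
def to_exclusive(nested):
    """Convert nested (cumulative) shares into mutually exclusive shares.

    `nested` is ordered from the broadest category to the narrowest,
    and each category is assumed to contain the one after it.
    """
    exclusive = []
    for i, pct in enumerate(nested):
        # Subtract the next, narrower category if there is one.
        inner = nested[i + 1] if i + 1 < len(nested) else 0.0
        exclusive.append(pct - inner)
    return exclusive


# Placeholder values for illustration only; NOT the NIDDK data.
nested_labels = ["overweight or obese", "obese", "extremely obese"]
nested_values = [60.0, 30.0, 5.0]

exclusive_labels = ["overweight, not obese", "obese, not extremely obese", "extremely obese"]
exclusive_values = to_exclusive(nested_values)

for label, pct in zip(exclusive_labels, exclusive_values):
    print(f"{label}: {pct:.1f}%")
```

With these placeholder inputs the exclusive shares come out as 30, 25 and 5 percent, and those categories can sit side by side on one bar chart without double counting.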
Now, some comments by me.
Another issue with the design is inconsistency. The same color scheme is used in both charts, but it denotes different concepts.
Put yourself at the moment when you have just understood the chart on the left side. You have figured out that obesity is deep green while extreme obesity is light green. Now you shift your attention to the column chart. You expect the light green columns to indicate extreme obesity, and the deep green, obesity. And yet, the light/dark green represents a male-female split.
Here is a stacked column chart showing that females are more likely than males to be either extremely obese or not overweight. In other words, the female distribution has "fatter tails".
I learned the most upsetting thing about this chart when re-making it: the listed percentages on the pie chart added up to 106 percent.
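A check like this is cheap to run before publishing any pie chart. Below is a minimal Python sketch. The function name `check_pie_total`, the one-point rounding tolerance, and the slice values are all assumptions made for illustration; the values are placeholders picked to sum to 106, not the percentages from the chart.

```python
def check_pie_total(slices, tolerance=1.0):
    """Return (total, ok), where ok means the slices sum to about 100 percent.

    `tolerance` is an assumed allowance for rounding in the printed labels.
    """
    total = sum(slices.values())
    ok = abs(total - 100.0) <= tolerance
    return total, ok


# Placeholder slices for illustration only; NOT the values from the chart.
slices = {"category A": 40.0, "category B": 35.0, "category C": 31.0}

total, ok = check_pie_total(slices)
if not ok:
    print(f"Slices sum to {total:.0f} percent, so they are not parts of one whole")
```

A total well above 100 usually means the slices overlap, which is the same nesting problem noted for the bar chart.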
This is a case of the chart telling a different story from the data. Let's look at one of the charts, piece by piece.
The first pie(ce) suggests that methane and carbon dioxide (CO2) add up to some total. That is the only way to read a pie chart. A pie chart shows components of a whole.
What is the whole? It's hard to interpret without some explanation. The title at the bottom says "Radiative Forcing change over the last 30 years" with a footnote disclosing... hold your breath... "Radiative forcings from other gases and human impact are not shown."
In other words, the visual object says that Radiative forcing from CO2 is about 5 times larger than that of Methane. A column chart would have displayed this relative scale more clearly.
But that chart is only one of a pair. Here is the whole picture:
This pair tells a particular story: Methane was a much larger share of something in the past and is predicted to become an almost irrelevant share of something in the future.
But such an interpretation would almost surely be wrong. The designer left a misleading cue here, which is to show two pies of equal size. There is just no conceivable way that the total "radiative forcing change" is identical in the last 30 years to that in the next 30 years.
The second pie chart also has a footnote. A better person can help me interpret what the following sentence means:
The radiative forcing that our current emissions have committed us to, 20 years from now, is based on a 300-year initial drawdown time scale for carbon dioxide, and 12 years for methane
I'm sure these words say something to a climate expert but this attempt stinks as a piece of public communication.
Returning to the equal-size pies for a moment. Since all other factors are removed, the chart only shows us the relative impact of Methane versus Carbon dioxide. If the data are to be believed, then the scale of the impact of Methane is expected to become much smaller relative to that of CO2 in the next 30 years. This does not imply that the absolute impact of Methane will be lower in the future than in the past.
There are three possible stories, all consistent with the above chart:
1) the absolute impact of Methane declines while the absolute impact of CO2 increases, and thus the relative impact of Methane decreases drastically
2) the absolute impacts of both decline but the impact of Methane declines a lot more
3) the absolute impacts of both increase but Methane's impact grows a lot more slowly
It is the designer's job to make the story of the data clear to readers.
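The three stories can be checked with a few lines of arithmetic. The Python sketch below uses hypothetical absolute impacts in arbitrary units (not real radiative forcing data), chosen so that CO2 starts at about 5 times Methane and every future scenario yields the same Methane share. The function name `methane_share` is likewise made up for this illustration.

```python
def methane_share(methane, co2):
    """Return Methane's share of the two-gas total, i.e. its pie slice."""
    return methane / (methane + co2)


# Hypothetical absolute impacts in arbitrary units; NOT real forcing data.
past = {"methane": 10, "co2": 50}
futures = {
    "1) Methane falls, CO2 rises": {"methane": 5, "co2": 95},
    "2) both fall, Methane falls more": {"methane": 2, "co2": 38},
    "3) both rise, Methane rises more slowly": {"methane": 12, "co2": 228},
}

print(f"past: Methane share {methane_share(**past):.0%}")
for story, values in futures.items():
    print(f"{story}: Methane share {methane_share(**values):.0%}")
```

All three futures give Methane a 5 percent slice, so a pair of equal-size pies cannot tell them apart. Only a chart of absolute values, such as a column chart, could.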
The fact that the entire blog post contains a PDF image and no words is either laziness or arrogance. The title of the piece is "the story of methane, in five pie charts". I don't know what the story of methane is. I doubt that the intention of the author was to tell us that methane is extremely unimportant relative to CO2.
PS. Steven below linked to a response from RealClimate.org. They confirm that the "story of methane" is that it is unimportant relative to CO2. Perhaps they should have called it the "non-story of methane". They see no problem with these pie charts.
Carl Bialik used to be the Numbers Guy at the Wall Street Journal - he's now with FiveThirtyEight. Apparently, he left a huge void. John Eppley sent me to this set of charts via Twitter.
This chart about Citibike is very disappointing.
Using the Trifecta checkup, I first notice that it addresses a stale question and produces a stale answer. The caption below the chart says "the peak times ... seem to be around 9 am and 6 pm." What a shock!
I sense a degree of meekness in using "seem to be". There is not much to inspire confidence in the data: rather than the full statistics which you'd think someone at Citibike has, the chart is based on "a two-day sample last autumn". The number of days is less concerning than the question of whether those two autumn days are representative of the year. Curious readers might want to know what data was collected, how it was collected, and the sample size.
Finally, the graph makes a mess of the data. While the black line appears to be data-rich, it is not. In fact, the blue dots might as well be randomly scattered and connected. As you can see from the annotations below, the scale of the chart makes no sense.
Plus, the execution is sloppy, with a missing data label.
The next chart is not much better.
The biggest howler is the choice of pie charts to illustrate three numbers that are not that different.
But I have to say the chart raises more questions than it answers. I am not an expert in pregnancy but doesn't a pregnant woman's weight include the weight of the baby she's carrying? So the more weight the woman gains, on average, the heavier is her baby. What a shock!
The last and maybe the least is this chart about basketball players in the playoffs.
It's the dreaded bubble chart. The players are arranged in a perplexing order. I wonder if there is a natural numbering system for basketball positions (center = #1, etc.), like there is in soccer. Even if there is such a natural numbering system, I still question the decision to confound that system with a complicated ranking of current-year playoff players against all-time players.
Above all, the question being asked is uninteresting, and so the chart is uninformative. A more interesting question to me is whether the best players are playing in this year's playoffs. To answer this question, the designer should be comparing only currently active players, and showing the all-time ranks of those players who are playing in the playoffs versus those who aren't.