So said a reader, Stephen B., of the following graphic (note: pdf) in the London Times concerning Andy Murray's recent tennis triumphs.
How can we disagree? Shocking? Yes. Failure? Definitely. Failing to communicate? No doubt.
Let's start with the five tennis balls at the bottom. This part of the graphic fails the self-sufficiency test. It makes no difference whether the balls (bubbles) are the same size or different sizes. Readers will look at the data and ignore the bubbles.
Amazingly, the caption said that "Murray has one of the best returns of serve in the game." And yet, the graphic showed the five players who were better than Murray, and nobody worse! For those unfamiliar with tennis statistics, it does not provide any helpful statistics like averages, medians, etc. to help us understand the data.
(The color scheme from light to dark: first, second, third, fourth round of tournament)
So we're told: the 75% of first-serve points won in the fourth round was 25.6% of the sum of the percentages of first-serve points won from first to fourth rounds (75%+70%+71%+76%). What does this mean? Why should we care?
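To see how little this statistic says, the arithmetic can be reproduced in a few lines. This is a minimal sketch using only the four percentages quoted above; the variable names are mine, and the comparison with a plain average is my addition, not something shown on the chart.

```python
# First-serve points won (%), the four values quoted in the text.
first_serve_won = [75, 70, 71, 76]

# The chart's denominator: a sum of percentages, which has no natural meaning.
total = sum(first_serve_won)

# Share of that sum attributed to the 75% round, as the chart computes it.
share = 75 / total * 100

# A plain average across rounds is easier to interpret (my addition).
average = total / len(first_serve_won)

print(f"sum = {total}, share = {share:.2f}%, average = {average:.1f}%")
```

The share works out to about 25.68%, so the 25.6% printed on the chart looks like truncation rather than rounding, though that is a guess. Either way, with four rounds every share will hover around 25%, which is why the number tells us so little.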
The challenge with these two statistics is that they are correlated and have to be interpreted together. If the first serve lands in, the point is played on it and there is no second serve, etc. Here's one attempt at it, using statistics from the Soderling-Federer match. It's clear that Federer was better on both serves.
Reference: "Murray's march to the last eight", London Times.
IA said "The general idea is that the history of subway ridership tells a story
about the history of a neighborhood that is much richer than the
Okay but what about these sparklines would clarify that history? From what I can tell, this is a case of making the chart and then making sense of it.
The chart designer did make a memorable comment in his blog entry: "Hammer in hand, I of course saw this spreadsheet as a bucket of nails." The hammer is a piece of software he created; the nails, the data of trips taken.
Nathan at FlowingData gave a reluctant passing grade to this Wall Street Journal bubbles chart illustrating the recent U.S. bank "stress" test.
One should fight grade inflation with an iron fist. (Hat tip to Dean Malkiel at Princeton.) A simple profile chart would work nicely since the focus is primarily on ranks. The bubbles, as usual, add nothing to the chart, especially where one can create any kind of dramatic effect by scaling them differently.
Nathan also pointed to the maps of the seven sins, which garnered some national attention. This set of maps is a great illustration of the weakness of maps to study spatial distribution of anything that is highly correlated with population distribution. Do cows have envy too? See related discussion at the Gelman blog.
An amazing amount of data is being visualized here. Mousing over the map will pick up the specific data for each county. There is a bar up top for exploring the evolution over time. It would be great if there were an animation button so the map could be played out without clicking. An animated gif would also do (similar to the disease map we featured some time ago).
The colors on the first map represent the origin of the top ethnic group in each county. Within each group, the tint of the color further displays the percentage of the population that group accounts for. The subgroups appear to be 0-2%, 2-5%, over 5%. The last subgroup is very wide.
Not so keen on the second map with all those bubbles. They show the number of people from each country by county. The bubble size is proportional to population. Every version of this map looks the same because the population is concentrated in the cities and the interior is sparsely populated, no matter what ethnic group.
Regardless, this is another laudable effort by the crew at the Times.
Reference: "Immigration Explorer", New York Times, March 10 2009.
The original graph threw us off our sense of scale. It seemed to be saying all these oil companies are roughly the same size but one grew much faster than the others. The red color and the setting off of the data above the title of the chart seemed to announce some important find.
The junkart version on the right reversed everything to our normal sense of scale. It is a version of the bumps chart, one of my favorites.
So we find that Total is the smallest of these oil companies, about half the size of ExxonMobil -- you wouldn't know that from those abysmal bubbles! Adding to the problem is that the growth data is used to sort the companies while the actual production data is hidden in the data labels.
Total is indeed growing faster but BP is not far behind. The fall of ExxonMobil and Royal Dutch Shell is equally intriguing.
Matt H., who authored the previous post, and Charles G. both pointed to a great example of how people like readers here can make a difference. A bad chart got made over!
The Financial Times
published a chart from a JP Morgan report, using ... you guessed it ...
bubbles to illustrate the deep plunge in market capitalization of many
Some readers at the blog were none too happy with the choice of bubble charts. Among other things, the designer made the common mistake of plotting proportional diameters rather than proportional areas. This is clear from looking at JP Morgan's bubble.
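The diameter-versus-area mistake is easy to demonstrate. Below is a minimal sketch with made-up values (100 and 25 are purely illustrative, not the banks' market caps), and the function names are my own. It compares the area a reader actually sees under each scaling rule.

```python
import math

def radius_by_area(value, max_value, max_radius):
    """Radius chosen so that the circle's AREA is proportional to value."""
    return max_radius * math.sqrt(value / max_value)

def radius_by_diameter(value, max_value, max_radius):
    """Radius (hence diameter) proportional to value: the common mistake."""
    return max_radius * value / max_value

def circle_area(radius):
    return math.pi * radius ** 2

# Hypothetical data: the small value is one quarter of the big one.
big, small = 100.0, 25.0
for scale in (radius_by_area, radius_by_diameter):
    ratio = circle_area(scale(small, big, 1.0)) / circle_area(scale(big, big, 1.0))
    print(f"{scale.__name__}: small bubble has {ratio:.4f} of the big bubble's area")
```

With area scaling, the small bubble covers a quarter of the big one, matching the data. With diameter scaling it covers only one sixteenth, so the plunge looks far more dramatic than it was.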
This chart exposes the weakness of bubble charts well. Look at the top row of bubbles. Most of them look so similar it is impossible to know, without spending much time studying the circles, which bank was hurt more.
Eventually, a bar chart was produced. Felix Salmon linked to it. I am not sure how the banks were ordered in this chart. It isn't one of the two obvious dimensions, nor is it in alphabetical order.
In fact, neither the bubble nor the bar chart works well for this case. What we need is the Case-Shiller style asset-bubble life cycle chart. In order to interpret these changes in market caps, we need to know how big the bubble was, and then how steep the subsequent decline was. Take a look at our discussion of real estate bubble charts here.
Reference: "Bank capitalization chart of the day", Felix Salmon at Portfolio.com and "Bank picture du jour", FT Alphaville, Jan 21 2008.
Message to readers: I have a large backlog of reader suggestions. Please be patient as I slowly get through them. The frequency of posts will remain lower for the time being as I am busy finalizing a draft of a book. More on that in the near future.
Matt H, a reader, sent in the following entry (with minor edits).
I saw a couple of bad charts on money.cnn.com and thought I'd submit them to you.
They're both part of the same
feature on investment bargains caused by the recession.
It seems to me like both charts would have made their points more
eloquently by using a much simpler, more common form, like a bar
In Chart A, cubes are used to display the difference between
treasury bond yields and AAA municipal bond yields at the two-year
horizon and the ten-year horizon. The volume of each cube corresponds
to the yield for the given type of bond in the given period (I think),
which spreads the one dimension being compared (yields) across three
dimensions, making the differences look smaller than they really are. [...] At the two-year horizon, the two yields being compared are 1.16% for
Treasury bonds and 3.01% for AAA municipal bonds. The yield for AAA
municipal bonds in this case is more than 2.5 times larger than the
yield for Treasury bonds, but the difference doesn't look nearly that
big in the chart provided. [...]
Time out. Let me point out the chart's inadvertent reference to an optical illusion concerning foreground and background! The "outline only" cube on the left should have approximately the same volume as the "solid red" cube on the right (3.01% versus 3.30%), and yet the red cube appeared quite a bit larger because our eyes reacted to the solid color more than to the thin outlines.
In Chart B, [...] Again, the
metric in question is bond yields: ten-year Treasury bond yields
compared to investment-grade corporate bond yields. The 2008 figure
for each is shown alongside the five-year average. This chart uses the
area of a circle to express these yields, spreading the one-dimensional
value across two dimensions. As in Chart A, the result is a chart in
which the difference between values does not appear as large as it
I will also send
a simple bar chart version of each chart -- the bar charts should illustrate the differences in yields more effectively than the charts actually used in this article.
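Matt's point about spreading one dimension across two or three can be put in numbers. The sketch below uses the two-year yields he quotes for Chart A (1.16% and 3.01%) and assumes, as a simplification, that a reader mostly judges size by linear extent (bar length, circle diameter, cube edge). The function name is my own.

```python
def apparent_ratio(small, large, dimensions):
    """Ratio of linear sizes when the value is mapped to
    length (1), area (2) or volume (3)."""
    return (large / small) ** (1.0 / dimensions)

# Two-year yields quoted in the letter: Treasury vs. AAA municipal bonds.
treasury, muni = 1.16, 3.01

for name, dims in [("bar length", 1), ("circle diameter", 2), ("cube edge", 3)]:
    print(f"{name}: {apparent_ratio(treasury, muni, dims):.2f}x")
```

The bar is about 2.6 times as long, the circle only about 1.6 times as wide, and the cube's edge only about 1.4 times as long. The same data looks progressively flatter as dimensions are added, which is why the bar chart makes the gap plain.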
These are his revised charts:
We can do even better by converting the chart on the right to a time-series line chart. Instead of the five-year average, it is better to display the gap between Treasury and corporate bonds for each of the five years plus 2008. This should make for a more eye-catching graphic.
Reference: "Investment in the bargain bin", CNN Money.
Right on the heels of the disastrous bubble chart comes another, courtesy of the NYT Magazine. Bubble charts are okay for the conceptual ("this is really big, and that is really tiny"). This chart wants readers to compare the sizes of the bubbles, which highlights the worst part of such graphs.
Poor scaling is the big issue with bubble charts. They are the prototype of charts that are not what I call "self-sufficient". Without printing all the data, the chart is unscaled, and thus useless (see below middle). When all the data is printed (as in the original, below left), it is no better than a data table.
In the above right chart, we simulated the situation of a bar or column chart, i.e. we provide a scale. For this chart, the convenient "tick marks" are at 10, 20, 34, 41. Unfortunately, this scaled version also fails to amuse.
Note further that the data should have been presented in two sections: the party affiliation analysis and the gender analysis. Also, it is customary to place "Independents" between "Republicans" and "Democrats" because they are middle-of-the-road.
A profile chart is an attractive way to show this data. Here, we quickly learn a couple of things obscured in the bubble chart.
On the issue of abortion, Independents are much closer to Democrats than to Republicans. Also, there is barely any difference between the genders, the only difference being the strength of support among those who want to legalize.
Reference: "A matter of Choice", New York Times Magazine, Oct 19 2008.
PS. Based on RichmondTom's suggestion, here are the cumulative profile charts.