As promised, we stick to bubbles. Like the street artist blowing soap bubbles at passers-by, this map -- published in the Guardian (UK) -- is a gift of bubbles.
And our reader Frederic M. is not amused. "A tremendous failure", he said.
In terms of conveying the data, a simple bar chart would do a better job in exposing the biggest polluters, as well as the relative magnitude between the biggies and the small fish.
The chart reveals more problems if one clicks on, say, Europe, and sees the following:
For starters, compare the bubbles labeled 858, 468, 586, 418 with those labeled 23, 20, 18. And look at the little ones in the periphery labeled 133, 174, 128. Baffling, isn't it?
What they did was to print the ranks for every country, except the top four in Europe for which the ranks are placed next to the country name (in small font), and the actual amounts are placed in the middle of the bubbles. The ranks, of course, are pretty useless, and they obliterate the scale of the differences between countries.
Besides, the bigger the polluter, the smaller the rank but the larger the bubble. This built-in disconnect can also be disorienting.
Every bubble chart typically contains lots of data labels, and the reason is that the bubble form lacks self-sufficiency. Without the data labels, the reader has trouble comparing the areas.
Reference: "The Carbon Atlas", Guardian, Dec 9 2008.
I finally checked the Junk Charts mailbox again, and I found an uprising against bubble charts and pie charts. It appears that despite their shortcomings amply demonstrated here and elsewhere, editors everywhere continue to believe that the public has a lovefest with these creatures.
I will start off the parade with this one from the Wall Street Journal, purportedly showing that the Bank of England has continued to inject cash into the economy, and at ever increasing rates. The headline said "Bank of England to expand bond-buy plan".
This chart has a variety of problems, in addition to the use of overlapping bubbles. As has been documented, it is almost impossible to gauge the relative sizes of circular areas, especially when they are overlapping.
If we remove all but one of the data labels, the chart is non-functional. This is what we mean by not self-sufficient: the interpretation of this chart requires, indeed demands, that all the underlying data be printed on the same chart. The only way readers can understand what is going on is by reading the data itself!
The horizontal axis (indicating time) is also nonsensical. The separation from month to month is variable. Besides, and this is the key flaw of the chart, the projected number is a three-month cumulative total being treated like a monthly figure.
Since the Bank is projected to inject 50 billion extra pounds in the next three months, that would work out to be roughly 16 billion per month. That would turn the story upside down: one would conclude that the Bank is gradually slowing the rate of injection. The following bar chart points this out with little fuss:
When bars are used, there is no need to print every single data point. The relative lengths of the bars can be estimated easily. The months are equally spaced.
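For readers who want to check the arithmetic, here is a minimal Python sketch of the conversion from a multi-month total to a monthly rate. The function name monthly_rate is my own invention for illustration, and the only input taken from the article is the 50 billion pound, three-month figure; no monthly history is assumed.

```python
def monthly_rate(total, months):
    """Average per-month amount implied by a multi-month total."""
    if months <= 0:
        raise ValueError("months must be positive")
    return total / months


# 50 billion extra pounds over the next three months, as quoted in the text.
projected_monthly = monthly_rate(50, 3)
print(f"{projected_monthly:.1f} billion pounds per month")
```

The point is only that a three-month total has to be divided by three before it can sit on the same axis as the monthly figures.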
One final point: the exchange rate cited is not very helpful. What would have been more useful for readers would be the scale of the cash injection with respect to each nation's GDP.
Reference: "Bank of England Expands Bond-Buy Plan", Wall Street Journal, Aug 7 2009.
PS. Per Andrew's comment, here is a line chart, where the growth/decline in the injection is encoded in the slope of the line segments:
So said a reader, Stephen B., of the following graphic (note: pdf) in the London Times concerning Andy Murray's recent tennis triumphs.
How can we disagree? Shocking? Yes. Failure? Definitely. Failing to communicate? No doubt.
Let's first start with the five tennis balls at the bottom. They fail the self-sufficiency test. It makes no difference whether the balls (bubbles) are the same size, or different sizes. Readers will look at the data and ignore the bubbles.
Amazingly, the caption said that "Murray has one of the best returns of serve in the game." And yet, the graphic showed the five players who were better than Murray, and nobody worse! For those unfamiliar with tennis statistics, it does not provide any helpful statistics like averages, medians, etc. to help us understand the data.
(The color scheme from light to dark: first, second, third, fourth round of tournament)
So we're told: the 75% of first-serve points won in the fourth round was 25.6% of the sum of the percentages of first-serve points won from first to fourth rounds (75%+70%+71%+76%). What does this mean? Why should we care?
The challenge with these two statistics is that they are correlated and have to be interpreted together. If a first-serve is won, then there would be no second serve, etc. Here's one attempt at it, using statistics from the Soderling-Federer match. It's clear that Federer was better on both serves.
Reference: "Murray's march to the last eight", London Times.
IA said "The general idea is that the history of subway ridership tells a story about the history of a neighborhood that is much richer than the [...]"
Okay but what about these sparklines would clarify that history? From what I can tell, this is a case of making the chart and then making sense of it.
The chart designer did make a memorable comment in his blog entry: "Hammer in hand, I of course saw this spreadsheet as a bucket of nails." The hammer is a piece of software he created; the nails, the data of trips taken.
Nathan at FlowingData gave a reluctant passing grade to this Wall Street Journal bubbles chart illustrating the recent U.S. bank "stress" test.
One should fight grade inflation with an iron fist. (Hat tip to Dean Malkiel at Princeton.) A simple profile chart would work nicely since the focus is primarily on ranks. The bubbles, as usual, add nothing to the chart, especially where one can create any kind of dramatic effect by scaling them differently.
Nathan also pointed to the maps of the seven sins, which garnered some national attention. This set of maps is a great illustration of the weakness of maps for studying the spatial distribution of anything that is highly correlated with population distribution. Do cows have envy too? See related discussion at the Gelman blog.
An amazing amount of data is being visualized here. Mousing over the map will pick up the specific data for each county. There is a bar up top for discovering the evolution over time. It would be great if there were an animation button so the map could be played out without clicking. An animated gif will also do (similar to the disease map we featured some time ago).
The colors on the first map represent the origin of the top ethnic group in each county. Within each group, the tint of the color further displays the percentage of the population that group accounts for. The subgroups appear to be 0-2%, 2-5%, over 5%. The last subgroup is very wide.
Not so keen on the second map with all those bubbles. They show the number of people from each country by county. The bubble size is proportional to population. Every version of this map looks the same because the population is concentrated in the cities and the interior is sparsely populated, no matter what ethnic group.
Regardless, this is another laudable effort by the crew at the Times.
Reference: "Immigration Explorer", New York Times, March 10 2009.
The original graph threw us off our sense of scale. It seemed to be saying all these oil companies are roughly the same size but one grew much faster than the others. The red color and the setting off of the data above the title of the chart seemed to announce some important find.
The junkart version on the right reversed everything to our normal sense of scale. It is a version of the bumps chart, one of my favorites.
So we find that Total is the smallest of these oil companies, about half the size of ExxonMobil -- you wouldn't know that from those abysmal bubbles! Adding to the problem is that the growth data is used to sort the companies while the actual production data is hidden in the data labels.
Total is indeed growing faster but BP is not far behind. The fall of ExxonMobil and Royal Dutch Shell is equally intriguing.
Matt H., who authored the previous post, and Charles G. both pointed to a great example of how people like readers here can make a difference. A bad chart got made over!
The Financial Times published a chart from a JP Morgan report, using ... you guessed it ... bubbles to illustrate the deep plunge in market capitalization of many banks.
Some readers at the blog were none too happy with the choice of bubble charts. Among other things, the designer made the common mistake of plotting proportional diameters rather than proportional areas. This is clear from looking at JP Morgan's bubble.
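To see why the diameter mistake matters, here is a small Python sketch comparing the two ways of sizing a bubble. The function names and the example values (a market cap falling to a quarter of its former level) are hypothetical, chosen only for illustration; they are not taken from the JP Morgan data.

```python
import math


def radius_area_scaled(value, ref_value, ref_radius=1.0):
    """Radius that makes the circle's AREA proportional to value."""
    return ref_radius * math.sqrt(value / ref_value)


def radius_diameter_scaled(value, ref_value, ref_radius=1.0):
    """The mistaken sizing: DIAMETER (and so radius) proportional to value."""
    return ref_radius * (value / ref_value)


def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2


# Hypothetical example: a value that falls from 100 to 25 (a quarter).
before, after = 100.0, 25.0
full_area = circle_area(1.0)  # reference bubble drawn with radius 1

correct = circle_area(radius_area_scaled(after, before)) / full_area
mistaken = circle_area(radius_diameter_scaled(after, before)) / full_area

print(f"area-scaled bubble shrinks to {correct:.4f} of the original area")
print(f"diameter-scaled bubble shrinks to {mistaken:.4f} of the original area")
```

With diameters proportional to the data, the smaller bubble ends up with one-sixteenth of the area instead of one-quarter, so the decline looks far worse than it is.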
This chart exposes the weakness of bubble charts well. Look at the top row of bubbles. Most of them look so similar it is impossible to know, without spending much time studying the circles, which bank was hurt more.
Eventually, a bar chart was produced. Felix Salmon linked to it. I am not sure how the banks were ordered in this chart. It isn't one of the two obvious dimensions, nor is it in alphabetical order.
In fact, neither the bubble nor the bar chart works well for this case. What we need is the Case-Shiller style asset-bubble life cycle chart. In order to interpret these changes in market caps, we need to know how big the bubble was, and then how steep the consequent decline was. Take a look at our discussion of real estate bubble charts here.
Reference: "Bank capitalization chart of the day", Felix Salmon at Portfolio.com and "Bank picture du jour", FT Alphaville, Jan 21 2008.
Message to readers: I have a large backlog of reader suggestions. Please be patient as I slowly get through them. The frequency of posts will remain lower for the time being as I am busy finalizing a draft of a book. More on that in the near future.
Matt H, a reader, sent in the following entry (with minor edits).
I saw a couple of bad charts on money.cnn.com and thought I'd submit them to you.
They're both part of the same feature on investment bargains caused by the recession. It seems to me like both charts would have made their points more eloquently by using a much simpler, more common form, like a bar chart.
In Chart A, cubes are used to display the difference between treasury bond yields and AAA municipal bond yields at the two-year horizon and the ten-year horizon. The volume of each cube corresponds to the yield for the given type of bond in the given period (I think), which spreads the one dimension being compared (yields) across three dimensions, making the differences look smaller than they really are. [...] At the two-year horizon, the two yields being compared are 1.16% for Treasury bonds and 3.01% for AAA municipal bonds. The yield for AAA municipal bonds in this case is more than 2.5 times larger than the yield for Treasury bonds, but the difference doesn't look nearly that big in the chart provided. [...]
Time out. Let me add that there is an inadvertent reference to an optical illusion concerning foreground and background! The "outline only" cube on the left should have approximately the same volume as the "solid red" cube on the right (3.01% versus 3.30%) and yet the red cube appeared quite a bit larger because our eyes reacted to the solid color more than to thin outlines.
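Matt's point about spreading one dimension across three can be put in numbers. The Python sketch below assumes, as he guessed, that each cube's volume is proportional to the yield; the function name linear_ratio is mine, and the two yields are the two-year figures quoted above.

```python
def linear_ratio(value_ratio, dimensions):
    """Ratio of edge lengths (or diameters) when a ratio of values is
    encoded as length (1), area (2) or volume (3)."""
    if dimensions < 1:
        raise ValueError("dimensions must be at least 1")
    return value_ratio ** (1.0 / dimensions)


# Two-year yields quoted in the text: 1.16% Treasury, 3.01% AAA municipal.
value_ratio = 3.01 / 1.16

for dims, shape in [(1, "bar length"), (2, "circle diameter"), (3, "cube edge")]:
    print(f"{shape}: {linear_ratio(value_ratio, dims):.2f}x")
```

Under that assumption, the municipal cube's edge is only about 1.37 times as long as the Treasury cube's, even though the yield is about 2.6 times as large. A bar would show the full ratio directly.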
In Chart B, [...] Again, the metric in question is bond yields: ten-year Treasury bond yields compared to investment-grade corporate bond yields. The 2008 figure for each is shown alongside the five-year average. This chart uses the area of a circle to express these yields, spreading the one-dimensional value across two dimensions. As in Chart A, the result is a chart in which the difference between values does not appear as large as it really is.
I will also send a simple bar chart version of each chart -- the bar charts should illustrate the differences in yields more effectively than the charts actually used in this article.
These are his revised charts:
We can do even better by converting the chart on the right to a time-series line chart. Instead of the five-year average, it is better to display the gap between treasury and corporate bonds for each of the five years plus 2008. This should make for a more eye-catching graphic.
Reference: "Investment in the bargain bin", CNN Money.