## Blowing the whistle at bubble charts

##### Mar 12, 2013

The bubble chart is one of the most hopeless data graphics ever invented. It is sometimes useful for conceptual charts but trying to express data with it is a lost cause.

The Wall Street Journal used a bubble chart to show the trend in whistle-blower lawsuits in the U.S. The original chart looks like this:

Focus on the top part of the chart. Now apply the self-sufficiency test, as follows:

First, cover up the data labels. You'll notice that no information is conveyed by the bubbles in and of themselves.

Second, give yourself a hint. The size of the first bubble corresponds to 363 suits. What does that tell you about the second bubble? Unfortunately, the answer is still nothing.

Third, give yourself two hints. The second bubble from the left has size 311. Now try to estimate the size of the rightmost bubble given those two pieces of data. This exercise is still extremely taxing.
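To see why the hints help so little, here is a minimal sketch of the arithmetic. It assumes (the chart does not say so) that each bubble's area is proportional to the number of suits it represents; the function name `diameter_ratio` is made up for illustration.

```python
import math


def diameter_ratio(value, reference):
    """Ratio of bubble diameters, ASSUMING bubble area is proportional to the value.

    Area grows with the square of the diameter, so the diameter ratio
    is the square root of the value ratio.
    """
    return math.sqrt(value / reference)


# The two data labels quoted in the post.
first_bubble = 363
second_bubble = 311

value_ratio = second_bubble / first_bubble
diam_ratio = diameter_ratio(second_bubble, first_bubble)

print(f"value ratio:    {value_ratio:.3f}")   # 0.857
print(f"diameter ratio: {diam_ratio:.3f}")    # 0.926
```

Under that assumption, a drop of about 14 percent in the number of suits shows up as a drop of only about 7 percent in diameter, which is one reason the differences between bubbles are so hard to judge by eye.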

Thus, the conclusion about bubble charts is that they fail the self-sufficiency test. The chart cannot exist without the data labels. The graphical elements do not provide any additional value.

You can follow this conversation by subscribing to the comment feed for this post.

Certainly a terrible use of bubbles here, but I think there are cases where using circle area to communicate quantity is useful. It's nice if you have values that vary by too large a factor to express using bars. Also, it can be used to compare a subset to a whole. And on occasion you might have a gigantic number that you want to let run off the chart area, and the curve of the circle's edge allows the reader to complete the circle. I think the recent carbon bomb graphic is a decent expression of the first couple ideas:

http://www.vancouverobserver.com/blogs/climatesnapshot/2012/03/08/confused-tar-sands-climate-threat-take-look

Agreed that bubbles are a bad design choice for displaying a single list of values. But a common use of bubbles is to display positions along with other values represented as size and/or color. This is a much more appropriate use case and not completely hopeless. In this case you're trying to visually represent the trend in 3-4 variables simultaneously. Yes, bubble size is more difficult to measure than bar lengths are, but combining the values into a single plot is the point. You're not going to get the same thing by showing me 4 bar charts side-by-side and asking me to blend the information from multiple charts.

I think bubbles can be a nice sort of lagniappe on a scatter chart. The idea being that the scatter chart shows the two important dimensions, and the bubbles replace the markers to become a supplementary third dimension, not essential but nice to have.

Because it's not essential, it's okay to have it lower down the Cleveland hierarchy, and if asked why you didn't use a better channel, you can answer "because I already used up the better channels on the vital dimensions".

I agree with others; you have chosen pointless one-dimensional bubble charts. A bubble chart is very useful when you have basically X-Y data, but you want to show the relative importance of different points (e.g. size of a market).

I think the graphic overall is nice. How would you have displayed the same information?
