Lunar eclipse

Todd B. sent me this pie chart, with a note: "Do the areas in the pie chart represent the numbers?"


The short answer is NO. 

It's also not so simple to figure out the areas of crescents. The purple area looks tiny compared to the dark green region. Looking at this chart, we get the impression that Microsoft's plan to absorb Yahoo! will not vastly expand the number of unique visitors to its properties, because so many of the two companies' current users overlap.

The following is a bar chart representation of the same data. The combined entity will have 31% more users than what Microsoft has right now. Not a bad growth rate for a mature business! The author of the original post calculated that Microsoft would in effect be paying about $1000 each to acquire these new users.

Perhaps the most important question is how one values a "unique visitor". Has anyone seen any sophisticated analysis on this topic?





Why are you calling it a pie chart? It's a Venn diagram.

Tufte famously answered his own question, "What's worse than a pie chart?", with "Multiple pie charts", but another answer might be "A Venn diagram." Where a pie chart is bad at representing quantitative values by relative area, for a Venn diagram, it's frequently impossible, even in theory. I can do the arithmetic for the overlap of two circles, provided I'm allowed to assign the two circles different radii. I have suspected it's possible to represent the intersections of three circles quantitatively, provided they're allowed to be ellipses, and four circles provided they're allowed to deviate from symmetry to become a sort of pear shape, but that's as far as it goes, and I wouldn't like to even begin to work out the math.
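For reference, the two-circle arithmetic mentioned above is usually done with the standard circle-circle intersection ("lens") formula. The symbols here are labels introduced for illustration, not taken from the comment: r and R are the two radii, d is the distance between the centres, and the formula applies when |R - r| < d < R + r.

```latex
A_{\text{overlap}}
  = r^{2}\arccos\!\left(\frac{d^{2}+r^{2}-R^{2}}{2dr}\right)
  + R^{2}\arccos\!\left(\frac{d^{2}+R^{2}-r^{2}}{2dR}\right)
  - \tfrac{1}{2}\sqrt{(-d+r+R)\,(d+r-R)\,(d-r+R)\,(d+r+R)}
```

Each radius follows from its set size (area = πr²), so the only unknown is d, which has no closed form and would normally be found numerically.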

And yet people very frequently arrive in the Excel charting newsgroup on Usenet asking where the option in Excel is for multiple overlapping circles with rigorously accurate areas, as if it's the most natural thing in the world. But a Venn diagram can't be a display of quantitative information: it's really a sort of table, like a grid, or a tree-shaped org chart. If you want the numbers, you have to write them in the cells and read them as numbers.

Jon Peltier

Derek - In which case a table is probably better, because it's simpler and the user isn't distracted by trying to over-interpret the overlapping shapes.


Your version does not show the overlap, or am I missing something?

Perhaps extend the blue bar down by 96.6 million?


derek: good point. I was following the reader's lead calling it a pie chart. That would be a stretch.

A Venn diagram is great for illustrating concepts but not data.

Bob: to make the original author's point, it isn't necessary to differentiate between the overlap and the customers who are exclusively Microsoft. But you're right, my chart does not explicitly label the overlapping area.


Derek: I somewhat disagree with the assertion that Venn diagrams cannot possibly display data in a meaningful, intelligible way. For instance, I'm interested in plotting the concurrent use of substances. Bayes' theorem gets in the way when trying to explain the findings to statistical lay people, because of conditional probabilities in the context of stochastic dependence. Example: Among all alcohol users, 30% smoke. Among smokers, 75% also drink alcohol. Clearly P(A|B) ≠ P(B|A). I'm convinced this could be nicely graphed with a Venn diagram. The size of the circles/ellipses would be the base rate in the population, and the overlap would nicely show that the cigarette circle is almost completely embedded in the alcohol circle. The reverse would not be the case: the alcohol circle would have only about a third covered by the cigarette circle (I made up the data to explain). I'm desperately looking for software that can do proportional Venn diagrams. Despite your aversions, any hints to such software? Google is not of much help. Thanks a bunch, Frederic
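A two-set proportional Venn diagram of the kind Frederic describes can be worked out with a few lines of code. What follows is a minimal sketch, not a pointer to any existing package: the function names `lens_area` and `proportional_venn` are hypothetical, and the inputs are derived from the made-up percentages in the comment (alcohol users scaled to an area of 1.0, overlap 0.30, hence smokers 0.30 / 0.75 = 0.40). It sizes each circle by area and then bisects on the distance between centres until the overlap area matches.

```python
import math


def lens_area(r1, r2, d):
    """Overlap area of two circles with radii r1, r2 and centre distance d."""
    if d >= r1 + r2:                      # circles do not touch
        return 0.0
    if d <= abs(r1 - r2):                 # smaller circle lies fully inside
        return math.pi * min(r1, r2) ** 2

    def clamp(x):
        # Guard acos against tiny floating-point overshoot.
        return max(-1.0, min(1.0, x))

    a1 = r1 * r1 * math.acos(clamp((d * d + r1 * r1 - r2 * r2) / (2 * d * r1)))
    a2 = r2 * r2 * math.acos(clamp((d * d + r2 * r2 - r1 * r1) / (2 * d * r2)))
    prod = (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2)
    return a1 + a2 - 0.5 * math.sqrt(max(0.0, prod))


def proportional_venn(area_a, area_b, overlap):
    """Return (r_a, r_b, d) so that circle areas and overlap match the inputs."""
    if not 0 <= overlap <= min(area_a, area_b):
        raise ValueError("overlap must lie between 0 and the smaller set")
    r_a = math.sqrt(area_a / math.pi)
    r_b = math.sqrt(area_b / math.pi)
    lo, hi = abs(r_a - r_b), r_a + r_b    # full containment .. just touching
    for _ in range(100):                  # overlap shrinks as d grows
        mid = (lo + hi) / 2
        if lens_area(r_a, r_b, mid) > overlap:
            lo = mid                      # too much overlap: move apart
        else:
            hi = mid
    return r_a, r_b, (lo + hi) / 2


# Illustrative inputs derived from the comment's made-up percentages.
r_alc, r_cig, dist = proportional_venn(area_a=1.0, area_b=0.40, overlap=0.30)
print(f"alcohol radius {r_alc:.3f}, cigarette radius {r_cig:.3f}, "
      f"centre distance {dist:.3f}")
```

The three returned numbers are enough to draw the two circles in any plotting tool. The same approach does not extend cleanly to three or more sets, where exact area-proportional circles generally do not exist.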


In this case a Venn Pie Chart might be as good as the bar graph.
Or, maybe even better.

