Little orange circles spell trouble

Economists Banerjee and Duflo are celebrated for their work on global poverty, but they won't be winning awards for the graphics on the website that supports their book "Poor Economics" (link). Thanks to a reader for the pointer.

Here is one view ("radial") of the data:


Here is the so-called linear view of the same data:


And here is what the data really look like:



The linear view contains a host of misleading features. The length of the row of bubbles does not indicate how many total hours the individual spends doing the selected chores, not even directionally. Instead, the length is a proxy for the number of bubbles per person, which is a measure of the variety of chores - the more chores, the longer the chain of bubbles.

And then you ask where the color legend is. This information is hidden in the mouse-over effect, or in the drop-down menu that opens when you click "compare daily activities". It's a bit of wasted energy to have to press on various bubbles in order to understand which color is mapped to which chore, isn't it?

The chart also contains a brainteaser. What is the logic behind the order of the individuals?

And finally, what to make of the little orange circles dancing around the chart? (They also decorate the radial view.) Go to the page and try clicking.


Our version takes a "less is more" approach. One of the tricky features of this dataset is the profusion of little categories capturing daily activities that occupy 0.5 to 1 hour of someone's time. Instead of printing every activity, I chose to bundle all activities requiring less than 1 hour into a single "Others" category.
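For readers who want to try the same bundling on their own data, here is a minimal sketch in Python. The function name, the activity labels and the hours are all made up for illustration; they are not taken from the Poor Economics dataset. The only rule carried over from the text is that anything under 1 hour gets folded into "Others".

```python
def bundle_small_activities(hours_by_activity, threshold=1.0, other_label="Others"):
    """Fold every activity shorter than `threshold` hours into one category.

    `hours_by_activity` maps an activity name to hours per day for one person.
    The function name and defaults are hypothetical, chosen for this sketch.
    """
    bundled = {}
    other_total = 0.0
    for activity, hours in hours_by_activity.items():
        if hours < threshold:
            # Small activity: accumulate it instead of giving it its own bar.
            other_total += hours
        else:
            bundled[activity] = hours
    if other_total > 0:
        bundled[other_label] = other_total
    return bundled


# Invented numbers for one person's day, not real survey data.
example_day = {
    "Sleep": 9.0,
    "Farm work": 6.0,
    "Cooking": 2.0,
    "Fetching water": 0.5,
    "Prayer": 0.5,
    "Bathing": 0.75,
}

result = bundle_small_activities(example_day)
print(result)
```

The bundling keeps each person's total hours unchanged, so a chart built from the bundled data still adds up to the same day; only the number of bars shrinks.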

The "total" category is a reference level. One can also choose to print white boxes around every single bar and eliminate that row.





Rick Wicklin

It looks like the Totals are about 24 hours, so no info there. The purpose of that field (I suppose) is to make sure each column has the same uniform scale. But 1-2 hours out of 24 is so small that it's hard to discern differences between people and activities. How about letting the scale for each column be max(sleep) ~ 10 hours? Then more of the horizontal space can be used to show the data.

