Hanging tough
Apr 12, 2008
Reader Nick B. sent in this example calling it "interesting". The chart tells a compelling story once we figure out what it is. Grasping the tree structure is key.
It illustrates the important idea that averaging sometimes masks variations in the data. For example, while the province of Guerrero scored 78% on literacy, the municipalities within Guerrero had scores ranging from 28% to 90%.
It also shows that the gender gap was larger in the less literate Metlatonoc municipality than in the more literate Cuautitlan.
In addition, it tells us that while Mexico on average measured very well on literacy, subpopulations within Mexico spanned the world's best and worst (from about Mali's level to Italy's).
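The masking effect described in the first point can be sketched in a few lines of Python. The municipality names, rates, and populations below are hypothetical, invented purely for illustration; they are not the figures behind the chart. The sketch assumes the aggregate is a population-weighted mean, which the chart does not state.

```python
# Hypothetical literacy rates (%) and populations for three made-up
# municipalities. These are NOT the figures behind the chart.
municipalities = [
    ("A", 30.0, 20_000),
    ("B", 80.0, 150_000),
    ("C", 90.0, 30_000),
]


def summarize(rows):
    """Return (population-weighted mean, minimum, maximum) of the rates."""
    total_pop = sum(pop for _, _, pop in rows)
    weighted_mean = sum(rate * pop for _, rate, pop in rows) / total_pop
    rates = [rate for _, rate, _ in rows]
    return weighted_mean, min(rates), max(rates)


mean, low, high = summarize(municipalities)
print(f"aggregate: {mean:.1f}%  range: {low:.0f}% to {high:.0f}%")
```

The single aggregate number sits comfortably in the 70s even though one subgroup is far below it, which is the kind of spread the hanging pieces of the chart are meant to expose.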
While I find this chart adequate, the pieces hanging off each other did not seem ideal, especially the two overlapping municipality pieces which were placed next to each other. However, it is tough to come up with an alternative. Here's one attempt; the changes are mild.
I prefer the horizontal orientation.
The branches are emphasized (as opposed to the "T" junction) because that's a key part of the story.
The national level, especially the span between Mali and Italy, is de-emphasized; I treat it as gridlines.
Instead of placing the overlapping pieces next to each other, I let the ranges literally overlap, which serves to stress this feature.
The vertical graph is more readable. We are talking about a 'level' variable.
Posted by: Oskar Shapley | Apr 13, 2008 at 01:48 PM
There is something I don't like about this representation.
This representation assumes, or at least suggests, that at the finest level there is no group with a higher literacy rate than the males from Cuautitlan, Mexico and none with a lower rate than the females from Metlatonoc, Guerrero.
What if that weren't the case? What if there were a group with a higher literacy rate than the one shown on the chart, although their municipality or province had a lower score than Cuautitlan or Mexico? Or conversely, what if there were a group with a score lower than 20 even though their municipality or province did better than Metlatonoc or Guerrero? In that case the graph wouldn't work, or at least it would be misleading. It's only a coincidence that the subsets with extreme values happen to be part of subsets with extreme values at a higher level.
Posted by: vozome | Apr 14, 2008 at 05:48 AM
I get the impression that this representation only sets out to illustrate how extreme the disparity is, not to conclusively rule on the absolute extrema. It works for me. Once I got what was going on, I actually preferred the original chart; I found that the graphical representation, with shading "fanning out" to connect the subsets, helped me to visually group the extrema more easily than the junkchart version. Looking only at the municipality column, Acapulco and Metlatonoc clearly belong in the same grouping. The junkchart version requires me to look up to the origination point of the connecting lines in the Province row to make the same association. My only change to the original would be to eliminate the horizontal lines connecting the successive subsets, since they are redundant with the shaded areas almost to the point of losing clarity.
Posted by: Mrweatherbee | Apr 14, 2008 at 05:06 PM