Stefan pointed us to his work for the UN GEO (United Nations Global Environment Outlook) data portal. This set of information posters highlights a vexing issue that crops up on Junk Charts from time to time: the proper balance between the information value and the entertainment value of data displays. While this blog concerns itself primarily with the former, that does not mean we are blind to the flashier side of the enterprise.
Let's take Stefan's recycling spiral chart as an example. One must admit that visually this presentation is more appealing than either a data table or a set of bar charts. The reader can obtain the primary piece of information, which is the ranking of different countries in terms of the proportion of collected waste that is recycled.
And if the reader is curious enough, the chart also provides the data on the per-capita amount of waste collected in each of these countries. (Like the table and the bar chart, this display has the problem that it is one-dimensional: the countries can be sorted by proportion of recycling, but then the waste collected data will be out of order.)
Readers who would like to understand the data better would want to know some of the following:
- Is there a relationship between amount of waste collected and amount of waste recycled?
- Are there differences in culture resulting in different recycling rates?
- Is the level of development of a country predictive of its recycling rate?
- Why are some countries recycling more of their waste, and others less?
To address these types of questions, one can start with the following scatter plot.
With the exception of South Korea, there is a general pattern of positive correlation: the more waste collected per capita, the larger the proportion of such waste recycled. Any dots that are not in the bottom left or top right quadrant are exceptions to the rule. These countries are labeled in red or blue, the former indicating that the amount of collection is above average while the rate of recycling is below average, and the latter indicating the reverse.
Because there is sampling error, dots that are close to the average dot (the center of this scatter plot) are probably just average. Roughly speaking, dots in the gray circle are close enough to the center that I would not consider them exceptional cases. That leaves Spain and Iceland in the red corner, and South Korea in the blue corner. If both data series are considered together, these three countries merit attention; if only the proportion of recycling is considered, then one would pay attention to Italy, Turkey and the Slovak Republic on the lower end and South Korea on the high end.
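For readers who want to reproduce the red/blue/gray labeling, here is a rough sketch in Python. It assumes both values are already in standardized units, that the horizontal axis is waste collected and the vertical axis is proportion recycled, and that blue means the reverse of red (below-average collection, above-average recycling). The function name `classify` and the `radius` of 0.5 are hypothetical; the size of the gray circle in the chart was drawn by eye and is not specified.

```python
import math

def classify(waste_z, recycled_z, radius=0.5):
    """Label one country by where its dot falls on the scatter plot.

    waste_z    -- standardized per-capita waste collected (assumed x-axis)
    recycled_z -- standardized proportion of waste recycled (assumed y-axis)
    radius     -- hypothetical size of the gray circle, in standardized units
    """
    # Dots close to the center are treated as "just average".
    if math.hypot(waste_z, recycled_z) <= radius:
        return "near average"
    # Above-average collection, below-average recycling.
    if waste_z > 0 and recycled_z < 0:
        return "red"
    # Below-average collection, above-average recycling (assumed meaning of blue).
    if waste_z < 0 and recycled_z > 0:
        return "blue"
    # Bottom left or top right quadrant (or exactly on an axis).
    return "follows pattern"
```

Changing `radius` changes which countries count as exceptions, so the labels should be read as a judgment call rather than a statistical test.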
Scatter plots are very versatile. The following one explores the issue of development level. Surprisingly, the level of recycling seems to have little to do with development; the countries are quite widely scattered.
Technical note: The data on both axes are expressed in "standardized" units. So the zeroes represent the average per-capita waste collected and the average proportion of waste recycled (computed only over the countries depicted in the original chart). +1 indicates an amount that is one standard deviation above the average. Think of "standardized units" as measuring how extreme a particular country is with respect to the average.
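The standardization described in this note can be sketched in a few lines of Python. This is an illustration only: the function name `standardize` is hypothetical, the sample numbers are made up and are not the UN GEO data, and the choice of the population standard deviation (`pstdev`) rather than the sample version is an assumption, since the note does not say which was used.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert raw values to standardized units (z-scores)."""
    mu = mean(values)
    # Assumption: population standard deviation over the countries shown.
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; cannot standardize")
    # 0 means exactly average; +1 means one standard deviation above average.
    return [(v - mu) / sigma for v in values]

# Made-up numbers for illustration only, not taken from the chart.
sample = [2.0, 4.0, 6.0]
z = standardize(sample)
```

Applying the same function separately to each data series puts both axes on a common scale, which is what makes the center of the scatter plot the "average dot".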