A not-so-satisfying rose
Sep 02, 2015
At the conference in Bavaria, Jay Emerson asked participants to provide comments on the data visualization of the 2014 Environmental Performance Index (link). We looked at the country profiles in particular. Here is one for Singapore:
The main object of interest here is the "rose chart." To understand it, we need to know the methodology behind the index. The index is a weighted average of nine sub-indices, as shown in the table at the bottom. In many cases, the sub-index is itself an average of sub-sub-indices. These lower-level indices measure the distance between a country's performance and some target performance, typically set at the international level. Those distances are then converted to a scale from 0 to 100, so that a country scoring zero did the worst at meeting the target while a country scoring 100 did the best.
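To make the two steps concrete, here is a minimal sketch in Python. It assumes a simple linear rescaling between a "worst" value and the target, which may not be exactly what the EPI team does. The function names and the weights in the usage note are made up for illustration and are not the actual EPI weights.

```python
def proximity_to_target(value, worst, target):
    """Rescale a raw measurement to 0-100, where 100 means the target is met.

    Assumption: a linear rescaling, clipped to the 0-100 range.
    Works whether higher or lower raw values are better.
    """
    if target == worst:
        raise ValueError("target and worst must differ")
    score = 100.0 * (value - worst) / (target - worst)
    return max(0.0, min(100.0, score))


def weighted_index(scores, weights):
    """Weighted average of sub-index scores; weights need not sum to 1."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight
```

For example, with hypothetical weights of 3 and 1, `weighted_index({"a": 100, "b": 0}, {"a": 3, "b": 1})` gives 75: the heavier sub-index pulls the overall score toward its own value.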
In the rose chart, the circle is divided evenly into nine sectors, each representing a sub-index. The data are encoded in the radius of the sectors. Colors map to the sub-index, and the legend is provided in two ways: a hover-over on the Web, and the table below.
Here is the equation that connects the data (EPI) to the area of the sectors:
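Reading the radius of each sector as the sub-index score itself, and each sector as one-ninth of a full circle, the relationship can be written as follows. This is a reconstruction from the description in this post (the squaring and the one-ninth factor), with the subscript i added here to denote the sub-index:

```latex
\text{Area}_i \;=\; \frac{1}{9}\,\pi\,\text{EPI}_i^{2}
```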
There are a number of issues with this representation. First, because the EPI is squared, the area is distorted. If one country's EPI is twice another's, its sector area is four times as large. Another way to see this is to notice that as the EPI increases, the curved edge of the sector moves outwards, tracing a longer arc.
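To put a number on the distortion, here is a minimal sketch in Python. It assumes the radius of a sector equals the EPI score and that each sector covers one-ninth of the circle; the function name is made up for illustration.

```python
import math


def rose_sector_area(epi, n_sectors=9):
    """Area of one sector when the radius is the EPI score itself (assumed)."""
    return math.pi * epi ** 2 / n_sectors


# Two hypothetical scores, one twice the other.
low, high = 40.0, 80.0
area_ratio = rose_sector_area(high) / rose_sector_area(low)
print(f"score ratio: {high / low:.1f}, area ratio: {area_ratio:.1f}")
```

A reader who judges the sectors by area would see a gap twice as large as the one in the data.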
Another issue is the one-ninth factor, which implies that each of those nine sub-indices is equally important. The diagram below shows that interpretation to be incorrect. (The nine sub-indices are shown in the second layer from the outside in.)
A third issue is illustrated in the Singapore rose. Notice from the table below that Singapore scored zero on Fisheries. But in the rose, Fisheries has a non-zero area. Think of this practice as coring an apple. The middle circle of radius k should be ignored. If the sector that has the color of Fisheries has zero area, then the entire red circle shown below should have zero area.
With these three adjustments, the encoding formula becomes rather more complicated:
where x depends on the weight of the sub-index, and k is the radius of the sector that represents value zero.
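One encoding that satisfies all three adjustments can be sketched in Python as below. This is not necessarily the formula intended above. It assumes the angle of a sector is proportional to the sub-index's share of the total weight (the role of x), that the core of radius k is cut out, and that the remaining area is proportional to the score. The default radii of 20 and 100 are arbitrary choices for illustration.

```python
import math


def sector_geometry(epi, weight_share, inner_radius=20.0, outer_radius=100.0):
    """Return (angle in radians, radius) for one sector.

    weight_share: the sub-index's fraction of the total weight (shares sum to 1).
    inner_radius: the radius k that represents a score of zero.
    The radius is chosen so the area outside the core grows linearly with epi.
    """
    if not 0.0 <= epi <= 100.0:
        raise ValueError("epi must be between 0 and 100")
    angle = 2.0 * math.pi * weight_share
    # Area outside the core is (angle / 2) * (r^2 - k^2); make r^2 - k^2 linear in epi.
    scale = (outer_radius ** 2 - inner_radius ** 2) / 100.0
    radius = math.sqrt(inner_radius ** 2 + scale * epi)
    return angle, radius


def visible_area(angle, radius, inner_radius=20.0):
    """Area of the sector outside the cored-out middle circle."""
    return 0.5 * angle * (radius ** 2 - inner_radius ** 2)
```

Under these assumptions, a score of zero lands exactly on the inner circle and contributes no visible area, and doubling the score doubles the visible area rather than quadrupling it.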
***
The rose/radar/spider type charts are more useful when placed side by side to compare countries. But even then, this chart form doesn't work well for this dataset. This is because the spacing of countries within each sub-index is not uniform.
The site has a visualization of the distribution of sub-index scores by issue:
We can see that in the case of water resources, most countries are not doing very well at all. In terms of air quality, most countries, except for those in the right tail, have performed quite well. It is hard to interpret the indices unless one has an idea of the full distribution.
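One way to bring the distribution into the reading of a score is to convert it to a percentile rank within its sub-index. The sketch below is in Python; the function name is made up, and it uses the common "mid-rank" convention for ties, which is a choice made here and not something taken from the EPI site.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, all_scores):
    """Percent of countries scoring below `score`, counting ties as half."""
    ordered = sorted(all_scores)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)
```

The same raw score can then be read differently by issue: a 50 sits high in the ranking when most countries score poorly, and low when most countries score well.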
***
Finally, one thing the EPI team did makes me happy. They have created PDFs and images of their data visualizations, so it is quite easy to save and keep some of this work. All too often, browser-based technologies create visualizations that can't be saved.
I think it's even weirder than you described. Singapore appears to have missing data for forestry and so the rose chart is eight parts instead of nine. Other countries I tried did indeed have nine. As if it wasn't hard enough to compare countries using this graphic in the first place... :)
Do you have thoughts on color selection? Visually I find similar shades of color, while clearly different when adjacent to each other, are much more difficult for me to match up from the legend to the graphic. It took me a lot more time than I think it should have to figure out that forestry was missing. Is that just me?
Posted by: Adam Schwartz | Sep 04, 2015 at 12:49 PM