Embarrassment of Riches
Jul 24, 2005
The computing age has revealed our embarrassing ineptness at handling data richness. Examples abound of data-rich but information-poor charts. A sure symptom is upper-class guilt: I've got so much data ($$$) I know not what to do.
Now, ask yourself what information is conveyed by the following data-rich chart, then compare your thoughts with my comments. (The graphic accompanied an article titled "Ferocious Heat Maintains Grip Across the West"; temperatures are in degrees Fahrenheit.)
- This is an example of what Tufte calls "small multiples", a series of charts with a basic design replicated over changes on some dimension (here, week of the month). For "small multiples" to succeed, we must feel at home with the basic design. Can we make sense of the scatter of dots (cities)?
- You certainly noted that most points sit on the left (west). If that were the whole point, why show city-level data? Did you wonder how many cities exceeded 100 F? That count could have been noted on the map.
- The state boundaries led us to wonder which states were most affected. Unfortunately, they presume we know the U.S. map well, which we don't.
- Next, how should we relate the first chart to the second, and to the third? The dots appeared to shift to the northwest, and then towards the middle. But the article focused on how the heat wave affected the West.
- Perhaps the point was not physical movement but the number of cities/states affected. Then show us the counts, because the naked eye cannot judge the relative size of scatters.
- Further, why show the three weeks of July, with "week" starting on Sunday and ending on Saturday? This highlights only those changes occurring between Saturday and Sunday. In the article, a meteorologist identified July 12 as the start of the heat wave. Two charts showing before and after July 12 would have confirmed the existence/effect of the heat wave.
- As an alternative to comparing before/after July 12, we could compare July 12-21, 2005 with July 12-21, 2004.
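To make the "show us the counts" suggestion concrete, here is a minimal sketch in Python of the before/after July 12 tally. I don't have the temperature data, so the city names and readings below are invented placeholders, and the function name `cities_over_threshold` is my own; only the July 12 start date and the 100 F cutoff come from the discussion above.

```python
from datetime import date

# Hypothetical daily highs as (city, date, high in F). These values are
# invented placeholders for illustration, not real observations.
READINGS = [
    ("City A", date(2005, 7, 10), 98),
    ("City A", date(2005, 7, 14), 112),
    ("City B", date(2005, 7, 9), 101),
    ("City B", date(2005, 7, 15), 109),
    ("City C", date(2005, 7, 11), 91),
    ("City C", date(2005, 7, 16), 102),
    ("City D", date(2005, 7, 8), 85),
    ("City D", date(2005, 7, 13), 97),
]

HEAT_WAVE_START = date(2005, 7, 12)  # start date cited in the article
THRESHOLD_F = 100


def cities_over_threshold(readings, threshold=THRESHOLD_F, split=HEAT_WAVE_START):
    """Return two sets: cities exceeding `threshold` before `split`, and from `split` on."""
    before, after = set(), set()
    for city, day, high in readings:
        if high > threshold:
            # A city is counted once per period, however many hot days it had.
            (before if day < split else after).add(city)
    return before, after


before, after = cities_over_threshold(READINGS)
print(f"Cities over {THRESHOLD_F} F before July 12: {len(before)}")
print(f"Cities over {THRESHOLD_F} F from July 12 on: {len(after)}")
```

Two numbers like these, printed beside a pair of before/after maps, would let the reader compare the periods without having to judge the size of a scatter by eye.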
When dealing with rich data, be picky. Know your message before plotting. Add details only if the reader can make sense of them.
In this example, we still aren't sure what the key message was. Including city-level detail without counts, and state-level detail without annotation, frustrates rather than illuminates!
If you know where I can get temperature data, please leave a comment. I'd like to create a junkart version.
Reference: "Ferocious Heat Maintains Grip Across the West", New York Times, July 23, 2005.