Andrew Gelman has posted a few times recently on graphics-related topics. Here are the links, and my reaction:
- He and I both think line charts are under-valued. Some people really, really hate using line charts when the horizontal axis consists of categorical data; as I've explained repeatedly (see posts on profile charts), by drawing lines to connect these categories, all I'm doing is exposing the eye movements we make anyway when reading the bar charts that are often the default option for such data.
- Regarding a very "ugly" chart on factors affecting military spending, Gelman wrote the following spot-on sentences:
- Just as a lot of writing is done by people without good command of the tools of the written language, so are many graphs made by people who can only clumsily handle the tools of graphics. The problem is made worse, I believe, because I don't think the creators of the graph thought hard about what their goals were.
- That last point is exactly why the Trifecta checkup starts with figuring out the key question the chart is supposed to address.
- Seems to me the above chart presents in a complicated fashion a simplistic model of military spending share: military spend = military share of GDP x GDP, therefore relative military spend increases if either relative GDP or relative military share of GDP increases while the other holds steady (or if both increase). So, in each period, all we need to know is whether the US has increased/decreased its military share of GDP relative to the rest of the world, and whether the US has increased/decreased its GDP relative to the rest of the world. End of story.
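To make the arithmetic behind that decomposition concrete, here is a minimal sketch. It assumes the simple identity stated above (spend = share of GDP x GDP) and uses log changes so the two pieces add up exactly. The function names and all the numbers are hypothetical, made up purely for illustration; they are not taken from the chart or from any real data.

```python
import math

def relative_spend(share_us, gdp_us, share_row, gdp_row):
    """US military spend divided by rest-of-world (ROW) military spend."""
    return (share_us * gdp_us) / (share_row * gdp_row)

def decompose_change(start, end):
    """Split the log change in relative spend into two additive parts:
    one from relative military share of GDP, one from relative GDP.
    `start` and `end` are dicts with keys share_us, gdp_us, share_row, gdp_row.
    """
    def rel(period, key):
        # US value relative to the rest of the world for one factor
        return period[key + "_us"] / period[key + "_row"]

    share_part = math.log(rel(end, "share") / rel(start, "share"))
    gdp_part = math.log(rel(end, "gdp") / rel(start, "gdp"))
    return share_part, gdp_part

# Hypothetical values for two periods (shares as fractions, GDP in arbitrary units)
start = {"share_us": 0.05, "gdp_us": 10.0, "share_row": 0.02, "gdp_row": 30.0}
end = {"share_us": 0.04, "gdp_us": 12.0, "share_row": 0.02, "gdp_row": 40.0}

share_part, gdp_part = decompose_change(start, end)
total = math.log(relative_spend(**end) / relative_spend(**start))

print(f"share part: {share_part:+.3f}, GDP part: {gdp_part:+.3f}, total: {total:+.3f}")
```

In this made-up example both parts are negative, so relative spend falls; the sign of each part answers the two questions for the period, and their sum is the total log change in relative spend.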
- Some work on visually displaying telephone call data. Gelman's correspondent nominated this and another chart printed in the NYT as worst of the year. Chris Volinsky disagrees and points us to a nice article. The map shown here is definitely not close to being worst of the year. The other chart, with a lot of lines, is pretty bad - and raises the question I asked the other day: what makes a "pretty" chart?
- Regarding the AT&T analysis, I have a few questions for the researchers: How representative is AT&T data, especially at the county level? Do we have to worry about nonrandom missing data? Also, how should one interpret the large swath of the Midwest that had the "background color"? Is it that there wasn't sufficient data, or that the data showed that all of those states belong together in one super-cluster? Finally, how does a shift in the "similarity" metric change the look of the map?