
Convention and function

Over at the Social Science Statistics blog, Deidre showcased a set of maps that show the evolution of income inequality (as measured by the Gini coefficient) by state in the United States over the last four decades.  This presentation reminds us of the CDC obesity maps, the only difference being how the CDC packaged the maps in a nice little animated jpeg movie. We loved the chart then.

[Figure: Sss_gini, maps of the Gini coefficient by state]

The most important goal of a chart designer is to match form and function: what is the best way to present the data so that it conveys the message the designer has gleaned from it? Over the years, this profession has developed certain conventions, and a designer who follows them can produce graphics of adequate value. Examples include preferring bar charts to pie charts, starting bar charts at zero, and so on.

On the other hand, conventional thinking can sometimes hinder us. When faced with geographical data, the first thing to come to mind is the geographical graph paper (i.e. the map); but the map is a highly inflexible canvas, and we should always consider non-geographical presentations as well.

In the income inequality example, Deidre's maps are great in displaying the following:

  • An overall increase in Gini levels across the entire country (the yellow to orange transition)
  • The higher inequality in the South throughout this period (by 2007, these states are still darker red than the rest of the country)

But readers will have other questions that cannot be easily answered by these maps:

  • What's the profile of growth in inequality? (the color scale does not convey this information well as compared to, say, a line chart)
  • Are there groupings of states (outside of proximity/regions) that have experienced similar profiles of growth over this period?
  • Which states fall into specific subsets, e.g. those that started with the least inequality and ended with the most, or those that experienced the smallest amount of change over this period?

The following panel chart answers the above questions much better, but at the expense of the geographical graph paper.

[Figure: Redo_gini2, panel chart of Gini over time by state]

A few words are in order to explain this chart. Each panel is a line chart of the growth in inequality (Gini) over time for a specific state. States that have a similar profile are grouped together. On this display, the slopes of the lines tell us quickly which states have experienced the greatest growth in inequality (California, New York, Connecticut, DC), and which have experienced the slowest (Alaska, Dakotas, etc.).

I used a k-means clustering algorithm to create six groups of states -- DC, which has the highest inequality by far, is a cluster by itself.  Within each cluster of states, the panels are arranged in alphabetical order.  I am not particularly happy with the cluster analysis result - I tried different algorithms but did not find any better patterns in the time I spent with this dataset.
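For readers who want to experiment with the same idea, here is a minimal sketch of k-means applied to short time series. It is not the code behind the chart: the series names and Gini values are made up for illustration, and the farthest-first seeding is an assumption chosen only to make the toy run repeatable (the chart's software and settings are not stated here).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(series, k, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on a dict of name -> list of values.

    Seeding is deterministic farthest-first (an assumption of this sketch):
    start from the alphabetically first series, then repeatedly add the
    series farthest from all centroids chosen so far.
    """
    names = sorted(series)
    centroids = [list(series[names[0]])]
    while len(centroids) < k:
        far = max(names, key=lambda n: min(sq_dist(series[n], c) for c in centroids))
        centroids.append(list(series[far]))

    labels = {}
    for _ in range(max_iter):
        # Assignment step: each series joins its nearest centroid.
        new_labels = {
            n: min(range(k), key=lambda j: sq_dist(series[n], centroids[j]))
            for n in names
        }
        if new_labels == labels:
            break  # no series changed cluster, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the pointwise mean of its members.
        for j in range(k):
            members = [series[n] for n in names if labels[n] == j]
            if members:
                centroids[j] = [sum(col) / len(col) for col in zip(*members)]
    return labels


# Hypothetical Gini trajectories (four time points each), NOT real state data.
toy_gini = {
    "A": [0.35, 0.37, 0.39, 0.41],
    "B": [0.36, 0.38, 0.40, 0.42],
    "C": [0.40, 0.44, 0.48, 0.52],
    "D": [0.41, 0.45, 0.49, 0.53],
    "E": [0.45, 0.52, 0.58, 0.65],  # an outlier, in the spirit of DC
}

clusters = kmeans(toy_gini, k=3)
```

On this toy input, A and B end up together, as do C and D, while the outlier E sits in a cluster by itself. Because the distance is computed on the raw values, series are grouped by level as much as by slope; clustering on changes from the first year instead would emphasize the growth profile.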

This type of display is very flexible. One could group the state panels by whatever criterion one desires. If one wants to look at regional differences, for example, the states could be grouped by region.
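As a rough illustration of that flexibility, the sketch below computes a panel order from any state-to-group mapping. The function name `panel_order` is made up for this example; the region labels follow the four US Census regions, and only a handful of states are included to keep it short.

```python
from itertools import groupby


def panel_order(states, group_of):
    """Return [(group, [states...]), ...], with groups sorted alphabetically
    and states sorted alphabetically within each group."""
    ordered = sorted(states, key=lambda s: (group_of[s], s))
    return [
        (group, list(members))
        for group, members in groupby(ordered, key=lambda s: group_of[s])
    ]


# A small subset of states mapped to US Census regions.
census_region = {
    "Alaska": "West",
    "California": "West",
    "Connecticut": "Northeast",
    "New York": "Northeast",
    "North Dakota": "Midwest",
    "South Dakota": "Midwest",
}

layout = panel_order(census_region, census_region)
for region, members in layout:
    print(region, members)
```

Swapping `census_region` for a mapping from state to cluster label would give the cluster-then-alphabetical arrangement instead, without changing the function.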

Is the panel chart always superior to the maps? No. It depends. The point is that one should always check out different displays before settling on maps.

Related older posts here.


Last call

A small makeover of the blog is planned for the near future, which among other things will involve retiring some features. As a courtesy to readers, I am providing the list of features I consider dispensable - please speak up if you would like to save any of these:

  • Trackbacks - does anyone use these anymore?
  • Calendar view
  • Digg this - hot then, not so hot now?
  • Monthly archives - I'd like the tag cloud to be the main conduit to old posts; however, I know that the really old posts were not tagged properly so I may or may not remove these
  • Favorite - I like this function; I just don't like the fact that Typepad makes you register to use it
  • Reblog - seems like an interesting new idea, but perhaps premature?
  • Add to Typepad People List - are there any Typepad People out there?
  • Add to Google/Bloglines/My Yahoo/Newsgator - did you subscribe by pressing on these buttons or other means?

The following are saved:

  • Monthly archives