Nice example of histograms
May 30, 2012
The New York Times (link) uses two histograms to show us the geographical distribution of college graduates today compared to 1970. The histograms clearly and forcefully demonstrate two points: the almost three-fold increase in the concentration of college graduates in metropolitan areas, and the wider spread in geographical preference. In other words, we find that the shape of the distribution (in particular, the width) and the mid-point of the distribution have both shifted in those decades.
Readers must be careful about interpreting the colors, which are keyed to relative scales. Every single orange square on the right chart represents a higher percentage of college graduates than the single orange square on the left. This is because of the massive increase in the number of adults with college degrees over this period of time.
I'd suggest two small improvements. Arranging the histograms vertically makes a huge difference:
On the maps, I'd get rid of the gray dots. The point of the maps is to show which areas the graduates are flocking to and which they are not favoring. The gray dots, on the other hand, serve mainly as a geography lesson on where the metropolitan areas sit on the U.S. map.
The gray boxes do clutter up the maps, but I would still keep them. They let me see that the metro area nearest to me has declined, relative to the other metro areas. Without the boxes, information is lost because I can't see this from the histograms. Overall, a very nice graphic.
Posted by: HGP | May 31, 2012 at 01:21 PM
I also like that you dropped the number of metro areas within 5% of the mean on the two graphs in your version. With a growth of college degrees from 12% to 32%, I would expect to see the range in the percentage of people with college degrees change as well.
Posted by: LarryC | May 31, 2012 at 08:25 PM
I'm not sure I agree with what the author is trying to imply. The three-fold increase in metro areas, sure, but the wider spread of geographical preference? I'm not buying it.
For the 1970 data, the 5pt spread represents almost a 42% change from the average. In 2010, the same 5pt spread represents only a 15% change from the average. If we were to apply a 42% threshold for orange/black squares on the 2010 data, it would be +/- 13 points, making the distribution reasonably similar to 1970.
Posted by: Craig Wong | Jul 05, 2012 at 12:43 PM
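The arithmetic in the last comment can be checked with a short sketch. It assumes the averages are the 12% (1970) and 32% (2010) figures quoted in the earlier comment, and that the color key uses a band of 5 percentage points on either side of the average; the variable and function names are made up for illustration.

```python
# Assumed inputs, taken from the comments rather than from the original data:
AVG_1970 = 12.0    # average share of adults with college degrees in 1970 (%)
AVG_2010 = 32.0    # average share of adults with college degrees in 2010 (%)
BAND_POINTS = 5.0  # half-width of the color band, in percentage points


def relative_band(band_points, average):
    """Express a band measured in percentage points as a fraction of the average."""
    return band_points / average


# The same 5-point band is a much larger relative deviation in 1970 than in 2010.
rel_1970 = relative_band(BAND_POINTS, AVG_1970)
rel_2010 = relative_band(BAND_POINTS, AVG_2010)

# Band for 2010 that matches the 1970 band in relative terms.
equiv_band_2010 = rel_1970 * AVG_2010

print(f"1970: 5 points is {rel_1970:.1%} of the average")
print(f"2010: 5 points is {rel_2010:.1%} of the average")
print(f"2010 band at the 1970 relative width: +/- {equiv_band_2010:.1f} points")
```

Under these assumptions the results line up with the comment's rounded figures: roughly 42% of the average in 1970, roughly 15 to 16% in 2010, and a 2010 band of about 13 points on either side. Whether a relative or an absolute band is the fairer way to compare the two years is a judgment call that the sketch does not settle.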