Rising bankruptcies and home prices

Steve has kindly plotted house price movements on a map so we can compare them with the bankruptcy growth map.  Recall the observation that "where home prices rise steeply, bankruptcies fall".  Notice that Steve reversed the color scheme so that blue indicates low bankruptcy growth and high home price growth.  This helps us visually compare the two maps (nice touch!).


  • The assertion that low bankruptcy growth is associated with high home price growth makes sense only in California, the Eastern Seaboard and Florida
  • In middle America, even though home prices did not rise by much (and we don't know how much "much" is without the legend), there are many pockets of counties with high bankruptcy growth
  • Whether those pockets all constitute the "small-sample-size" regions marked out by the proviso on the original map is unclear

This article raises the issue of association versus causation.  One might be tempted to conclude that by creating conditions for a rising real estate market, a county government can hope to control the growth in bankruptcies.  To do so is to confuse association with causation.

We have already noticed that both house price growth and bankruptcy growth are related to geography, exposing the familiar coastal/middle or East/West/Central distinctions.  Because so many metrics are correlated with such geographical segmentation, it is very difficult to argue that home price growth is the cause and bankruptcy growth is the effect.  This is particularly so because we don't have a controlled experiment, only an observational study.  Mahalanobis has written about "latent variables" before; the variables you leave out of a study may well be more important than the ones you include.  Elsewhere, David Freedman has written much on causality from a statistician's perspective.
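
To make the latent-variable point concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the two regions, their mean growth rates, and the noise level are assumptions, not values from the maps. Within each region, home price growth and bankruptcy growth are drawn independently, yet pooling the regions produces a strong negative correlation driven entirely by geography.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n_per_region=2000, seed=0):
    """Draw hypothetical county data; region is the only driver of both series."""
    rng = random.Random(seed)
    # Assumed regional means: (home price growth, bankruptcy growth).
    means = {"coastal": (10.0, 2.0), "middle": (2.0, 10.0)}
    data = {}
    for region, (price_mu, bankr_mu) in means.items():
        # Within a region the two series are independent of each other.
        prices = [rng.gauss(price_mu, 2.0) for _ in range(n_per_region)]
        bankr = [rng.gauss(bankr_mu, 2.0) for _ in range(n_per_region)]
        data[region] = (prices, bankr)
    return data

data = simulate()
pooled_x = [x for prices, _ in data.values() for x in prices]
pooled_y = [y for _, bankr in data.values() for y in bankr]
pooled_r = pearson(pooled_x, pooled_y)
within_r = {region: pearson(p, b) for region, (p, b) in data.items()}

print("pooled correlation:", round(pooled_r, 2))
for region, r in within_r.items():
    print(region, "within-region correlation:", round(r, 2))
```

In this toy setup, changing home price growth inside a region would do nothing to bankruptcies, even though the pooled data look as if it should.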

To answer Steve's question: I would like to see population density added to the plot because, as depicted, the areas of color are proportional to the map areas (which, because of the projection, are not even proportional to real physical areas), whereas the better index would be population density rather than map area.  I was thinking along the lines of a cartogram but I don't know how to create one.  The challenge is always how to fit the pieces back together once they have been rescaled and are no longer map-sized.
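
This is not a cartogram, but a small numeric sketch shows why the weighting matters. The four counties below, with their areas, populations, and flags, are invented for illustration only. The same set of "high bankruptcy growth" counties can dominate the map by area while covering a tiny share of the population.

```python
# Hypothetical counties: (name, land area in sq mi, population, high bankruptcy growth?)
counties = [
    ("Rural A", 4000, 20_000, True),
    ("Rural B", 3500, 15_000, True),
    ("Metro C", 500, 900_000, False),
    ("Metro D", 400, 1_200_000, False),
]

def flagged_share(rows, weight_index):
    """Share of the total weight (area or population) in flagged counties."""
    total = sum(row[weight_index] for row in rows)
    flagged = sum(row[weight_index] for row in rows if row[3])
    return flagged / total

area_share = flagged_share(counties, 1)  # what a shaded map shows the eye
pop_share = flagged_share(counties, 2)   # what a population-based view would show

print(f"share of map area flagged:   {area_share:.1%}")
print(f"share of population flagged: {pop_share:.1%}")
```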


Steve Citron-Pousty

On closer reading of this analysis I should clarify my picture. This is median housing prices in 1990, not the growth in housing prices. Give me a day or two to get the data together. I also promise to give you a legend this time.

I don't think I can do a cartogram, and I know you don't like graduated circles to denote changes in numbers. Give me some time to think about how to visualize that one with the software I have at my disposal. Do you have any idea where I can get the bankruptcy data for counties? That would help. I could probably do a 3D plot with elevation being density of people and coloring based upon bankruptcy.

Andrew Gelman

I'd prefer a scatterplot. Separate scatterplots for each region of the country, perhaps.


Scatter plot is a great idea. There is a tendency to put geographical data on maps by default but sometimes the message is not really in the geography as in this case. By using colors and symbols, it's easy enough to see regional effects anyway.

How would you deal with the issue of the "small sample size" counties though? Those data points are inherently less reliable. Grayscale proportional to sample size, perhaps?
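
One way the grayscale idea might look, as a rough sketch: map each county's sample size to a gray level so that less reliable points fade toward the background. The function name, the linear scale, and the light-gray endpoint are all assumptions made for illustration; a log scale may suit skewed county sizes better.

```python
def gray_for_sample_size(n, n_min, n_max):
    """Return a hex gray: small samples are light (faint), large samples dark."""
    if n_max <= n_min:
        raise ValueError("n_max must be greater than n_min")
    # Clamp n into [n_min, n_max], then rescale to a fraction in [0, 1].
    clamped = min(max(n, n_min), n_max)
    frac = (clamped - n_min) / (n_max - n_min)
    # frac 0 -> light gray (221), frac 1 -> black (0).
    level = round(221 * (1 - frac))
    return "#{0:02x}{0:02x}{0:02x}".format(level)

# Hypothetical sample sizes for three counties.
for n in (10, 400, 1000):
    print(n, gray_for_sample_size(n, n_min=10, n_max=1000))
```

The returned strings can be passed as point colors to most plotting tools that accept hex colors.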

Andrew Gelman

Grayscale could work; I've never experimented with this. The gov't people who make cancer maps combine the smallest counties to get to a minimal sample size, then use shading to distinguish counties (or combined counties) that statistically-significantly differ from the U.S. average. Not that these are perfect solutions--just some more ideas that are out there. The more access we all have to software to make these scatterplots and maps, the better they'll become. I just think it's too bad that it's so easy for people to make maps with shading, since it inevitably overemphasizes large-land-area-counties/states.
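
The county-pooling idea can be sketched roughly as follows. This is not the procedure the cancer-map makers actually use, which is not described here; it is a simple greedy pass that assumes the counties are already listed so that neighbors sit next to each other, and the names and sizes are invented.

```python
def pool_small_counties(counties, min_n):
    """Greedily merge consecutive counties until each group has >= min_n samples.

    counties: list of (name, sample_size), assumed ordered so that
    neighboring counties are adjacent in the list.
    Returns a list of (list_of_names, combined_sample_size).
    """
    groups = []
    current, total = [], 0
    for name, n in counties:
        current.append(name)
        total += n
        if total >= min_n:
            groups.append((current, total))
            current, total = [], 0
    if current:
        if groups:
            # Fold an undersized tail into the previous group.
            names, size = groups.pop()
            groups.append((names + current, size + total))
        else:
            # Every county together is still below min_n; keep one group.
            groups.append((current, total))
    return groups

# Hypothetical counties and sample sizes.
example = [("A", 500), ("B", 40), ("C", 70), ("D", 300), ("E", 20)]
pooled = pool_small_counties(example, min_n=100)
for names, size in pooled:
    print("+".join(names), size)
```

A real version would need an adjacency structure so that only bordering counties are combined.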


Well put! The underlying problem is skewed population density; if the density weren't so much higher in the coastal and urban areas than in the suburban/rural areas, it wouldn't be a big deal.

You also brought up nicely the point that the map is not the only way to present state/county type data. Often, chaining ourselves to geography severely limits the value of the data graphic.