Sweet!

Have you tried doing this in Protovis?
http://vis.stanford.edu/protovis/ex/brush.html

Nice post, some points to add:

1. With "only" 11 variables and a few dozen observations, the SPLOM still works reasonably well. For 20 variables and some hundreds of cases this plot will fail.
2. The ellipses help a lot in judging the correlations, but do we need a plot if this is essentially all we look at?
3. Linking cases across the scatterplots will take us to even higher dimensional insights than just 2-d.
4. Is the data available somewhere? I am keen on looking at the data in Mondrian.
5. If there is a geographical reference in the data, i.e., the neighborhoods, we should link the map with the data. This will be far more powerful than any analysis which ignores this aspect.
But the important point is that you actually collected real data and addressed a real problem!
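On point 2, here is a minimal sketch of the purely numeric alternative: if the ellipses are mainly used to read off correlations, the same information fits in a small table of Pearson coefficients. The variable names and values below are made up for illustration and are not the data from the post; `pearson` and `correlation_matrix` are hypothetical helper names.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping variable name -> values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Illustrative, made-up data (assumption: not taken from the original post).
data = {
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 6.0, 8.0, 10.0],
    "z": [5.0, 3.0, 4.0, 1.0, 2.0],
}

corr = correlation_matrix(data)

# Print the matrix as a plain-text table, one row per variable.
print("     " + "".join(f"{name:>7}" for name in data))
for row in data:
    print(f"{row:>5}" + "".join(f"{corr[row][col]:7.2f}" for col in data))
```

A table like this scales to more variables than the plot does, but it hides outliers, clusters and nonlinear structure, which is what the scatterplots are for.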

Check out the "ezCor" function in the "ez" package for R. It plots something similar, but with additional features such as univariate densities, correlation coefficients, etc.

For a really great-looking and versatile scatterplot matrix, check out RegressIt, a free PC Excel add-in: http://regressit.com. Each element in the matrix is a separate native Excel chart, fully labeled, intelligently scaled, and already formatted for presentation. It can be further edited with any of the usual charting tools and it can be live-linked to PowerPoint documents. The individual charts may optionally include regression lines and center-of-mass points. Axes are scaled to the minimum and maximum values of the variables, and the chart title includes the correlation and either its square or the slope coefficient. You can produce either a full square matrix or a column of plots which all have a specified variable on either the X or Y axis (e.g., the dependent variable for a regression model). An example can be found at the bottom of this page: http://regressit.com/descriptive-data-analysis.html. RegressIt also produces many other well-designed charts, e.g., parallel time series plots of many variables and 7 different types of charts for regression models.


Marketing analytics and data visualization expert. Author and Speaker. Currently at Columbia. See my full bio.
