A chart that stops the story-telling impetus
Apr 10, 2012
We all like to tell stories. One device that has produced a lot of stories, and provoked much imagination, is the dual-axis plot showing two time series. Is there a correlation or is there not? Unfortunately, most of these stories are false.
Looking at the following chart (link) showing the home sales and median home price in Claremont over the last six years, one gets the sense that the two variables move in tandem, kind of. Both time series appear to reach a peak in 2006 and a trough in 2011. In 2010, both series seem to be levelling off.
When the designer places two series on the same chart, he or she is implicitly saying: there is an interesting relationship between these two data sets.
But this is not always the case. Two data sets may have little to do with each other. This is especially true if each data set shows high variability over time, as is the case here.
***
Below is another view of the same data. In order to visualize any year-to-year effect or quarterly effect, I split the data along those dimensions. The year-to-year effect is quite strong although there isn't any interesting pattern. The quarterly effect is not so strong, and as the directions of the paths indicate, this effect is not consistent from year to year.
The scales on each axis are "standardized", meaning 0 is the average value, 1 is one standard deviation above the average, etc. Movements of 1 to 2 standard deviations are not unusual, and almost all values on the chart fall within 2 SD of the average.
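For readers who want to reproduce the standardization outside JMP, here is a minimal Python sketch. The numbers are made up for illustration and are not the Claremont data; the names `standardize`, `home_sales` and `median_price` are hypothetical, and the use of the sample standard deviation is an assumption, since the post does not say which version the chart used.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (value - average) / sample standard deviation."""
    avg = mean(values)
    sd = stdev(values)  # sample SD; an assumption, the chart may differ
    return [(v - avg) / sd for v in values]


# Made-up quarterly figures, for illustration only (NOT the Claremont data)
home_sales = [120, 135, 110, 90, 80, 95, 70, 60]
median_price = [510, 530, 525, 480, 450, 455, 420, 400]  # $ thousands

z_sales = standardize(home_sales)
z_price = standardize(median_price)

# Each pair is one point on the standardized chart: 0 is the average,
# +1 is one standard deviation above the average, and so on.
for s, p in zip(z_sales, z_price):
    print(f"{s:+.2f}  {p:+.2f}")
```

After this transformation both series share the same unitless scale, which is what lets them sit on the two axes of a single plot without an arbitrary choice of axis ranges.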
There just doesn't seem to be a compelling story here. This chart taxes our imagination.
PS. In case you're wondering, this chart was made using Graph Builder in JMP (except for the arrows). I also wish JMP would allow me to use 1, 2, 3, 4 (column data) as my plot objects instead of the standard dots and crosses, etc.
[4/11/2012: Thanks to Ken L. for submitting this chart. Also, Rob Simmon on Twitter points out that the house price data should be inflation-adjusted.]
To use column data as plotting symbols in JMP, try the following: create a row state column using the formula given below, where Quarters is the column of quarters.
Marker State(Hex To Number(Char To Hex(:Quarters)))
Apply these to the data table and you'll be plotting the values...
Posted by: Colin Lewis | Apr 11, 2012 at 09:39 AM
This is one of the problems with statistics. It can be blended or manipulated. The question then arises: which statistics do we believe?
Posted by: Mike | Oct 20, 2012 at 02:46 PM