## A matter of timing 2

##### May 12, 2008

Our last post generated much discussion around double axes.  In this post, we take up Michael's suggestion of a scatter plot, and several suggestions to retain the original units.

Unfortunately, the scatter plot did not provide any insight in this case (see below).  It just highlighted the jerkiness in the data, so we ended up with much zig-zagging.

Retaining the original units is not advisable because those units are not comparable.  In the following caricature, we show how the axes can be scaled to tell any story we want.

Panel plots are slightly better insofar as such mischief could be spotted by the amount of white space.

Another way to make the two data series comparable is to plot the percentage change from year to year.  This is similar to indexing; the only difference is that it shows annual change rather than cumulative change.
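To make the distinction concrete, here is a minimal sketch of the two transformations. The function names and the numbers are made up for illustration; they are not the data behind the charts in this post.

```python
def pct_change(series):
    """Year-over-year percentage change (one fewer value than the input)."""
    return [100.0 * (curr - prev) / prev for prev, curr in zip(series, series[1:])]


def index_to_base(series, base=100.0):
    """Cumulative index: every value relative to the first year, scaled to `base`."""
    first = series[0]
    return [base * value / first for value in series]


# Hypothetical series in non-comparable units (illustrative values only).
volume = [100.0, 110.0, 121.0]
crashes = [50.0, 45.0, 54.0]

for name, series in [("volume", volume), ("crashes", crashes)]:
    print(name, "annual % change:", pct_change(series))
    print(name, "index (first year = 100):", index_to_base(series))
```

Both versions put the two series on a common, unit-free scale. The annual version resets its reference point every year, while the index keeps the first year as the reference throughout.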


The problem with periodic percentage change is that the overall (cumulative) trends in the data are washed out by the periodic fluctuations.

I had to interpolate data values from your plots, so my data might be off.

Connecting the data points highlights the zig-zagging. A standard scatter plot and the correlation of -0.62 seem to indicate a fairly strong negative correlation between volume and crashes for this type of data.

I agree with Michael - why join the dots in the scatter plot?

Seems to me the scatter plot is both the correct and the most visually effective chart in this case.

I agree with Michael's comments on the previous post. Why not plot rates (crashes/volume)? Volume is an imperfect but reasonable measure of exposure to the risk of a bicycle crash/fatality.

Yes, I like the idea to plot volume vs. crashes/volume. Here's the scatter plot, r = -0.84.
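For anyone who wants to repeat this with their own readings of the charts, here is a rough sketch of the calculation. The helper names are hypothetical and the numbers are placeholders, not the interpolated values used for the r reported above.

```python
import math


def crash_rates(crashes, volume):
    """Crashes per unit of volume, year by year."""
    return [c / v for c, v in zip(crashes, volume)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Placeholder values only; substitute your own data.
volume = [100.0, 120.0, 150.0, 140.0, 180.0]
crashes = [50.0, 48.0, 51.0, 45.0, 47.0]

rates = crash_rates(crashes, volume)
print("r(volume, crashes)      =", round(pearson_r(volume, crashes), 2))
print("r(volume, crashes/vol.) =", round(pearson_r(volume, rates), 2))
```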

It's a matter of what question you're trying to answer. If we are looking for a functional relationship between accidents and volume, then yes, a scatter plot without lines works better. In this case, and in most social science situations, the time dimension is important. The line serves to expose any trends; in this case, it's hard to tell. Plus, the time series is too short.

There's a reason to join the time points in the scatterplot. It does provide insight.

Note that the points primarily go counter-clockwise.

As Krider et al. note (Marketing Science, 2005), the counter-clockwise pattern is associated with Y causing X -- or, more conservatively, it is evidence that it's not X [bike volume] causing Y [fewer bike accidents].
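One simple way to check the direction of travel numerically is the signed area enclosed by the time-ordered points (the shoelace formula). This is only a geometric sketch, not the procedure from the Krider et al. paper. It assumes conventional axes (x increasing to the right, y increasing upward) and closes the path by joining the last point back to the first. The function name and the sample points are made up.

```python
def signed_area(points):
    """Shoelace formula over time-ordered (x, y) points.

    The path is closed by joining the last point back to the first.
    Positive result: net counter-clockwise travel; negative: clockwise.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wraps around to the first point
        total += x1 * y2 - x2 * y1
    return total / 2.0


# Illustrative path only: a unit square traversed counter-clockwise.
path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
direction = "counter-clockwise" if signed_area(path) > 0 else "clockwise"
print(signed_area(path), direction)
```

With a short, noisy series the sign can flip on a single year's value, so treat it as a rough summary of the picture rather than a test.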

---
As a frequent bicyclist myself, I will make two non-graphical comments:

(1) Overall bicycle safety is likely to be improved with additional bicyclist volume -- vehicles are more likely to be aware of bicyclists.

(2) But increased volume can lead to a bunch of newbies who THINK they know how to ride in traffic and end up running lights, passing trucks on the right, and engaging in other forms of unsafe behavior.
