Any time I see a chart like this where the bottom of the y-axis is not zero, I distrust it.

One needs to be able to see the difference between the final two values relative to zero. In other words, the gap might be only 0.02% on one graph and 70% on another, yet the two would look exactly the same because of the choice of y-axis range.
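To make that concrete, here is a minimal Python sketch using made-up numbers (not the data from the post). The helper names `apparent_fraction` and `relative_change` are hypothetical. The first measures how much of the plotted axis height a gap occupies, which is roughly what the eye sees; the second measures the change against zero.

```python
def apparent_fraction(a, b, y_min, y_max):
    """Fraction of the visible axis height spanned by the gap between a and b."""
    return abs(b - a) / (y_max - y_min)


def relative_change(a, b):
    """Change from a to b as a fraction of a, i.e. measured against zero."""
    return (b - a) / a


# Two hypothetical charts, each with the y-axis cropped tightly around its data.
small = (100.0, 100.02)  # a 0.02% change
large = (100.0, 170.0)   # a 70% change

for a, b in (small, large):
    y_min, y_max = a, b  # assumed axis range: exactly the span of the two values
    print(
        f"relative change {relative_change(a, b):.2%}, "
        f"fills {apparent_fraction(a, b, y_min, y_max):.0%} of the axis"
    )
```

With the axis cropped this way, both gaps fill the whole plotting area, even though the underlying changes differ by a factor of several thousand.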

Or chart crashes per 100 bike trips?

The two-axis chart has an important advantage: tangibility. Some people find it much easier to trust a chart if they can find tangible numbers such as 6000 bike crashes, as compared to an intangible number such as a bike crash index of 120.

Grant, I think in this post all the charts that need to have y-axis beginning from zero actually have it there (the original graph).

The "zero-level" of index charts here is 100, and that's centered in the middle of the graph. Value 0 has no meaning in these charts and displaying it would just be confusing.
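For readers unfamiliar with index charts, a minimal sketch of the rescaling in Python follows. The function name `to_index` and the crash counts are illustrative assumptions, not the data behind the charts in the post.

```python
def to_index(values, base_position=0, base=100.0):
    """Rescale a series so that the value at base_position becomes `base`."""
    reference = values[base_position]
    return [base * v / reference for v in values]


# Hypothetical yearly crash counts, for illustration only.
crashes = [5000, 5500, 6000]

print(to_index(crashes))  # [100.0, 110.0, 120.0]
```

The reference year sits at 100 by construction, so every other point reads as a percentage of that year. That is why 100, rather than 0, is the natural visual baseline.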

I agree with Michael that displaying a metric calculated from these two time series could be a good option, although that hides the interesting info that absolute values of both variables are changing, rather than just one of them.
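A small Python sketch of that trade-off, with invented numbers (the helper `rate_per_100` is hypothetical, and the values are chosen only to keep the arithmetic easy to follow):

```python
def rate_per_100(crashes, trips):
    """Crashes per 100 trips for each period."""
    return [100 * c / t for c, t in zip(crashes, trips)]


# Hypothetical data: trips rise every year, crashes are flat and then fall.
trips = [100_000, 125_000, 160_000]
crashes = [5_000, 5_000, 4_800]

print(rate_per_100(crashes, trips))  # [5.0, 4.0, 3.0]
```

The rate falls steadily in both steps, but the single line cannot tell you that the first drop came entirely from more trips while the second also involved fewer crashes.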

To avoid the baselining and tangibility issues, one could use a panel chart, where the two series occupy parallel panels in the chart. Each panel has its own scale, without normalizing, so the reader can see actual values, and in separate panels there's no way the lines will cross and lead to spurious conclusions.

Panel charts in Microsoft Excel
http://peltiertech.com/Excel/ChartsHowTo/PanelUnevenScales.html

Jon: or two charts?

How about a scatterplot? The points could be labeled with their year since there are only 8 data points. This makes the conclusion that accidents go down as volume goes up even more clear.

I prefer the panel chart over multiple charts, because the panel keeps the different parts of the chart together in a single object.

I rarely used charts of indexed timeseries, but tried one recently after reading this post and learned that they can be treacherous! With all the media coverage of rising fuel prices, I got hold of some data for Sydney retail petrol prices and wholesale crude oil and gasoline prices. Rather than doing the sensible thing and converting to an equivalent unit (\$ per L), I thought indexed timeseries would be a shortcut. I didn't think it through. The chart showed wholesale prices increasing much more than retail, suggesting that retail prices could increase further. Of course, since retail prices are wholesale + margin, without significant increases in the margin, the retail growth rate should only be a fraction of wholesale and the proper chart in common units showed no divergence. Next time I'll be more careful!
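The arithmetic behind that trap can be sketched in a few lines of Python. All prices here are hypothetical, in dollars per litre, and the margin (everything added on top of the wholesale price) is assumed constant.

```python
def growth(old, new):
    """Growth from old to new as a fraction of old."""
    return (new - old) / old


# Hypothetical prices in dollars per litre; not the Sydney data.
margin = 0.75                                  # assumed constant
wholesale_before, wholesale_after = 0.50, 0.75
retail_before = wholesale_before + margin      # 1.25
retail_after = wholesale_after + margin        # 1.50

print(f"wholesale growth: {growth(wholesale_before, wholesale_after):.0%}")
print(f"retail growth:    {growth(retail_before, retail_after):.0%}")
print(f"gap before: {retail_before - wholesale_before:.2f}")
print(f"gap after:  {retail_after - wholesale_after:.2f}")
```

Under these assumptions wholesale grows by 50% and retail by only 20%, so an indexed chart shows the lines pulling apart, while the gap in common units stays at 0.75 throughout.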

Any thoughts on this chart? Here is what should be a close approximation to the underlying data, although it uses spot crude oil prices and I suspect the chart in question uses futures prices.

These things take advantage of the human psychological weakness for pareidolia, or "seeing the Virgin Mary in a tortilla". Or, a related weakness, which is our story brain, that makes a narrative out of random events, or privileges poor explanations that fit a story, over better explanations that diss the story.

Ironically, a technique which may be thought of as cheating-- finding the least-squares fit between the two curves and adapting the scales to use that-- actually reproduces the "approved" way of demonstrating correlation, which is to find the least squares straight line through a scatter plot of the data. I don't know what to think about that :-)
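As a rough sketch of what that equivalence looks like in practice, the Python below fits an ordinary least-squares line of one series on the other and then uses the fitted line to choose the second axis so the two curves overlay. The function names and the data are hypothetical, and this is only one way to set the scales.

```python
def least_squares_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def matched_axis_limits(x_limits, slope, intercept):
    """Limits for the second axis so that the fitted line maps one axis onto the other.

    A negative slope yields reversed limits, i.e. an inverted second axis.
    """
    low, high = (intercept + slope * v for v in x_limits)
    return low, high


# Hypothetical series that happen to be exactly linearly related.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]

slope, intercept = least_squares_line(x, y)
print(slope, intercept)                               # 2.0 1.0
print(matched_axis_limits((0, 5), slope, intercept))  # (1.0, 11.0)
```

The slope and intercept are the same numbers one would get from the regression line through the scatter plot of y against x; the dual-axis chart simply spends them on the axis scales instead of drawing the line.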

Kaiser has discussed similar issues in "The eyeball test".

@derek: thanks for the excellent word, pareidolia, I'll have to remember that one. I must admit I am always highly suspicious of these kinds of shifted/scaled charts, although they are extremely popular in finance.

I was too lazy to figure out the defaults and let R figure out the dimensions (poorly); with Jake's suggestions, the new set of charts looked much better.


Marketing analytics and data visualization expert. Author and Speaker. Currently at Columbia.
