The trouble with two lines

It's never a good idea to put two scales on one chart, and this is another example of what not to do:

[Figure: Ir_newspaper]

The ugliest part of this chart is the duelling gridlines. Because neither axis starts at zero, it is difficult to assess whether the number of newspapers was declining at a faster pace than the circulation was. Also, line charts would be better able to trace the evolution over time. The interspersed blue and red columns interfere with each other. Note that the designer lessened our pain by plotting every other year, thus halving the number of columns.

The junkart version also puts two data series on the same chart but on the same scale. Instead of plotting the raw data, we plot indices, with 2008 as 100. This reveals a pattern that was not apparent in the original chart. There appear to have been four periods of evolution: up till 1980, both the number of newspapers and total circulation were at a plateau; from 1980 to 1990, the circulation stayed stable while the number of papers dropped drastically, indicating perhaps consolidation; then from 1990 to 2003, both series declined at roughly equal rates; and finally, the bottom dropped out of the circulation from 2003.
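For readers who want to try the indexing step themselves, here is a minimal sketch in Python. The helper name `to_index` and the numbers are made up for illustration; they are not the data behind the chart. Each series is divided by its own value in the base year and multiplied by 100, so both series meet at 100 in 2008 and can share one scale.

```python
def to_index(series, base_year, base_value=100.0):
    """Rescale a {year: value} series so the base year equals base_value."""
    base = series[base_year]
    return {year: base_value * value / base for year, value in series.items()}


# Hypothetical numbers for illustration only; NOT the actual newspaper data.
papers = {1980: 1750, 1990: 1600, 2008: 1400}          # number of newspapers
circulation = {1980: 62.0, 1990: 62.0, 2008: 49.6}     # circulation, millions

papers_idx = to_index(papers, base_year=2008)
circulation_idx = to_index(circulation, base_year=2008)

for year in sorted(papers):
    print(year, round(papers_idx[year], 1), round(circulation_idx[year], 1))
```

Changing `base_year` multiplies each series by a constant, so the shape of each individual line stays the same; what moves is where the two lines meet, and therefore how far apart they sit everywhere else.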


[Figure: Redo_newspaper]


See our previous discussion of dual axes here, here, and here


Reference: "Channel Shift: Online Circulars: the first step by retailers toward web-to-store harmony?", Internet Retailer (print), May 2009. Data from the Newspaper Association of America.


Bonus 1: Thinking about the column interspersing trick some more, I realize that it is, em, possible, em, that one series was plotted for odd years and the other series plotted for even years!

Bonus 2: Here is the requested scatter plot (still indexed):

[Figure: Redo_newspaper2]
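A scatter plot of two time series loses the time order unless the points are labeled or joined. As a rough sketch, assuming both series are stored as `{year: index}` dictionaries, the points can be sorted by year and paired into consecutive segments, which any plotting tool can then draw as a single labeled path. The function name `time_ordered_path` and the index values are hypothetical.

```python
def time_ordered_path(x_by_year, y_by_year):
    """Return (year, x, y) points sorted by year, plus consecutive segments."""
    # Keep only years present in both series, in chronological order.
    years = sorted(set(x_by_year) & set(y_by_year))
    points = [(year, x_by_year[year], y_by_year[year]) for year in years]
    # Each segment joins one year's point to the next year's point.
    segments = list(zip(points, points[1:]))
    return points, segments


# Hypothetical index values (2008 = 100), for illustration only.
papers_index = {2008: 100.0, 1980: 125.0, 1990: 114.0}
circulation_index = {1990: 125.0, 2008: 100.0, 1980: 125.0}

points, segments = time_ordered_path(papers_index, circulation_index)
for (y0, _, _), (y1, _, _) in segments:
    print(f"draw line from {y0} to {y1}")
```

Drawing the segments in this order, with each point labeled by its year, lets the reader follow the path through the two-axis space.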



Comments

Jorge Camoes

When I see charts like this I can't help but think the designer did it on purpose. It's such a combination of bad options that not even the defaults can explain it.

Obviously your chart tells the right story. I also like the idea of using 2008 as the reference point instead of 1965. This solves the problem of where to break the axis.

derek

The original graph looks like it was done by someone clever enough to know you can use linear regression to define a pair of matching scales, so the two lines give the impression of strongly correlating, but not wise enough to notice that it's the deviations from linear correlation that are interesting in this case.

Seeing as you've got the data there, Kaiser, what does a scatter graph look like, with a single line labeled with the years wandering in a two-axis space?

Matt

I'd do two charts, positioned on top of each other, so time scales line up. First shows average circulation, second shows number of newspapers.

SB

I'm curious to know why you chose the end points to have the same index value, instead of the starting, which I believe is more common.

Kaiser

SB: Good question. In any historical time series, the starting point is arbitrary and so picking 1968 as the base index is arbitrary. However, picking 1968 as the base versus 1978 as the base will lead to two distinct curves. Indeed, the data series stretched back another 20 years and I'm not sure why the chart designer picked 1968 as the starting year.

The end point of a time series, though, is the current time, which to me is less arbitrary; the chart could then be cut off vertically at any point without any problems.

Joyce

I'm using a comment as a "suggest" form here. The back of today's New York Times' Week in Review section devotes half of its space to a lame infographic that wastes space and has a major error (which has been corrected lazily in the version now online at http://www.nytimes.com/interactive/2008/10/14/opinion/20090531_OPCHART.html )

The chart shows the recent decline (or in some cases, rise) of retail sales at 27 common mall chains. My main objection is that the top half of the chart is useless. Yes, it provides a baseline from which shrinkage in area (visual metaphor of income = floor space) in the bottom half can represent the relative declines in sales, but this is redundantly handled better by color. The only part of the chart that I got any information from was the bottom half, and it took me a while before I figured out why the top half was even there.

Meanwhile, the error in the printed version is that the +5-10% stores were colored light green while the +0-5% store (Burger King) was colored dark green. It should have been vice-versa. The online version simply swapped the colors in the legend, rather than on the map itself, which works logically but begs the question: why do you have dark red at one end of the spectrum and light green at the other, with dark green in the middle?

Thank you for listening-- just blowing off a little steam here. It's a lot of wasted space and I'll bet the New York Times paid a lot for it.

Sanford Silverburg

How do you employ two indexes on the Y-axis in a line chart, if you have two concentrations of data in a vector?

Dan Loeb

You should connect the dots in the scatter plot to get a sense of the time order of the observations. (Put an arrowhead at the beginning or end of the line to fix the direction of the broken line.)
