

Jorge Camoes

When I see charts like this I can't help but think the designer did it on purpose. It's such a combination of bad options that not even the defaults can explain it.

Obviously your chart tells the right story. I also like the idea of using 2008 as the reference point instead of 1965. This solves the problem of where to break the axis.


The original graph looks like it was done by someone clever enough to know you can use linear regression to define a pair of matching scales, so the two lines give the impression of strongly correlating, but not wise enough to notice that it's the deviations from linear correlation that are interesting in this case.
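The regression trick described above can be sketched as follows. This is a minimal illustration, not the original designer's method: it assumes an ordinary least-squares fit of one series on the other, and the function names (`fit_line`, `matched_axis_limits`, `residuals`) are invented for this example. The residuals are the "deviations from linear correlation" that the matched scales tend to hide.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def matched_axis_limits(left_series, right_series, left_limits):
    """Map the left-axis limits onto the right axis through the fitted line.

    Plotting the right series against these limits makes the two lines
    overlap as closely as a linear rescaling allows.
    """
    intercept, slope = fit_line(left_series, right_series)
    low, high = left_limits
    return intercept + slope * low, intercept + slope * high


def residuals(left_series, right_series):
    """Deviations of the right series from the fitted line."""
    intercept, slope = fit_line(left_series, right_series)
    return [r - (intercept + slope * l)
            for l, r in zip(left_series, right_series)]
```

If the fitted slope is negative, the mapped limits come back in reverse order, which amounts to an inverted second axis. Plotting the residuals against time, rather than the two overlaid lines, would put the interesting departures front and center.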

Seeing as you've got the data there, Kaiser, what does a scatter graph look like, with a single line labeled with the years wandering in a two-axis space?


I'd do two charts, positioned one on top of the other, so the time scales line up. The first shows average circulation, the second shows the number of newspapers.


I'm curious to know why you chose the end points to have the same index value, instead of the starting points, which I believe is more common.


SB: Good question. In any historical time series, the starting point is arbitrary and so picking 1968 as the base index is arbitrary. However, picking 1968 as the base versus 1978 as the base will lead to two distinct curves. Indeed, the data series stretched back another 20 years and I'm not sure why the chart designer picked 1968 as the starting year.

The end point of a time series, though, is the current time, which to me is less arbitrary. With the end point as the base, the chart could be cut off vertically at any point without any problems.
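The effect of the base year can be sketched in a few lines. This is only an illustration: the `rebase` helper is a name made up here, and the numbers are invented, not the newspaper data from the chart.

```python
def rebase(values_by_year, base_year, base_value=100.0):
    """Return the series as an index equal to base_value in base_year."""
    base = values_by_year[base_year]
    return {year: base_value * value / base
            for year, value in values_by_year.items()}


# Invented numbers, for illustration only.
series = {1968: 50, 1988: 60, 2008: 40}

indexed_to_start = rebase(series, 1968)  # {1968: 100.0, 1988: 120.0, 2008: 80.0}
indexed_to_end = rebase(series, 2008)    # {1968: 125.0, 1988: 150.0, 2008: 100.0}
```

Indexing to the end year pins every series to 100 at the right edge of the chart. Dropping earlier years then leaves the remaining index values unchanged, whereas with a start-year base, moving the starting point changes every value.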


I'm using a comment as a "suggest" form here. The back of today's New York Times' Week in Review section devotes half of its space to a lame infographic that wastes space and has a major error (which has been corrected lazily in the version now online at http://www.nytimes.com/interactive/2008/10/14/opinion/20090531_OPCHART.html )

The chart shows the recent decline (or in some cases, rise) of retail sales at 27 common mall chains. My main objection is that the top half of the chart is useless. Yes, it provides a baseline from which shrinkage in area (visual metaphor of income = floor space) in the bottom half can represent the relative declines in sales, but this is redundant, since color already handles it better. The only part of the chart that I got any information from was the bottom half, and it took me a while before I figured out why the top half was even there.

Meanwhile, the error in the printed version is that the +5-10% stores were colored light green while the +0-5% store (Burger King) was colored dark green. It should have been vice versa. The online version simply swapped the colors in the legend, rather than on the map itself, which works logically but begs the question: why do you have dark red at one end of the spectrum and light green at the other, with dark green in the middle?

Thank you for listening; just blowing off a little steam here. It's a lot of wasted space and I'll bet the New York Times paid a lot for it.

Sanford Silverburg

How do you employ two indexes on the Y-axis in a line chart, if you have two concentrations of data in a vector?

Dan Loeb

You should connect the dots in the scatter plot to get a sense of the time order of the observations. (Put an arrowhead at the beginning or end of the line to fix the direction of the broken line.)
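A rough sketch of the connect-the-dots idea, assuming each year maps to one (x, y) pair, for example (number of newspapers, average circulation). The `connected_path` name and the sample points are invented for illustration. The function only prepares the time-ordered segments; drawing them, with an arrowhead on the last one, is left to whatever charting tool is at hand.

```python
def connected_path(points_by_year):
    """Return (start_year, end_year, start_xy, end_xy) segments in time order."""
    years = sorted(points_by_year)
    return [
        (y0, y1, points_by_year[y0], points_by_year[y1])
        for y0, y1 in zip(years, years[1:])
    ]


# Invented points, deliberately out of order to show the sorting.
points = {2008: (3, 4), 1968: (1, 2), 1988: (2, 5)}
segments = connected_path(points)
# The last segment is the one that would carry the arrowhead.
final_segment = segments[-1]
```

Sorting by year before joining the points is what makes the line trace the history rather than the order of the rows in the data.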


Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.

Graphics design by Amanda Lee
