## Doctoring charts

##### Dec 25, 2007

Reader Chris P. alerted us to a fascinating post from Errol Morris' blog, which presents in graphical form the results of a readers' poll related to an earlier post.  That earlier post deals with a pair of photographs taken during wartime, previously discussed by Susan Sontag and others.  Sontag believed the pair documented a before-and-after setting: it was alleged that the photojournalist moved some cannon balls from their natural position between takes.

Morris polled his readers asking them in which order they thought the photos were taken ("on before off", "off before on", "undecided"), and which factors were used to make the decision.  He presented results in two formats, first plotting frequencies in bar charts and then plotting proportions in pie charts.  He preferred the pie chart construct.

Most here would share Chris' reaction: "Oh my.  What people do with Excel."

The biggest problem with these pie charts is the unreasonable baseline.  This is one of those polls that allow respondents to pick any number of factors, and the pie chart creator clearly used the 1,151 responses as the baseline, as opposed to the 910 people who voted.  Consider these two statements:

• 52% of respondents who decided "on before off" listed "sun shadow" as a decision factor
• 30% of the decision factors submitted by respondents who decided "on before off" were "sun shadow"

It is tough to figure out what the second statement means.  It is as if a respondent who selects more than one factor gets more than one vote in the final tally.  To put it differently, the 30% is meaningless unless one also knows how many decision factors each respondent selected, on average and in distribution.  The 52% does not depend on any such consideration.
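To make the two baselines concrete, here is a minimal sketch in Python. The `answers` list is made up for illustration and is not Morris's poll data; `factor_shares` is a hypothetical helper name. Each inner list holds the factors one respondent ticked.

```python
from collections import Counter

# Hypothetical multi-select answers, one inner list per respondent.
# Made-up values for illustration only, not the actual poll data.
answers = [
    ["sun shadow", "cannon balls"],
    ["cannon balls"],
    ["sun shadow", "cannon balls", "rocks"],
    ["rocks"],
]

def factor_shares(answers):
    """Return {factor: (share of respondents, share of responses)}."""
    # De-duplicate within a respondent so nobody counts twice for one factor.
    picks = [set(a) for a in answers]
    n_respondents = len(picks)
    n_responses = sum(len(p) for p in picks)
    counts = Counter(f for p in picks for f in p)
    return {
        f: (c / n_respondents, c / n_responses)
        for f, c in counts.items()
    }

shares = factor_shares(answers)
for factor, (by_person, by_response) in sorted(shares.items()):
    print(f"{factor:13s} {by_person:6.1%} of respondents, "
          f"{by_response:6.1%} of responses")
```

The per-response shares always add up to 100%, which is what makes them fit in a pie, but they shrink whenever respondents tick more boxes. The per-respondent shares can add up to more than 100% and read directly as "this fraction of people cited this factor".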

Combining the data given in the bar charts and pie charts, one discovers that 469 out of 910 respondents could not decide which photo was taken before the other; moreover, these respondents expressed on average 0.9 opinions on the decision factors, whereas the respondents who made a decision expressed 1.6 opinions.
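The arithmetic behind this kind of cross-check can be sketched as follows. It uses only figures quoted in this post (910 voters, 1,151 responses, 469 undecided, and the rounded 52% and 30%); the variable and function names are my own, and because the inputs are rounded the results are approximate.

```python
# Figures quoted in the post.
respondents = 910   # people who voted
responses = 1151    # decision factors submitted in total
undecided = 469     # respondents who could not decide

decided = respondents - undecided
overall_avg = responses / respondents  # factors per respondent, all groups

def avg_factors(share_of_respondents, share_of_responses):
    """Average factors per respondent within one group.

    If k people in a group of N cite a factor and the group submitted R
    responses in total, then k/N divided by k/R equals R/N.
    """
    return share_of_respondents / share_of_responses

# "Sun shadow" in the "on before off" group: 52% of people, 30% of responses.
on_before_off_avg = avg_factors(0.52, 0.30)

# Sanity check: rebuild the response total from the rounded group averages.
implied_total = undecided * 0.9 + decided * 1.6

print(decided, round(overall_avg, 2), round(on_before_off_avg, 2),
      round(implied_total))
```

The rebuilt total lands close to, but not exactly on, 1,151, which is what one would expect from averages rounded to one decimal place. Likewise the figure for the "on before off" group alone need not match the 1.6 quoted for all deciders.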

A simple illustration of the key decision variables by type of respondent is shown below.  From this chart, one sees that the number and position of the cannon balls were crucial to at least 50% of those who came to a conclusion.  Sun shadow was much more important to those who decided "on before off", while those who decided "off before on" noticed character, artistic, shelling and rocks.  Most other factors did not differentiate the three groups.

Source: "Not Your Mum's Apple Pie Chart", Errol Morris, Dec 18, 2007.


Worth noting is that the clearer, simpler graph at the bottom is also easy to do in Excel. In many cases, you don't need fancier tools, you need more thought to what you are going to show.

I don't think that a line graph is the right solution, as these are - how do you call it: qualitative? - data, categories. So, no (cor)relation between them, which would justify a line graph. Instead, I would probably just use a simple bar graph.

An optimal solution needs the inadequacy of sun shadow to stand out. Maybe plotting proportion correct against most important factor would work, unfortunately not available.

It could be worse, the original could have used 3-D pie charts.

Stef: I tend to be less dogmatic about using line charts with categorical data than many others. In cases such as this, the lines do not mean anything: they are there to highlight gaps in the data. If the lines were not there, then we would have to use different symbols for each series, or different colors, and it'd be much harder to compare the groups!

Look up profile plot or parallel coordinates plot.

Excel will do this pretty well, enough to see what the pie wants to tell us.
But and this is good, better than excel.
