Apr 09, 2008

An embarrassment

I find it embarrassing for the Economist to print an article like this one.  (Do they have a statistics editor?)

Econ_smoking

The subtitle asserting "causality" is offensive.  It is alleged that smoking bans in bars have "caused" more road accidents because people are forced to drive longer distances to find those bars that still allow smoking.

To assert causality so starkly for an undesigned observational study is unprofessional.  I doubt that the authors of the study they cited even went so far.  At best, they probably found a correlation.

Another problem is the practical significance of the finding.  The article reports a 13% increase in the fatal accident rate in a "typical county containing 680,000 people".  There are two problems with this statement:

  • When I check the Census data, there are only about 85 counties in the entire U.S. with at least 680,000 people.  What do they mean by "typical"?
  • 13% is said to be an increment of 2.5 fatal accidents, presumably per year.  For scale, the crane accident in Manhattan a few weeks ago killed at least five people.  I just don't believe that one can prove definitively that such a tiny difference is not due to chance, so even the correlation, let alone the causality, is suspect.
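To put the 2.5 in perspective, here is a rough back-of-envelope sketch.  It backs out the baseline count implied by the article's two numbers and compares the increment with ordinary year-to-year noise.  The Poisson model for a single county's annual count is my assumption, not something taken from the study, and the study presumably pooled many counties and years, so this only illustrates the scale of the problem for one county.

```python
import math

# Figures quoted in the Economist article
increase_fraction = 0.13   # 13% increase in fatal accidents
extra_accidents = 2.5      # the same increase expressed as a count

# Baseline count implied by those two figures
baseline = extra_accidents / increase_fraction

# ASSUMPTION: annual fatal accidents in one county behave like a Poisson
# count, so the year-to-year standard deviation is sqrt(mean).
poisson_sd = math.sqrt(baseline)

print(f"implied baseline: {baseline:.1f} fatal accidents per year")
print(f"Poisson standard deviation: {poisson_sd:.1f}")
print(f"increment / sd: {extra_accidents / poisson_sd:.2f}")
```

Under that assumption the increment is smaller than one standard deviation of a single county's annual count, which is why a lot of pooled data would be needed before chance could be ruled out.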

It appears that the paper is locked up in pre-publication.  If you have seen it, let us know if the authors actually asserted causality.

Reference: "Unlucky Strikes", The Economist, April 3 2008.

Mar 05, 2008

Mid-week entertainment: Pity grapefruit

Courtesy of Derek.  Hope for the scatter plot?

Grapefruit_scatter

Original link here

Sep 04, 2007

Read fast, pay the price

At first, this looks like a decent chart despite the donut construct, which I cannot stand (but the Economist loves).

Rockstars

The accompanying text proclaimed: "Rock stars are famous for excess, and some pay the price".  The rest of the paragraph points out drug- and alcohol-related deaths, plus deaths due to "unhealthy lifestyles", which apparently include cancer and cardiovascular disease.

There is a gaping hole between what's on the chart and what's in the text.  They just talk past each other.

  • The chart invites us to compare the European experience to the American experience. Each donut presents the proportion of total deaths by causes of death. The top donut presents American rock-star deaths, the bottom European ones. But this comparison has zilch to do with the key point, which is how rock stars are different from the rest of us.  The chart tells us nothing about the rest of us.  The 20% death by cancer would be entirely unremarkable if 20% of non-rock-star deaths also were attributed to cancer!
  • We must also bear in mind that the base populations are rock stars who died young. This is a very specific demographic segment, and so the only valid point of reference is people who died young.  If we think along those lines, then among unmusical people, if they died young, what might have been the causes of death?  Drugs? Alcohol?  Accidents?  Suicide?  You bet.  I am not sure who is the authoritative source of such data but the CDC reported that among Americans aged 15-34 who died, the leading causes were "unintentional injury", suicides, homicides, cancer and heart disease.  Not much different from the above list...
  • The deaths depicted in the two donuts totaled fewer than 100, and yet percentages are given to one decimal place.  This creates a false sense of precision not justified by the sample size.
  • The deaths occurred over about 50 years.  It is very likely that the causes of premature death have shifted during this time span, making an aggregate analysis questionable.
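The false-precision point can be checked with a quick sketch.  It uses the usual binomial standard error for a proportion, a generous n of 100 (the two donuts together cover fewer deaths than that, so each donut has even fewer), and the 20% cancer share mentioned above.  Treating the deaths as a simple random sample is my simplifying assumption.

```python
import math

def proportion_se(p, n):
    """Standard error of a sample proportion p based on n observations."""
    return math.sqrt(p * (1 - p) / n)

n = 100    # generous upper bound; the chart covers fewer than 100 deaths
p = 0.20   # cancer share used as the example in the text

se = proportion_se(p, n)
one_death = 1 / n   # how much a single death moves any percentage

print(f"standard error: {100 * se:.1f} percentage points")
print(f"one death shifts a share by {100 * one_death:.1f} percentage point(s)")
```

With an uncertainty of several percentage points, and with each individual death moving a share by at least a full point, the digit after the decimal point carries no information.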

Charting is much more than just aesthetics.  Some basic statistical common sense goes a long way.  This was observed long ago by Huff.

Source: "Rock stars: live fast, die young", Economist, Sept 4 2007.

Jul 16, 2007

Gauging the water level

Nyt_water

This set of charts covered the back page of one of the New York Times' sections this weekend.

Regular readers will share my enthusiasm for the top chart.  It makes a clear, cogent case to support the article's thesis concerning the rise of bottled water.  Various renditions of this type of chart have appeared here, for example.

Specifically, the smart use of color to cluster the line objects helps interpret the trends.  Blue sets out the two primary interests.  (It's a mystery to me why the gray lines were separated into darker and lighter hues.)

The twenty-year horizon used is another nice touch. I'd remove the gridlines although they aren't too distracting here.

Sadly, the second graphic does not meet the high standard of the first.  The biggest problem concerns the red rectangle, purportedly showing how much of the bottled water was imported.  The choice of differently-sized bottles as objects makes it impossible to gauge what proportion of the total was imported.  If the rectangle were placed over 1-litre bottles instead, it would look smaller.

Source: "A Battle Between the Bottle and the Faucet", New York Times, July 15, 2007.

Jun 06, 2007

Mid-week entertainment: creme fraiche

From Forsooth! on RSS News, June 2007

Sainsbury

Nov 28, 2006

Dropped, just like that

Quakeradsm_1

Frank W. sent in a timely reminder of the start-at-zero rule.  This ad from Quaker Oats pitches the impossible: the smart consumer will never believe that cholesterol levels can be dropped just like that!  According to Frank's measurement, the column heights plunged 77% from Week 1 to Week 4 in this chart.

In fact, if the vertical axis had started from 0, then the drop would more appropriately appear to be 5%.  Now, even that would have been a miracle, in my opinion.
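The gap between 77% and 5% follows directly from where the axis starts, as this small sketch shows.  The week-1 level of 100 is a hypothetical value in arbitrary units (the ad's actual numbers are not reproduced here), and the function names are mine.

```python
def apparent_drop(start, end, axis_min):
    """Fractional drop in column height when the axis begins at axis_min."""
    return (start - end) / (start - axis_min)

def implied_axis_min(start, end, drawn_drop):
    """Axis baseline at which the true drop is drawn as drawn_drop."""
    return start - (start - end) / drawn_drop

start = 100.0                 # HYPOTHETICAL week-1 level, arbitrary units
end = start * (1 - 0.05)      # a true 5% drop by week 4

axis_min = implied_axis_min(start, end, 0.77)

print(f"drop with axis at zero: {apparent_drop(start, end, 0.0):.0%}")
print(f"drop with truncated axis: {apparent_drop(start, end, axis_min):.0%}")
print(f"implied axis start: {axis_min:.1f}% of the week-1 level")
```

In other words, for a 5% drop to be drawn as a 77% plunge, the axis has to start at roughly 93.5% of the week-1 value, so nearly all of each column has been cut away.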

I would also like to know what a "point" of cholesterol is, and what they mean by a "representative" drop.  I suppose they are asking me to call that number.

My previous posts (with commentary from readers) about starting at zero can be found here and here.