Apr 19, 2008

Cram it like Koby

You have to gradually build up your gut by eating larger and larger amounts of food, and then be sure to work it all off so body fat doesn't put a squeeze on the expansion of your stomach in competition  -- Takeru Kobayashi, six-time champion of the Coney Island hot dog eating contest

Kobayashi is a phenom.  He can stuff 60 hot dogs or 100 burgers in ten or twelve minutes and show no consequences.  Ordinary people can't hope to emulate these feats.

Junk Charts sees Kobayashi as a hero; an anti-hero really.  We are ordinary people; we can't hope to cram it like Koby.  A message we keep repeating here is: too much data sinks a chart.

Econ_anglosaxon

Not long after this chart showed up in the Economist, several readers urged us to take a look.  It's a well-nourished chart indeed, one to challenge Kobayashi, but for all that it contains, the reader has to try very hard to find insights.  Blame the multiple colors, iron-fisted gridlines, above-and-below boxes, dotted and solid lines, and a legend with nine pieces split between two spots.  Besides, the U.S. boxes grab all the attention by virtue of being wider (the country being more partisan).

The key to unraveling this chart is to identify the relevant comparisons:

  • UK average vs US average
  • UK left vs US left
  • UK right vs US right
  • UK independent vs US independent

And then for the gluttonous:

  • UK right vs US left
  • UK left vs independent vs right
  • US left vs independent vs right

In the junkchart version, we address these comparisons sequentially.

Redo_anglosaxon1a
(Apologies for the tiny font.)

We are again using a small multiples approach that places four comparisons next to each other: average, left, independent, right.  Consistently, the British are to the left of the Americans.  The only places where the two cultures meet are where liberals agree on "ideology" and "military action".

Also note that we use a symmetric horizontal scale centered at 0.  There are too many charts out there where the center is not at the center!
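For anyone redoing such a chart in code, here is a minimal sketch of how a symmetric scale can be computed.  The function name, the padding factor and the sample values are all hypothetical; the data behind the chart are not available to me.

```python
def symmetric_limits(values, pad=0.05):
    """Return (low, high) axis limits centered at 0.

    The limits are set by the largest absolute value, plus a little
    padding so the extreme point does not sit on the edge of the plot.
    """
    largest = max(abs(v) for v in values)
    bound = largest * (1 + pad)
    return (-bound, bound)


# Hypothetical scores: some left of center (negative), some right (positive)
scores = [-0.3, 0.1, 0.8, 1.2]
low, high = symmetric_limits(scores)
print(low, high)
```

The resulting pair can be handed to whatever plotting tool is in use as the horizontal axis range, which keeps 0 in the middle of the panel even when the data lean to one side.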

A similar presentation addresses the other three comparisons.  Democrats in the U.S. are miles to the right of Tories in terms of "religion".  In the UK, Labour and Tories are not much different except on "ideology".  In the US, Independents lean closer to Democrats.

Redo_anglosaxon2a

Joining the lines (I hear the grumbles) helps bring out the gap between the groups being compared.  Without lines, the chart would look like this.

Redo_anglosaxon3a

It is often hard to keep track of which dot is which as they trade order from issue to issue.

PS. Does anyone know what is being measured on the horizontal axis?  The original graph mysteriously stated "respondents' views".


References: 

Eric Talmadge: "Pigout champion Kobayashi limbers up for hot dog gold" June 25, 2004

"Anglo-Saxon Attitudes", Economist, Mar 27 2008.

Apr 09, 2008

An embarrassment

I find it embarrassing for the Economist to print an article like this one.  (Do they have a statistics editor?)

Econ_smoking

The subtitle asserting "causality" is offensive.  It is alleged that smoking bans in bars have "caused" more road accidents because people are forced to drive longer distances to find those bars that still allow smoking.

To assert causality so starkly for an undesigned observational study is unprofessional.  I doubt that the authors of the study they cited even went so far.  At best, they probably found a correlation.

Another problem is the practical significance of the finding.  The article cites a 13% increase in the fatal accident rate in a "typical county containing 680,000 people".  There are two problems with this statement:

  • When I check the Census data, there are only about 85 counties in the entire U.S. with at least 680,000 people.  What do they mean by "typical"?
  • 13% is said to be an increment of 2.5 fatal accidents, presumably per year.  For scale, the crane accident in Manhattan a few weeks ago killed at least five people.  I just don't believe that one can prove definitively that such a tiny difference is not due to chance, so even the correlation, let alone the causality, is suspect.
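A back-of-the-envelope check of the second point can be sketched as follows.  The 13% and the 2.5 come from the article; everything else is my assumption, in particular that yearly fatal accidents in one county behave like a Poisson count, so that the standard deviation is the square root of the mean.

```python
import math

increase = 2.5        # extra fatal accidents, as reported in the article
pct_increase = 0.13   # the 13% increase, as reported in the article

# If 2.5 accidents amount to a 13% increase, this is the implied baseline
baseline = increase / pct_increase

# Assumption (mine, not the article's): counts are Poisson, so sd = sqrt(mean)
poisson_sd = math.sqrt(baseline)

# How big is the reported increase relative to ordinary year-to-year noise?
ratio = increase / poisson_sd

print(f"implied baseline: {baseline:.1f} fatal accidents per year")
print(f"year-to-year sd under the Poisson assumption: {poisson_sd:.1f}")
print(f"increase as a multiple of that sd: {ratio:.2f}")
```

Under these assumptions the baseline is roughly 19 fatal accidents a year with a standard deviation of about 4.4, so the reported increment is well under one standard deviation for a single county in a single year.  A study pooling many counties and years would have a smaller standard error than this, so the sketch only shows why the number looks tiny, not that the finding is wrong.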

It appears that the paper is locked up in pre-publication.  If you have seen it, let us know if the authors actually asserted causality.

Reference: "Unlucky Strikes", The Economist, April 3 2008.

Feb 19, 2008

Color scale

This map from the Economist illustrates pretty well the movement of population from middle America outwards from 2000 to 2006.  The message reaches us despite the large volume of data painted.  (The gray shadow, though, was more than a little distracting.)

Econ_depop

The map piqued my curiosity in two areas:

How did they determine the color scale?  The average change over all counties (6.4%) was obviously used.  Standard deviation was not since the ranges of change were unequal in size.

Was within-county percent change the best criterion?  As is, an 80% drop in a 2,000-person county looks the same as an 80% drop in a 200,000-strong county.
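To put numbers on that last point, here is a small sketch using the two hypothetical counties from the sentence above (they are illustrations, not counties on the map).  The helper name is made up for this example.

```python
def population_change(before, pct_change):
    """Convert a percent change into a change in head count."""
    return before * pct_change / 100


# Same percent change, very different numbers of people
small_county = population_change(2_000, -80)
large_county = population_change(200_000, -80)

print(f"small county: {small_county:,.0f} people")
print(f"large county: {large_county:,.0f} people")
print(f"the large county lost {large_county / small_county:.0f} times as many people")
```

Both counties would get the same color on a percent-change map, yet one lost a hundred times as many residents as the other.  A map of absolute change, or one that scales each county by population, would tell a different story.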

Reference: "The Great Plains drain", Economist, Jan 17 2008.

PS. I am traveling and so posting will be limited.

Sep 04, 2007

Read fast, pay the price

At first, this looks like a decent chart despite the donut construct, which I cannot stand (but the Economist loves).

Rockstars

The accompanying text proclaimed: "Rock stars are famous for excess, and some pay the price".  The rest of the paragraph points out drug- and alcohol-related deaths, plus deaths due to "unhealthy lifestyles", which apparently include cancer and cardiovascular disease.

There is a gaping hole between what's on the chart and what's in the text.  They just talk past each other.

  • The chart invites us to compare the European experience to the American experience. Each donut presents the proportion of total deaths by causes of death. The top donut presents American rock-star deaths, the bottom European ones. But this comparison has zilch to do with the key point, which is how rock stars are different from the rest of us.  The chart tells us nothing about the rest of us.  The 20% death by cancer would be entirely unremarkable if 20% of non-rock-star deaths also were attributed to cancer!
  • We must also bear in mind that the base populations are rock stars who died young. This is a very specific demographic segment, and so the only valid reference group is people who died young.  If we think along those lines, then among unmusical people, if they died young, what might have been the causes of death?  Drugs? Alcohol?  Accidents?  Suicide?  You bet.  I am not sure who is the authoritative source of such data but the CDC reported that among Americans aged 15-34 who died, the leading causes were "unintentional injury", suicides, homicides, cancer and heart disease.  Not much different from the above list...
  • The deaths depicted in the two donuts totaled fewer than 100, and yet percentages are given to one decimal place.  This creates a false sense of precision not justified by the sample size.
  • The deaths occurred over about 50 years.  It is very likely that the causes of premature death have shifted during this time span, making an aggregate analysis questionable.
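The false-precision point in the list above can be sketched in a few lines.  The exact number of deaths behind each donut is not given here, so the 50 below is a hypothetical stand-in (the two donuts total fewer than 100), and treating the deaths as a simple random sample is my simplifying assumption.

```python
import math


def one_death_in_points(n_deaths):
    """Percentage points by which a share moves when one death is reclassified."""
    return 100 / n_deaths


def standard_error_in_points(share, n_deaths):
    """Binomial standard error of a share, in percentage points.

    Assumes the deaths behave like a simple random sample.
    """
    return 100 * math.sqrt(share * (1 - share) / n_deaths)


n = 50          # hypothetical number of deaths in one donut
share = 0.20    # the 20% cancer share discussed above

print(f"one death moves a share by {one_death_in_points(n):.1f} points")
print(f"standard error of a 20% share: {standard_error_in_points(share, n):.1f} points")
```

With 50 deaths, a single case shifts any slice by 2 points, and the standard error on a 20% slice is about 5.7 points.  Reporting such shares to a tenth of a point promises far more than the data can deliver.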

Charting is much more than just aesthetics.  Some basic statistical common sense goes a long way.  This was observed long ago by Huff.

Source: "Rock stars: live fast, die young", Economist, Sept 4 2007.

Jul 21, 2007

Exception to the rule

It's pretty hard to decree hard-and-fast rules for graphical design; every rule seems to admit its exception.  This reinforces Tufte's contribution as he has successfully organized the rules in his collection of books.

Dustin J sent in this chart from the Economist.  On first impression, it is ugly and overly complex.

Econ_petrol

Dustin commented:

Steven Few says not to use stacked bar charts because you cannot compare individual values very easily and as a rule I avoid stacked bars with more than six or seven divisions. What do you think of this stacked bar--I think it is quite effective in telling the story.

On this blog, I have also re-done some stacked bar charts but this one is truly an exception to the rule.  The reason why this one works is that it's not about the individual components, it's showing that the US consumes more than all those countries combined. 

If only it had a proper caption!  The Economist is uncharacteristically detached here: "Petrol consumption per day", "Litres bn, 2003".  How about "Goliath v. Davids"?  "US v. the World"? "Dream Team USA"?

It'd help if they toned down the colors; also, by simply annotating the total litres for the US and the total for the other countries, they would have made a clearer point without using gridlines.  But these are minor glitches in an otherwise effective chart.

Source: Economist, July 2007.

Feb 16, 2007

Mirror, mirror

Ec_sarko

Mirror, mirror on the wall...

I don't see what the second line adds to this plot, given there were only two candidates in this election. 

Political graphs do not get much better than those at the Political Arithmetik blog.

For instance, in the chart below, he wisely chose to draw trend-lines rather than connecting the individual dots.

Topdems

Also, typically, he plots dots for all the different polls, which allows us to assess the variability (reliability) of the observed trend.

 

Reference: "Sarko embraces the Anglo-Saxons", Economist, Feb 3 2007.

Feb 12, 2007

Horrid stuff

Ec_smoke

Small multiples can work wonders when data are replicated, as in this case.  The chart accompanied an Economist article on pollution levels in several European cities, as indicated by the concentration of nitrogen dioxide and particulates.

In the junkart version, I plotted the data series side by side, rather than one over the other.  Further, the cities are ordered by decreasing level of NO2, which seemed to be the worse pollutant.  All gridlines are removed except the line at 30, which works pretty well to separate out the highly polluted cities.

Redopollutant

An odd pattern has now surfaced.  Namely, there is some degree of negative correlation between the concentration of the two pollutants.  Environmental scientists may be able to tell us why.


Reference: "The Big Smoke", Economist, Feb 3 2007.

Sep 19, 2006

Jamming

Econ_muslims

Readers may have noticed that I'm not a fan of the graphics aesthetics of the Economist.  (I love their subtle sarcasm, a way of saying something without saying it.  For example, the title of this chart is "where they are".  They let us read any meaning into the word "they".  As for their charts, I have taken issue on several occasions.)

This particular example uses one of their standard formats, stacked bars with an extra data series tagged on the right, its boxed annotation calling attention to itself.  It's a case of too much apparatus for a simple task.

The chart's purpose is to show that the US and France have the largest Muslim populations by numbers while France is by far the top country by percentage.

Redo_muslims

Our junkart version is very much cleaner.  Line segments indicating the low, mid and high estimates replace the stacked bars (which falsely imply significance in adding the low and high estimates).  As usual, the minimum of gridlines and axes is used.  If percentages are more important, they belong on a separate chart, ordered by decreasing percentages (see below), rather than jammed onto the same chart as the counts.

The most crucial improvement is the fine print.  Perhaps extending their subtle sarcasm too far, the chart maker omitted context for interpreting the data: namely, that the low-mid-high range represents estimates by up to 5 different sources, each using potentially different methodologies for estimation.  This partially explains the huge variance in estimates for the US (or does it?).

Redo2_muslims

Also missing is a comment on why these particular 6 countries were selected.  The selection may give a misleading picture of "where they are" in the context of world population.

Reference: "Where They Are", Economist, June 2006.

 

May 08, 2006

Using data tables

Charts are supposed to elucidate data.  We love charts here but sometimes the love is misplaced.  I noticed the following Economist chart by way of the Truck and Barter blog.

Redounbank

It's a very simple chart, with only 6 pieces of data.  And yet, presenting the data in a table would have been clearer.  One measure of the effectiveness of a chart is the amount of time the reader needs to locate the data.  In the table, everything the reader needs requires two steps: looking up the right row and the right column.  On the bar chart, however, the reader must first look up the right chart, then the right bar, and then estimate the length of the bar by referencing the axis; if the reader wants the totals, s/he must estimate three lengths and mentally add them up.

Reference: "Into the Fold", Economist, May 4 2006.

Jan 18, 2006

The redundant dimension eye-trick

My friend Patrick was particularly incensed by this chart, from the Economist publication "The World in 2006", which has been discussed here and here.  It employs a typical trick to make charts more "entertaining", that is, introducing an extra dimension, region of the world in this case.  As the right-side junkart version shows, collapsing this dimension results in a much clearer graph.  Disagree?  Try figuring out which columns to contrast in the left chart, and you might get dizzy as if reading an Escher "impossible trident" (more Escher goodies here).

Redogdpgrowth

Reference: "Wider but not deeper", The World in 2006.
