Apr 19, 2008

Cram it like Koby

You have to gradually build up your gut by eating larger and larger amounts of food, and then be sure to work it all off so body fat doesn't put a squeeze on the expansion of your stomach in competition  -- Takeru Kobayashi, six-time champion of the Coney Island hot dog eating contest

Kobayashi is a phenom.  He can stuff 60 hot dogs or 100 burgers in ten or twelve minutes and show no consequences.  Ordinary people can't hope to emulate these feats.

Junk Charts sees Kobayashi as a hero; an anti-hero really.  We are ordinary people; we can't hope to cram it like Koby.  A message we keep repeating here is: too much data sinks a chart.

Econ_anglosaxon Not long after this chart showed up in the Economist, several readers urged us to take a look.  It's a well-nourished chart indeed, one to challenge Kobayashi, but for all that it contains, the reader has to try very hard to find insights.  What with the multiple colors, iron-fisted gridlines, above-and-below boxes, dotted and solid lines, and a legend with nine pieces split in two spots?  Besides, the U.S. boxes grab all the attention by virtue of them being wider (country being more partisan).

The key to unraveling this chart is to identify the relevant comparisons:

  • UK average vs US average
  • UK left vs US left
  • UK right vs US right
  • UK independent vs US independent

And then for the gluttonous:

  • UK right vs US left
  • UK left vs independent vs right
  • US left vs independent vs right

In the junkchart version, we address these comparisons sequentially.

Redo_anglosaxon1a
(Apologies for the tiny font.)

We are again using a small multiples approach that places four comparisons next to each other: average, left, independent, right. Consistently, the British is to the left of Americans.  The only places where the two cultures meet are where liberals agree on "ideology" and "military action".

Also note that we use a symmetric horizontal scale centered at 0.  There are too many charts out there where the center is not at the center!

A similar presentation addresses the other three comparisons.  Democrats in the U.S. are miles to the right of Tories in terms of "religion".  In the UK, Labor and Tories are not much different except on "ideology".  In the US, Independents lean closer to Democrats.

Redo_anglosaxon2a

Joining the lines (I hear the grumbles) helps bring out the gap between the groups being compared.  Without lines, the chart would look like this.

Redo_anglosaxon3a

It is often hard to keep track of which dot is which as they trade order from issue to issue.

PS. Anyone knows what is being measured on the horizontal axis?  The original graph mysteriously stated "respondents' views".


References: 

Eric Talmadge: "Pigout champion Kobayashi limbers up for hot dog gold" June 25, 2004

"Anglo-Saxon Attitudes", Economist, Mar 27 2008.

Mar 01, 2008

Don't believe what you see

Mankiw's blog linked to a press release by the Congressman Jim Saxton, using CBO data to show "middle income tax burden at lowest level in decades".  Cbo_taxrateThe attached graph, as Junk Charts readers will immediately recognize, is classic chartjunk.  Every time the vertical axis does not start at zero,  one suspects something is amiss.  And what with the gridlines and data labels?

"Don't believe it? Check out the data source yourself."  I followed Mankiw's suggestion and was indeed surprised... but not by the great fortune of the "middle class".  The surprise was how the chart painted a dishonest picture of the CBO data.

The original chart plotted only the tax rate experienced by the middle 20% of the population. 
Redo_taxrate1The CBO provided data for all five quintiles; why not plot them all?  In this new chart (right), the "surprise" windfall to the middle 20% proved not to be anything special at all!  All five quintiles, especially the middle three, followed pretty much the same trend over time.  The effect of singling out the middle 20% is to deprive the context by which the data should be interpreted.

Further, what might be the result of the declining middle income tax burden?  Redo_taxrate3 The CBO data painted an unexpected picture.  Paradoxically, as the middle 20% see their tax rate decrease, they also earn a smaller share of the nation's after-tax income (black line at right).  At the same time, the top 1% saw their share of after-tax income double from about 8% to almost 16% (blue line).  The top 20% line is also upward-sloping although less pronounced.  So, the implication that the middle class have had it good is plainly wrong.

What is going on?  Two factors were at play and the Congressman presented
only one side of the story (the tax rate).  What he omitted was that during this period, the nation's wealthy took home larger and larger shares of the pre-tax income.  This shift in pre-tax income more than offset any relative reduction in tax rate for the middle 20%.

This distortion can be traced back to the use of quintiles (or more generally, ranks).  We use them to cope with data having extreme distributions but a by-product is losing information about how extreme are the extreme values.  As demonstrated here, the quintiles from old are really different from the quintiles from today because the underlying distribution has become much more extreme.

Finally, another bit of mystery (to me) is how the middle 20% came to be considered "middle class".  Is there a widely accepted definition?

Reference: "CBO Data Show Middle Income Debt Burden At Lowest Level in Decades", Feb 21 2008.

Feb 10, 2008

Ordering and grouping

The Times reported that January retail sales generally disappointed, and consumers showed a preference for discount retailers over department stores.

Nyt_retailjan


Redo_retailjan

Taking the bar chart on the right, re-ordering by change in same-store sales, and grouping companies by type of retailer, we can present the data to match the text more closely.  The divergent performance between discount retailers and department stores is readily visible.












Reference: "Weak January dashed retailers' gift-card hopes", Feb 8 2008.

 

Jan 22, 2008

Football rankings 1.1

Long-time reader Jon sent in a different view of the QB data.  He uses a nifty tool in Excel to generate a parallel coordinates plot (also called profile plot) on which pairs of QBs can be highlighted and compared.

Jon_garrard This chart exploits the foreground background concept very nicely.  One way to deal with abundant data is to highlight only those bits that matter to the question at hand, and relegating the rest to the background.

The gray lines in the background provide context without grabbing undue attention. He also converted every metric to a scale between 0 and 1, similar to what we did with our version.

The Eli Manning / Philip Rivers comparison shows that both QBs were below average on most of these metrics, with Manning near the bottom of each.




Nov 25, 2007

A dangerous equation

Graduation rates at 47 new small public high schools that have opened since 2002 are substantially higher than the citywide average, an indication that the Bloomberg administration’s decision to break up many large failing high schools has achieved some early success.

Most of the schools have made considerable advances over the low-performing large high schools they replaced. Eight schools out of the 47 small schools graduated more than 90 percent of their students.

Nyt_smallsch This graphic included in the NYT article  lent support to the "small schools movement".  In particular, note the last sentence of the above quotation: it incorporates the oft-used device of subgroup support of a hypothesis, in this case, the subgroup of eight top-performing schools.

Such analysis is "dangerous", according to Howard Wainer, who discusses this and other examples of misapplication in a recent article in American Scientist, entitled "The Most Dangerous Equation".  He alleged that billions have been wasted in the pursuit of small schools.

The issue concerns sample size.  Dr. Wainer and associates analyzed math scores from Pennsylvania public schools.  Wainer_mathscoresAverage scores for smaller schools are based on smaller number of students, and therefore less stable (more variable).  More variability means more extremes.  Thus, by chance alone, we expect to find more smaller schools among the top performers.  Similarly, by chance alone, we also expect to find more smaller schools among the worst performers. 

The scatter plot lays out their argument. Focusing only on the top performers (blue dots), one might conclude that smaller schools do better.  However, when the bottom performers (green) are also considered, the story no longer holds.  Indeed, the regression line is essentially flat, indicating that scores are not correlated with school size.

This is all nicely explained via the standard error formula (De Moivre's equation) in Dr. Wainer's article.  Here is a NYT article from the mid 1990s describing this same phenomenon.

File this as another comparability problem.  Because estimates based on smaller samples are less reliable, one must take extra care when comparing small samples to large samples.

Dr. Wainer is publishing a new book next year, called "The Second Watch: navigating the uncertain world".  I'm eagerly looking forward to it.  His previous books, such as Graphic Discovery and Visual Revelations, both part of the Junk Charts collection.

Sources: "The Most Dangerous Equation", American Scientist, November 2007; "Small Schools Are Ahead in Graduation", New York Times, June 30 2007.


P.S. Referring back to the NYT chart above, one might wonder at the impossible feat of raising graduation rates across the board simply by breaking up large schools into smaller ones.  This topic was taken up here, here and here.  When evaluating the "small schools" policy, it is a mistake to discuss only the performance of small schools; any responsible analysis must look at improvement over all schools.  Otherwise, it's a simple matter of letting small schools skim off the cream from larger schools.

 

Nov 06, 2007

The eyeball test

This set of graphs was used by the NYT to discuss changes in U.S.  spending patterns over time.  For this post, I am focusing on the bottom left and bottom right graphs.  One shows spending on energy as a percent of GDP; the other, on "nonresidential structures" (aka, commercial buildings).

Nyt_spending

At first glance, spending on energy and that on commercial buildings look very similar in shape (see above or below left).  Alas, this "eyeball test" doesn't work very well with time series data.  Lets investigate further.

Redospend1_2

"Standardizing" the data (above right) tells us whether the swings are unusual or not in the history of the data.  So in the 1980s, commerical building spend spiked to more than three times the standard deviation above the historical average.  Generally speaking, the standardized unit of 3 is taken to mean highly unusual. 

Notice that the peaks of the left graph had equal heights but on the right graph, energy spending peaked only above two while commerical building spend rose above three.  This is because energy spending has been more volatile historically so it takes larger jumps (or plunges) to count as "unusual" movements.  This information is hidden in the unstandardized version.

Further, since we are concerned with long-term trends, lets take a look at five-year moving averages (below right): in other words, each time point is the average of the preceding five years worth of data. 

Redospend2

The fluctuations have been smoothed out and the peaks are no longer as high.  Glancing at this chart, we may still conclude that the spending patterns are quite similar -- especially in the period prior to 1995.

But is that really the case?  Zooming in on the 1980s, we may mistakenly think the two lines are "close together" if our eyes read the horizontal distance and/or area between the curves, rather than focusing on the vertical distance.  The arrows on the bottom left chart depict this difference.  To make things clearer, the bottom right chart plots the vertical distances between the two lines.

Redospend3

Observe that the difference expanded to above 1 unit in the late 1980s.  A difference of one unit is very large in the standardized scale (of "unusualness") since 0 is business as usual and 3 is "highly unusual".

Eyeballing the two time series would lead us to believe that the two series are similar but we run the risk of underestimating the differences as illustrated here.


Source: "Auto Sector's role Dwindles, and Spending Suffers", New York Times, Nov 3 2007.

Oct 17, 2007

Points of comparison

Econ_mortgage In light of the current housing crisis, arising from mortgage defaults, I pulled this graphic from a Jan 2007 opinion piece that plotted historical default rates of mortgages.  Notice the high degree of stretching on the vertical axis that exaggerates the volatility: essentially, the annual delinquency rate ranged from 1.75% to 2.65% during the last six years or so.  One might be forgiven to think that a 2% default rate is quite acceptable.

Nyt_mortgage_2 Compare the above chart to the pair that showed up in the NYT in Oct 2007 (see right).  The default rates here are in the 10-20% range, very alarming indeed.

The two graphics illustrate a key issue of "aggregation" in statistical analysis.  The first graphic is super-aggregated: all types of mortgages of all ages are put together to calculate each year's default rate.  The second graphic hones in on subprime mortgages only.

More importantly, the second graphic presents data in "vintages".  Each line represents loans originated during a particular year (a "vintage").  This establishes comparability.  On the first chart, each point in time represents the default rate of mortgages averaged over all ages (some loans may be only a few months old; others may be 15 years old).  Since the default rate is much higher for very young mortgages than for older mortgages, such averaging hides crucial information.

Overall, the NYT graphic very effectively conveys the alarming trend of new mortgages performing much worse, especially those originated in 2007.

Redo_mortgage It can benefit from two slight edits: adding a few more years, and using vertical lines (the most critical comparisons are default rates for loans of a given age!)  Something like this...


Sources: "As Defaults Rise, Washington Worries", New York Times, Oct 16 2007; "Mounting Mortgage Credit Problems", economy.com, Jan 23 2007.

Sep 17, 2007

Structuring a chart

Nytmpg This chart from the NYT was intended to show how the EPA has moved the bar on vehicle mileage ratings: 2008 estimates were lower than 2007 estimates across the board, regardless of manufacturer, model and city/highway.

The chart was built from one basic component, repeated for each model. 
Nytmpgsm_2I like the discreet gridlines (the white ticks) which enable readers to count off the mileage ratings.

The data is rich: ratings were given along three dimensions (model, year of estimate and city/highway).  Readers can benefit from a stronger guidance in where to look for the most pertinent information.  As the chart stands, it is merely a container for the data.  It fails our self-sufficiency test: all the data were printed on the chart, and the bars add little.

In the junkart version, I use knowledge of the data to structure the chart. First, noting that sedans, hybrids and trucks/SUVs/minvans have different levels of mileage ratings, I clustered the models into three groups.  Secondly, the city and highway ratings were separated into two columns as I consider the between-model comparisons more important than city-highway comparisons. 
RedompgThe chart is a dot plot, with a vertical tick for 2007 estimates and a dot for 2008 estimates.  It's easy to see that all dots sit to the left of vertical ticks.

More subtly, we can also see that the hybrids appeared to have been penalized more.  Or perhaps, the higher the rating, the larger the downward adjustment...

Source: "Mileage Ratings Are Still Estimates, Though Closer to Reality", New York Times, Sept 16 2007.

Sep 04, 2007

Read fast, pay the price

At first, this looks like a decent chart despite the donut construct, which I cannot stand (but the Economist loves).

Rockstars

The accompanying text proclaimed: "Rock stars are famous for excess, and some pay the price".  The rest of the paragraph points out drug- and alcohol-related deaths, plus deaths due to "unhealthy lifestyles", which apparently include cancer and cardiovascular disease.

There is a gaping hole between what's on the chart and what's in the text.  They just talk past each other.

  • The chart invites us to compare the European experience to the American experience. Each donut presents the proportion of total deaths by causes of death. The top donut presents American rock-star deaths, the bottom European ones. But this comparison has zilch to do with the key point, which is how rock stars are different from the rest of us.  The chart tells us nothing about the rest of us.  The 20% death by cancer would be entirely unremarkable if 20% of non-rock-star deaths also were attributed to cancer!
  • We must also bear in mind that the base populations are rock stars who died young. This is a very specific demographic segment, and so the only valid point of reference are people who died young.  If we think along those lines, then among unmusical people, if they died young, what might have been the causes of death?  Drugs? Alcohol?  Accidents?  Suicide?  You bet.  I am not sure who is the authoritative source of such data but the CDC reported that among Americans aged 15-34 who died, the leading causes were "unintentional injury", suicides, homicides, cancer and heart disease.  Not much different from the above list...
  • The deaths depicted in the two donuts totaled fewer than 100, and yet percentages are given to one decimal place.  This creates a false sense of precision not justified by the sample size.
  • The deaths occurred over about 50 years.  It is very likely that the causes of premature death have shifted during this time span, making an aggregate analysis questionable.

Charting is much more than just aesthetics.  Some basic statistical common sense goes a long way.  This was observed long ago by Huff.

Source: "Rock stars: live fast, die young", Economist, Sept 4 2007.

Aug 12, 2007

Non-elites

From Mikhail Simkin comes some intriguing analysis of "experts"; in this line of research, experts are compared to the "general public" and often "proved" to be shenanigans. Stock pickers don't do better than apes; economists don't do better than Big Macs; you get the idea.  In a new twist, Simkin puts twelve images of modern art on his website, and asks visitors to distinguish between those by grand masters and those "ridiculous fakes" produced by him apparently on a computer.

Since conventional wisdom says elite universities provide better education, Simkin attempted to find out if there is a difference between "elites" and "the crowd" in their ability to recognize modern art. (Elites, to him, meant the Ivy League and Oxbridge.)  The following pair of histograms clinched his point:

we see that there is not much difference between the elite and the crowd.

Simkin_fakeart


Since the shapes of the histograms are similar, one might be inclined to agree with the statement.  This is until one notes the wildly different scales used because only 143 of the 56,020 quiz-takers could be identified as "elites".

The shapes are clarified if we use a relative scale (percentages) rather than absolute scale.  Further, the difference is more easily seen when cumulative percentages are plotted.  In other words, we are interested in comparing the proportion of respondents who score at least X points out of 12.

Redo_fakeart

Two features are worth noting:

  • A gap opens up between 4 to 7: specifically, 40% of "non-elites" scored 7 points or below while only 25% of "elites" scored 7 points or below.
  • The curves criss-cross around 11 to 12: this shows that "non-elites" were more likely to have perfect scores (although this difference is small).  Perhaps museum directors don't have .edu addresses.

Notice that I plotted Elite vs Non-Elite rather than Elite vs All Respondents.  While it seems innocuous to use "All Respondents", and in this case, there is no noticeable difference since Elites were a tiny proportion, when the test group accounts for a significant proportion of the total, the value for "All Respondents" will be influenced by that for the test group.  As a general rule, compare A to not A.

Simkin's exercise raises many statistical issues of design, which we won't discuss here.

Source: "Properly Prescribed" (via, RSS Significance)

Mentions


  • My Amazon.com Wish List

  • Yahoo! Picks

Recent Comments

Search Junk Charts


  • Custom Search

Residues

May 2008

Sun Mon Tue Wed Thu Fri Sat
        1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31