
Sloppy statistics

As hinted in the previous post, there are rare situations in which pie charts are acceptable; typically, these charts must show proportions that add up to 100%.  If column charts (or line charts) are used instead, readers who aren't careful may assume incorrectly that the columns add up to the whole.

Pie charts show distributions.  How should one state the key message of the following pie chart?

TypeA

I. Type A is the majority.

II. The most frequent type is Type A.

III. Type A is a minority.

IV. All types other than A together form the majority.

I would pick statement II, followed by statement I.  Statement I is the only false statement out of the four if one uses a strict definition of "majority" (more than half).  If one goes by the spirit rather than the letter of the law, statement I does pick up the key message, albeit imprecisely.  Statement III is a true statement but particularly misleading in the context of this pie chart, for every type is a minority type if we define "minority" as less than half.  Statement IV is a tortuous way to define a "majority" where there is none.
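The difference between statements I and II can be checked mechanically.  Here is a minimal sketch in Python; the shares are made up for illustration (they are not the data behind the pie chart) and the function name is my own.

```python
def describe_largest(shares):
    """Return (label, kind) for the largest category.

    kind is "majority" if the largest share exceeds half the total,
    otherwise "plurality" (most frequent, but not more than half).
    """
    total = sum(shares.values())
    label = max(shares, key=shares.get)
    kind = "majority" if shares[label] > total / 2 else "plurality"
    return label, kind


# Illustrative shares only; these are NOT the values in the chart.
example = {"A": 46, "B": 30, "C": 15, "D": 9}
label, kind = describe_largest(example)

# Statement IV: everything except the largest type, added together.
others = sum(v for k, v in example.items() if k != label)
print(label, kind, others)
```

With these invented numbers, A is the most frequent type without being a majority, and "everyone else" adds up to more than half simply because A falls short of half.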


Neither III nor IV points to a key feature of the data.  It seems ridiculous to even include them.  Let's reveal the underlying data.

TypeA2

Last week, a story coursed through the mainstream media, relating to the above projections published by the Census Bureau.   (Projections were created for 2050 but mention was made of the fact that the largest racial group would account for less than half the population by 2042.)  Here were some of the headlines:

"2042 to see a white minority" (New York Post, 8/14/2008) -- III

"Minorities fixed to become new majority" (Daily Vidette, Illinois State University, 8/20/2008) -- IV

"US set for dramatic change as white America becomes minority by 2042" (Guardian, 8/15/2008) -- III

"...minorities collectively will make up the majority of people in America by 2042..." (Detroit Free Press, 8/21/2008) -- IV

Like I said, statement III is, strictly speaking, true, but by 2042 every race is projected to be a minority.  Statement IV is just odd: of course, if one adds up enough "minority" types, one will eventually attain a majority.

Not all is lost, however.  The following headlines painted a more vivid image:

"Whites to lose majority status in US by 2042" (Wall Street Journal, 8/14/2008)

"White Americans no longer a majority by 2042" (Associated Press, 8/13/2008)


Elsewhere, a Boston Globe column makes an important observation: Hispanic whites should probably be grouped with whites rather than Hispanics.  Technically, the columnist argued, Hispanic is not a race.  From that point of view, the pie chart looks like this:

TypeA3


Off-sheet accounting

For mid-week entertainment, this full-page ad appeared in the Wall Street Journal recently:

Nasdaq_ad


The vertical axis says "% NYSE of All Market Share Volume".  The time-line is from July 05 to beyond July 08.  The text in the black box is "Matched Market Share: July 11, 2008".

When it's so obvious, it's probably not obvious.  The big story is off the chart: what happened to the other 50% of the volume over the years?

Faced with this, one reaches for the pie chart (... almost).


Small add (8/21/2008):

Redo_nasdaq


Olympic tallies

Andrew N., a reader from Australia, wasn't too impressed with the way National Nine News presents the Olympic medal table on its home page.  To the extent that we want to venture beyond the typical tabular presentation, this bar chart is in fact quite appropriate.  Let me explain.


Ninemsn_tally

Let's take a tour around the world.  It's the battle of the data tables.

World_tally  

The Boston Globe's is the cleanest of the bunch.  I especially like the way they set up the USA count at the top; however, their use of country codes is inferior to spelling out country names, as done in all of the other examples.  The New York Times is the only one to utilize colors to set aside gold, silver and bronze, which lets readers easily assess the two dominant metrics, total golds and total medals.  A small touch but very nice.

The biggest design issue here is the existence of the two different metrics.  In any tabular presentation, the countries can be ranked by only one metric so the designer must make a choice.  The American papers present ranking by total medals; the French paper by total golds; the two Canadian ones shown here are split.  The American papers also choose to carry the ranking implicitly while the others explicitly provide a numerical rank.  Le Monde and Globe and Mail provide ranks that are consistent with ordering of countries, both by total golds.  The Star, by contrast, wants it both ways: the order reflects total medals while the "POS" column shows total golds.  This extra column does help the readers who prefer ranking by golds but the primacy of the other ranking has not been overcome.
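To see how the two metrics pull a table in different directions, here is a small sketch in Python.  The country names and medal counts are hypothetical (not the actual 2008 table), and the function names are mine.

```python
# Hypothetical medal counts as (gold, silver, bronze); not real 2008 data.
medals = {
    "Country X": (10, 5, 5),
    "Country Y": (8, 10, 9),
    "Country Z": (9, 3, 2),
}


def rank_by_total(table):
    """Order countries by total medals, breaking ties by golds."""
    return sorted(table, key=lambda c: (sum(table[c]), table[c][0]), reverse=True)


def rank_by_gold(table):
    """Order countries by golds, breaking ties by total medals."""
    return sorted(table, key=lambda c: (table[c][0], sum(table[c])), reverse=True)


# A table in the style described for The Star: rows ordered by total medals,
# with a separate position column computed from the gold ranking.
gold_pos = {c: i + 1 for i, c in enumerate(rank_by_gold(medals))}
for country in rank_by_total(medals):
    gold, silver, bronze = medals[country]
    print(gold_pos[country], country, gold, silver, bronze, gold + silver + bronze)
```

With these invented counts the two orderings disagree on every row, which is exactly the situation in which a position column based on one metric sits awkwardly in a table sorted by the other.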

So what about National Nine News?  I have not been a fan of stacked bar charts but surprisingly, this is a great application.  Stacked bars have the disadvantage that the stacked segments don't share the same base and thus it is difficult to compare their lengths.  Here, though, our two metrics are total medals and total golds so readers should be drawn to compare the total lengths, and the lengths of the first segments.  Those wanting to compare silvers and bronzes must make a stronger effort but they will be in the minority.

What can be improved are the distracting data labels, especially the gold circles.  Instead, one should provide a scale, or use symbols such as one circle per medal.  (See this old post.)  Here is a version with a scale:

Redo_tally


One cannot end this post without mentioning the attempt by NYT editors to insert levity into these proceedings with first a cartogram and then a bubble chart.

Nyt_map1tally


Nyt_map2tally


The dog ate the margins

In his column on automated polls versus traditional telephone polls, the Numbers Guy at the Wall Street Journal gave us a few entertaining quotes.

"The dog could be answering the questions," Ann Selzer, a traditional pollster, said of automated polling, which is conducted through automated voice messages to voters, who record their responses.  Also, WSJ cited a prominent textbook which labelled them as "Computerized Response Automated Polls -- insulting acronym intended."

Reader Mark A. brought this to our attention because of the following chart.  He wondered what the point of the vertical axis was. 

Wsj_polls

Aside from that cosmetic problem, the biggest issue is the lack of explanation.  Predictive power, pollster-introduced error, methodological error: what are these?  The article itself gives no clues.  To make sense of the chart, readers need to consult Nate Silver's (excellent) site, fivethirtyeight.com.  (The gory details here.)

By the way, Nate's site has a variety of nicely produced charts.  (As with this one, readers will need to dig around to collect background information to interpret some of those charts.)


Another improvement is to provide some sense of the variance in the data, either by showing more than the top five pollsters or by showing the range of errors.  Since the average pollster sits on the right edge, it is as if the right half of the chart were clipped.  In the version below, we found most polls hovering around the average, with two egregiously bad ones.

If we know which polls are automated and which aren't, then color the dots accordingly.

Redo_polls

There are bench players on every chart: these are the titles, axes, labels, text and so on.  They provide background information required to interpret the chart.  They may sit in the margins but their value is not to be underestimated.

Don't let the dog eat the marginal information.

Reference: "Press 1 for Obama, 2 for McCain", Wall Street Journal, Aug 1 2008.


A tale of two charts

By now, everyone knows subprime mortgage lenders in the U.S. are in a world of hurt.   The following pair of charts illustrates how serious the problem is.  Lenders track the proportion of borrowers who are "60-day" and "90-day" delinquent, meaning late or no payment in the last 60 or 90 days.  Lenders count on a contained delinquency rate in order to run a viable business.  These loans often stretch for 20-30 years so it is crucial to catch problems early.

Delinquency
This trend can be seen in the IMF chart (right).  Take the 2000 vintage of subprime borrowers.  The peak 60-day delinquency occurred around 45 months after loan origination, and then tailed off.  (A 60-day delinquent borrower will eventually become 90-day delinquent, or less likely, non-delinquent, in either case causing the curve to tail off.  The tailing off feature is, in fact, undesirable, and can be removed by plotting delinquency rates of 60 days or more, as opposed to just 60 days.  This is what the NYT chart on the left did.)

What do these graphs say about the current malaise?

Nyt_delinquency2

The NYT chart (right) carries its information in the relative slopes of the lines.  The steeper the line, the faster borrowers are becoming 90+ days delinquent.  Another piece of information is when each curve starts to take off: the 2007 curve lifts off much earlier in the year than the 2005 and 2006 curves.  This last point is less secure because the graphic does not preclude the scenario in which loan origination is biased toward the latter part of the year.   Similarly, the crossovers between one curve and another tell us the extent of the problem compared to the past but again, the reader has to do much work to learn this information, as shown.

The IMF chart took a different view.  If we selected only the 2005-2007 curves and shifted the 2006 curve to the 12 month point, and the 2007 curve to the 24 month point, we would have recreated the NYT chart.  (We gloss over the matter of counting 90+ days rather than 60-day delinquency, and the other matter concerning the recency of the data used.)  Here, the key information is coded in the vertical distances between the curves.  Reading along a vertical line as shown below, the reader can see that the 2006 and 2007 vintages have performed almost twice as badly right out of the starting gate while the 2005 vintage looked normal but worsened significantly after the two-year mark.

Imf_delinquency2
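The shift that turns one view into the other is simple enough to sketch.  The Python below is only an illustration: the delinquency rates are invented, the function name is mine, and it assumes every loan in a vintage originates at the start of that year (which, as noted above, the real data need not satisfy).

```python
def to_calendar(curves, base_year):
    """Re-index each vintage from loan age to calendar time.

    curves maps a vintage year to a list of (months_since_origination, rate).
    The result maps each vintage to (months_since_January_of_base_year, rate),
    assuming all loans in a vintage originate in January of that year.
    """
    shifted = {}
    for vintage, points in curves.items():
        offset = 12 * (vintage - base_year)
        shifted[vintage] = [(age + offset, rate) for age, rate in points]
    return shifted


# Invented delinquency rates (%), for illustration only.
by_loan_age = {
    2005: [(0, 0.0), (12, 2.0), (24, 5.0)],
    2006: [(0, 0.0), (12, 4.0)],
    2007: [(0, 0.0)],
}

by_calendar = to_calendar(by_loan_age, base_year=2005)
for vintage, points in by_calendar.items():
    print(vintage, points)
```

In the loan-age view, comparing vintages means comparing rates at the same age (a vertical distance); in the calendar view, the same points sit 12 and 24 months apart, and the comparison has to be read from slopes and crossovers.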

For lenders monitoring performance, the IMF chart is much more useful.  For someone wanting to know the current state of delinquency, the NYT chart is easier to work with.  It must be said that it is always better to plot more history and longer time horizons (like the IMF did).


Finally, can someone please prove a four-color theorem for graphs?  Spraying rainbow colors on charts is a bad habit (for example, also in the house price index charts).



Reference: "Housing Lenders Fear Bigger Wave of Loan Defaults", New York Times, Aug 4 2008; "IMF Sees World Growth Slowing, with U.S. Marked Down", IMF, Jan 29 2008.


A graphical playground

RichmondTom pointed us to this New York Times article, which made what I consider a sloppy argument for the inflation rate being overstated rather than understated.  This article was accompanied by a colorful and playful chart to describe which components make up the index, and which price changes were most drastic.

Nyt_inflation 

Like most "infographics", this chart has an interactive feature which you need to click here to sample: mousing over the strangely-shaped areas leads to annotations describing the weight of the relevant component, and the change experienced in that component during the past year.  Like Sherlock Holmes holding a magnifying glass, the reader can use the Zoom In, Zoom Out buttons to drill down.

Here is an expanded view on the color scale, with my comments.

Nyt_inflation_color

We have no issues with the blue or the red & orange regions, but why use light blue and light green for increases between 0 and 4 percent (when all other increases use red)?  Readers not looking at the color palette will be caught off guard, misinterpreting the data.  This is a big problem since those two appear to be the most frequently occurring colors on the chart.

The designer of this chart surely is aiming for entertainment, creating a playground for the data sleuth.  It succeeds if readers take up the invitation to explore the data.  Unfortunately, it is not the most accessible for more sophisticated numerates interested in understanding the data.

It is impossible to ascertain the relative size of pieces that are close in size but drawn in arbitrary shapes.  We can roughly read out which component is most important to the index: it turns out to be "owner's equivalent rent" by a mile.  Owner's equivalent rent, as discussed by Leonhardt, is the most problematic and inaccurate component, and also, I might add, most susceptible to manipulation since it is an estimate, not a measure.  The next biggest pieces -- food and gasoline (excepting rent) -- are taken out because of "volatility".

The most interesting information in this data set is missing: how much does each component contribute to the overall inflation rate?  This requires multiplying the weight of each component by the change of each component.
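As a rough sketch of that calculation in Python: the weights and price changes below are invented for illustration (they are not the figures in the NYT graphic), the function name is mine, and the weighted sum should be read as an approximation to the overall rate rather than the official methodology.

```python
def contributions(components):
    """Contribution of each component, in percentage points.

    components maps a name to (weight, pct_change), where weight is the
    component's share of the index (fractions summing to about 1) and
    pct_change is its price change over the year, in percent.
    """
    return {name: weight * change for name, (weight, change) in components.items()}


# Invented weights and changes, for illustration only.
basket = {
    "owner's equivalent rent": (0.25, 2.0),
    "food": (0.15, 5.0),
    "gasoline": (0.05, 20.0),
    "everything else": (0.55, 2.0),
}

parts = contributions(basket)
overall = sum(parts.values())

# List components from largest to smallest contribution.
for name, points in sorted(parts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {points:.2f} points ({points / overall:.0%} of the total)")
print(f"overall: {overall:.2f}%")
```

In this made-up basket, gasoline has the smallest weight yet contributes roughly as much as the largest component, which is the kind of insight neither the weights nor the changes reveal on their own.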

It would also be helpful to label which components are taken out of the Bernanke calculation that claims only a 4% inflation when my cable company seems to have raised prices every six months, and groceries and meals and so on are rising at 10-20%.  (Of course, Leonhardt thinks this is all loss aversion.)



Reference: "Seeing Inflation Only in the Prices That Go Up", New York Times, May 7 2008.


Proof of rampant U.S. deflation

From the San Diego Union-Tribune via the Big Picture, irrefutable proof of steady deflation due to enlightened government policy!

Sd_inflation

Food for thought for those curious about U.S. economic statistics.  While conventional wisdom (or publicized rationale) often claims that the recent adjustments to the core methodology remove components that are "more volatile", the evidence here suggests that the new method basically removes 3% from the previously computed rates.  Remarkable how stable this difference is over time.

This is a fantastic chart that makes its point loud and clear.  The secret is in picking the right comparables.  To vet the data analyst with shifting metrics, it is not necessary to prove that the new metric is or is not more accurate than the old.  Oftentimes, by tracking both the old and the new, the effect of the change is revealed.  Here, the adjustments just keep taking the rates down.


Reference: "The Fed's inflation gauge isn't realistic, critics say", San Diego Union-Tribune, April 17, 2008.


Rant of the day: Typepad's new editor continues to impress -- changing font size wipes out all hyperlinks in the text being formatted!