
Worthy of the Times

Andrew Gelman has put up a great post, discussing how he collaborated with the New York Times editors to transform his chart into a publishable form.  Some time ago, there was a discussion here about the publish-worthiness of my charts (actually, the lack thereof), and I explained that to take my charts to that level would require quite a bit of extra work.  Gelman speaks here from first-hand experience.

Here are the two versions side-by-side.  Readers on the Gelman blog are debating which version is "better" with Gelman himself voting for the revised NYT version:


Let's compile a list of the changes:

  1. Removed 2000 data and relied on only 2004 data
  2. Reduced number of groups of senators from 7 to 5 (with a special calling out of Senator Lieberman)
  3. Re-ordered senators to facilitate classifying into Democrats, Independent, Republicans
  4. Added annotation explaining the grouping of senators; in particular, qualified why the 49 Democrats and 32 Republicans were set aside
  5. Removed scale labels on the top of the chart, retaining only the bottom labels
  6. Vastly increased the area devoted to text labels, which now covered half of the chart
  7. Added white vertical gridlines
  8. Instead of coloring the cross marks, colored the background of the chart; introduced the yellow color for Lieberman (which Gelman originally colored blue)
  9. Removed reference to the national average but explained clearly in the legend that the percentages are relative to the national average.

Of these, 2, 3 and 4 are definite improvements. 1 is acceptable since the 2000 and 2004 data are quite highly correlated (note, however, Iowa, Virginia and Ohio). 7 looks brilliant here although we typically don't recommend adding gridlines and such. 5 is also acceptable because of 7 -- with the gridlines, one doesn't need two sets of labels. 8 can be debated: I'd like to see a version with large colored squares instead of cross marks in lieu of the background; the current look is not bad. 9 can also be mulled over: would it be so bad to insert 73% in parentheses after the words "national average" in the legend?  I suspect some feel that this may cause confusion because readers would then need to understand relative versus absolute values. 6 is the most debatable because the data-ink ratio has been vastly reduced.

There are several good features of the original chart that they left unaltered, which should also be duly noted:

  • Allowed the positive side of the scale to extend to 10% even though the largest data only reached about 6%. While this creates empty space, it helps readers judge the magnitude of negative numbers relative to positive ones.
  • Subdued horizontal gridlines
  • Retained the title of the chart (I actually prefer that they use the title to summarize the finding as opposed to the current one which takes a neutral stance.)
  • Retained the horizontal scale labels at 5% apart
One final comment on something omitted from both charts: I'd prefer that they plot all the dots for the 49 + 32 senators, on the top and bottom lines respectively, instead of just plotting the group averages.  It would be really interesting to see what the spread is around the 0% point.


In an article called "Off the Charts", Floyd Norris wanted to let readers know that unemployment does not hit citizens equally -- it affects some age groups and men/women to differing degrees.

As befits the article's title, he included several charts, from which I extracted the one shown on the right.  At first glance, this seems like a normal chart.

But when one pays attention, one notices that the chart is rather complicated.  This chart is like a piece of modern music, in which the composer allows two voices to jar and talk past one another.

Think of it as a data table vying for attention with a bar chart.  The data table is a cross-tabulation of the change in employment by age and gender.  In this view, the men sit on the left, and the women on the right.

Lurking around is a bar chart, for which the point of zero change sits in the middle.  Positive growth extends to the right, while negative growth points to the left.  The gender labels at the top are irrelevant for this bar chart: the narrow black bars indicate women, the fat colored ones, men.  The data labels are also irrelevant: in the 45-54 age group, for example, the label for females, at -2.3, should really be placed on the left side of the middle divide!

Here is how these two charts look, disentangled: (I have converted the bar chart to a dot chart.)




Reference: "Off the Charts: Job losses mount, enduring and deep", New York Times, Nov 14 2009.

Text of a novel, centered

How would a novel read if all the text were centered?  I am aware that most Westerners read from left to right, and some Easterners read from right to left.  Not until now am I educated on the third way, which is to center everything.

Here is the object of study, courtesy of the Wall Street Journal:


The point of the chart is to compare Samsung's revenues with those of key competitors in each category.  The comparative information is the difference in lengths between the pair of bars.  This difference is halved and then tagged on to each end of the bar.  This requires reading from outside in.
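The halving can be checked with a little arithmetic. Below is a minimal sketch, assuming two made-up revenue figures (they are not taken from the WSJ chart): when both bars are centered on the same axis, the overhang visible at each end is only half of the true gap, whereas left-aligned bars would show the whole gap at one end.

```python
def centered_bar(length):
    """Return the (left, right) edges of a bar of this length centered on zero."""
    return -length / 2, length / 2

# Hypothetical revenues in arbitrary units; NOT figures from the WSJ chart.
samsung, rival = 10.0, 6.0

s_left, s_right = centered_bar(samsung)
r_left, r_right = centered_bar(rival)

# What the reader sees at each end of a centered pair of bars.
overhang_each_side = s_right - r_right
# What the reader would see at the right end if both bars started at zero.
left_aligned_gap = samsung - rival

print(f"overhang per side (centered): {overhang_each_side}")
print(f"gap at one end (left-aligned): {left_aligned_gap}")
```

With these made-up numbers, the reader has to add up two small overhangs to recover the one number the chart is supposed to convey.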

In order to double the fun, the data labels are pushed to opposite ends of the bars as well, perhaps to let us know that indeed, the companies under comparison are maximally different.

Furthermore, while the key comparison is between Samsung and its competitors, the Samsung bars experience a chameleon shift from blue to green to red to yellow.  The competitors' bars follow suit but wearing checkered skirts.

Because black labels jar with checkered fabric, the fabric must first be bleached before applying the labels (or price tags).

Beyond the amusement, there are some serious problems with the choice of data.  We are never told why Sony, Nokia, Intel and LG Display were chosen as foils to Samsung.  Are they the nearest competitors or the biggest players in their respective segments?  We question whether one quarter's revenue could be representative of the big picture.  We wonder about the effect of currency conversions since the chart encompasses at least three different currencies.  We ask if the definition of "consumer electronics" is the same between Samsung and Sony.

Reference: "Samsung's Swelling Size Brings New Challenges", Wall Street Journal, Nov 12 2009.

People named after drugs

Does any reader have suggestions for getting rid of the spam comments?  The Typepad spam filter has been defeated by an avalanche of spam -- if not, the world has recently fallen in love with names of pharma drugs.


Here are some of my favorite links from other places:

A spatial journey illustrating a very long scale, created by the Genetic Science Learning Center (here)

Long scales are very difficult to deal with in charts; I have never been satisfied with log scales since they address the designer's challenge of trying to fit everything onto one page, but do not deal with the reader's need to compare the elements accurately.

Not sure how this helps but perhaps some of you will figure it out

Tommi left a comment about this conceptual chart by xkcd, which has been making the rounds.  Fits into our Light Entertainment category.

Says there is no optimal chart type.  A type that works very well for one data set may get hopelessly cluttered for another, similar data set.

From fellow bloggers (especially Jorge), a whole series of views of the U.S. unemployment figures by state over time.  Alternatives that are much more interesting to look at than the typical line chart. Jorge even found something in Excel that looks good.

Following one's nose 2

This is the second post on the immigration paradox study, first discussed on the Gelman blog.  My prior post on the graphing aspect is here; this post focuses on the statistical aspects. I am working backwards on Andrew's discussion points.

Which difference is most interesting?

5. Agree with Andrew; they should publish similar analyses on other minority groups as soon as possible.  One thing that strikes me when looking at the interaction plot is that the U.S. born non-Latino whites have a much higher incidence of mental illness.  The difference between different subgroups of Latinos paled in comparison to the difference between non-Latinos and the Latinos.  This latter difference is more acute among the U.S. born than among the immigrants. The importance of the Latino analysis hinges upon whether the "paradox" is also found among other minority groups.

(Chris P also pointed this out in his comment on the previous post.)

Disaggregation, Practical Significance, and the Meaning of Not Significant

2. Andrew is also right in expressing moderate skepticism about this sort of disaggregation exercise.  He connects this to the subtle statistical point that "the difference between significant and not significant is not significant."  A related but less abstruse issue is that as one disaggregates any data, the chance of seeing variations that stray from the average gets higher and higher.  This is because the sample size is decreasing, and so the statistical estimates are less reliable.

(To give a flavor of the scale, there were a total of 2500 Latinos in the sample, with 500 Puerto Rican Latinos. The analysis drilled down to the level of different types of mental disorders, subgroups of Latinos, and also adjusted for demographics.  The details of the demographic adjustment are not available but in any case, one should be concerned about whether there were sufficient numbers of say, male immigrant Puerto Rican Latinos age 18-25 with income < $10,000 living in a rental apartment, for such an elaborate exercise.)
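To put rough numbers on the reliability point, here is a minimal sketch using the textbook standard error of a proportion under simple random sampling. The sample sizes of 2500 and 500 come from the paragraph above; the incidence rate of 30% and the cell size of 50 are my assumptions for illustration, and the actual study used demographic adjustments that this sketch ignores.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion p estimated from n respondents."""
    return math.sqrt(p * (1 - p) / n)

# Assumed incidence rate, for illustration only.
ASSUMED_RATE = 0.30

# 2500 and 500 are quoted in the text; 50 is a hypothetical fine-grained cell.
groups = [("all Latinos", 2500), ("Puerto Rican", 500), ("hypothetical cell", 50)]

for label, n in groups:
    margin = 1.96 * standard_error(ASSUMED_RATE, n)
    print(f"{label:>18}: n={n:4d}, approx. 95% margin of error = +/-{margin:.1%}")
```

Under these assumptions the margin of error grows from roughly 2 percentage points for the full sample to roughly 13 for a cell of 50, which is why estimates for finely sliced subgroups wander so far from the average.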

Expanding on this point further, one observes that the measured gap between U.S. born and immigrant Puerto Rican Latinos was about 5%.  But this 5% is probably of considerable practical significance since the base rate of incidence is about 30% (I say probably since I am not an expert in mental illness).  The current statistical analysis judged this to be insignificant -- if the sample size were larger, this difference could conceivably be statistically significant, and also practically significant.

But doesn't the significance test deal with the small sample size problem?  Yes, if the authors had merely described the Puerto Rican result as inconclusive.  Here, as is done very commonly, insignificance is equated to "no difference": they said

No differences were found in lifetime prevalence rates between migrant and U.S.-born Puerto Rican subjects.

In reality, a difference of 5% was found in the sample that was analyzed.  The statistical procedure found that this difference could have been a result of chance -- notice "could", not "must".  If the measured difference were 0.5% on 30%, then I might be willing to accept a finding of "no difference"; when it is 5% on 30%, I would like to see a larger sample analyzed.
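Here is a minimal sketch of how the same 5% gap can flip from "not significant" to "significant" purely because of sample size. It uses a plain pooled two-proportion z-test, which is not the authors' method (they adjusted for demographics). The rates of 30% and 35% and the group sizes are my assumptions: the paper's exact subgroup counts are not given here, so I split the 500 Puerto Rican respondents evenly for illustration.

```python
import math

def two_proportion_p_value(p1, n1, p2, n2):
    """Two-sided p-value from a pooled two-proportion z-test (normal approximation)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # P(|Z| > |z|) for a standard normal equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Assumed rates: a 5-point gap on a base of about 30%.
RATE_A, RATE_B = 0.30, 0.35

# Hypothetical group sizes: an even split of 500, then ten times as many.
small_sample = two_proportion_p_value(RATE_A, 250, RATE_B, 250)
large_sample = two_proportion_p_value(RATE_A, 2500, RATE_B, 2500)

print(f"250 per group:  p-value = {small_sample:.3f}")
print(f"2500 per group: p-value = {large_sample:.5f}")
```

Under these assumptions the gap is far from significant with 250 per group and comfortably significant with 2500 per group, even though the measured difference is identical. "Not significant" here says more about the sample than about the difference.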

The Meaning of Paradox

1. Andrew was perplexed by why the phenomenon is known as a "paradox". I had the same issue until I read the paper. The authors were a bit sloppy in the abstract. In the paper itself, they explained that the conventional wisdom has it that immigrants should be more likely to have mental illness because of the stress from the immigration process, and yet the statistics showed the exact opposite. That is the paradox.

Publication Bias

I was a little shocked to see the data tables that gave all the estimates of the various effects at the various subgroup levels: shocked because the authors were allowed (or asked) to include only the p-values that were below some unspecified level (which I surmised is 10% although a 5% significance level is used to judge significance as per convention). This is publication bias within publication bias. P-values that are not significant still provide valuable information and should not be omitted. They did provide confidence intervals but for each subgroup separately, rather than for the difference -- and as they noted, such intervals by themselves are inconclusive when they overlap moderately.
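To see why subgroup intervals are a poor substitute for an interval on the difference, here is a minimal sketch using normal-approximation (Wald) intervals. The rates of 30% and 35% and the sample size of 1000 per group are hypothetical values chosen to make the point; they are not taken from the paper's tables.

```python
import math

Z95 = 1.96  # normal quantile for an approximate 95% interval

def proportion_ci(p, n):
    """Approximate 95% Wald interval for a single proportion."""
    half = Z95 * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def difference_ci(p1, n1, p2, n2):
    """Approximate 95% Wald interval for the difference p2 - p1."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    return diff - Z95 * se, diff + Z95 * se

# Hypothetical subgroups: 30% vs 35% incidence, 1000 respondents each.
lo_a, hi_a = proportion_ci(0.30, 1000)
lo_b, hi_b = proportion_ci(0.35, 1000)
lo_d, hi_d = difference_ci(0.30, 1000, 0.35, 1000)

print(f"group A: ({lo_a:.3f}, {hi_a:.3f})")
print(f"group B: ({lo_b:.3f}, {hi_b:.3f})")
print(f"B minus A: ({lo_d:.3f}, {hi_d:.3f})")
```

With these made-up numbers the two subgroup intervals overlap, yet the interval for the difference stays above zero. A reader given only the separate intervals could not tell, which is why the interval for the difference is the one worth publishing.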