
From fellow readers

Have been getting a lot of reader suggestions lately. Thanks to all of you! Some of these, for various reasons, I won't be able to write full posts on. But they are still worth looking at.

Julien D. invites us to interact with a different community, graphic designers -- that's graphics as in fine arts, not statistical charts. This is a great post concerning how they would re-design boarding passes. This is akin to Ed Tufte's effort to re-design bus schedules.

While there are no statistics involved in these designs, the virtues of clean, simple, direct communication of data to people still apply. Pretty is also desirable.


I first saw this infographic of U.S. tax brackets on Felix Salmon's Twitter feed (he found it "pretty"). Now it has landed in my inbox via Troy O. My reaction to Felix was "pretty but pointless". Sunset through a smoky lens? Impressionist painting? Scatter plot? Density plot? This one is crying out for your comments.



Julien D., same person as above, also pointed us to an animation of the re-birth of European air routes after the volcanic ash scare.  (Vimeo link)


For this sort of exercise, my main interest lies in admiring the behind-the-scenes effort to collect the data, clean and process it, and code the graphics and animation. I'm not sure the animation tells us anything we don't know by reading the news.

Starving artists deserve better

I agree with reader Craig N.'s assessment that the "visualization" applied to this data made it even more difficult to understand than just the data table.  Much of this is due to failing the self-sufficiency test. There is no graphical element on this chart that can stand on its own without the support of the data itself.


The chart depicts royalty payments to music artists after the Digital Economy Act passed in the UK.  There are a few even larger bubbles at the bottom of the chart including one that is so large as to overstep the borders of the chart.

Lessons from propaganda

Political wordsmith (euphemism) Frank Luntz's presentation is all over the Web. I saw it on Business Insider. In the debate between words and numbers, Luntz obviously takes the side of words.

He used a few simple charts in the presentation, which is interesting by itself since he fundamentally is a words guy, not a numbers guy.

The charts, while simple, are very instructive:

This bar chart sent me running to the maligned pie chart! (Almost; read on.)

While the total responses were almost evenly split among the three choices, the bar chart drew our attention to the first bar, which is inapt.

If plotted as a pie chart, I thought, the reader would see three almost equal slices. This effect occurs because we judge the areas of pie slices much less precisely than the lengths of bars. Wouldn't that turn our usual advice on its head?


How the Bar Chart is Saved

The one thing that the pie chart has as a default that this bar chart doesn't is the upper bound.  Everything must add up to 100% in a circle but nothing forces the lengths of the bars to add up to anything.

We save the bar chart by making the horizontal axis stretch to 100% for each bar.  This new scaling makes the three bars appear almost equal in length, which is as it should be.
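
To see why the axis range matters, here is a minimal Python sketch. The three percentages are hypothetical stand-ins (the actual figures on Luntz's slide are not reproduced here), and `gap_as_share_of_axis` and `text_bar` are illustrative helpers of my own naming. The sketch measures how large the gap between the longest and shortest bars looks relative to the full width of the plot.

```python
# Hypothetical responses to a three-way question, in percent (assumed values).
responses = {"Choice A": 36.0, "Choice B": 33.0, "Choice C": 31.0}


def gap_as_share_of_axis(values, axis_max):
    """Gap between the longest and shortest bars, as a fraction of the axis width."""
    return (max(values) - min(values)) / axis_max


def text_bar(value, axis_max, width=50):
    """Render one bar as text on an axis running from 0 to axis_max."""
    filled = round(width * value / axis_max)
    return "#" * filled + "." * (width - filled)


values = list(responses.values())

for label, axis_max in [("axis ends at the longest bar", max(values)),
                        ("axis stretched to 100%", 100.0)]:
    print(label)
    for name, value in responses.items():
        print(f"  {name} |{text_bar(value, axis_max)}| {value:.0f}%")
    share = gap_as_share_of_axis(values, axis_max)
    print(f"  gap between longest and shortest bar: {share:.1%} of the axis")
```

With these assumed numbers, the same five-point gap takes up a much smaller share of the frame once the axis runs to 100%, so the bars read as nearly equal.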


Another Unforgivable Pie Chart

On the very next page, Luntz threw this pie at our faces:

Make sure you read the sentence at the bottom.

It appears that he removed the largest group of responses, and then reweighted the CEO and Companies responses to add to 100%.

This procedure is always ill-advised: respondents answered with the full set of choices in front of them, and had they been offered only these two, they might well have answered differently.

It also elevated secondary responses while dispensing with the primary response.
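
The reweighting can be sketched in a few lines of Python. The shares below are made up for illustration (the slide's actual numbers are not reproduced here), "Largest group" is a placeholder label for the removed category, and `drop_and_reweight` is a hypothetical helper name.

```python
# Hypothetical poll shares in percent (assumed; not the figures on Luntz's slide).
full_poll = {"Largest group": 50.0, "CEOs": 30.0, "Companies": 20.0}


def drop_and_reweight(shares, dropped):
    """Remove one category and rescale the rest so they sum to 100%."""
    kept = {k: v for k, v in shares.items() if k != dropped}
    total = sum(kept.values())
    return {k: 100.0 * v / total for k, v in kept.items()}


reweighted = drop_and_reweight(full_poll, "Largest group")

for name, share in full_poll.items():
    after = reweighted.get(name)
    after_text = "dropped" if after is None else f"{after:.0f}%"
    print(f"{name:15s} full poll: {share:.0f}%   pie chart: {after_text}")
```

In this made-up case, a response chosen by 30% of all respondents is displayed as 60% of the pie, which is how a secondary response ends up looking like the majority view.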

Reader's indigestion

Reader Chris B. pointed us to this unfortunate chart, based on a one-question online poll conducted by Reader's Digest.

The data is highly structured: for each country, respondents, identified as male or female, are asked about their favorite methods to discipline their kids. (At first, I thought the "male" and "female" meant what methods they would apply to sons versus daughters but based on the summary paragraph, I now feel they refer to the genders of the respondents.)

The textual summary is extremely well-written, and successfully points to the most salient information (my italics and bolding):

Spare the rod, period. That's what parents across the globe told us when we asked how they discipline their children. Respondents in all 16 countries in this month's global survey picked a good talking-to as the best tactic for teaching a lesson, by a wide margin. Taking away a privilege placed second. Two other traditional forms of discipline -- sending kids to their rooms and spanking -- were the least favored choices in all but two countries. Among respondents who did favor physical punishment, men outnumbered women in every country except Canada, France, and India. Not a single woman in the United States expressed a preference for spanking.


Unfortunately, the graphical summary is a complete failure.

One feature plotting against the designer is that the general profiles of the responses are very similar between countries, and so the differences are well hidden inside this small-multiples display.

It also takes on an elongated form, making it almost impossible to compare the top two countries with the bottom two countries.

When data has such strong structure, it is a blessing to the chart designer. In the first chart, I made a set of profile charts, in small multiples. On average, parents everywhere act very similarly. There are some subtle differences: one common pattern, occurring in the Philippines, Malaysia, India, France, Brazil, etc., is the preference for a talking-to over all other methods; another pattern, applying to the Netherlands, Spain, Australia, Canada, etc., is a talking-to, followed by taking away privileges, with sparing use of the other two methods.


In some countries, like Australia, Brazil, Canada, Spain, Italy, etc., the gender of respondents mattered little, but in the United States, for instance, female respondents were more likely to prefer a talking-to while men liked using sticks.

Is it really the case that parents punish sons and daughters using the same methods? This poll seems to think so.


If we want to expose the minute differences at the level of country-gender, then something like this would do:


The purpose is to surface any outliers. I really can't say there are any here. The supposed reversal of responses by gender in India, France, and Canada is hardly worth noting since the physical punishment category is barely used. (Reflection of reality, or response bias due to a sensitive subject?)

Notice that these new charts do not have the data printed on them - the graphical elements are sufficient to show what the data is; readers are not auditors.

Artistic license


Frequent contributor Bernard L. pointed me to this National Geographic "infographic". This surely belongs to the Art section of the infographics gallery, which I discussed in the "Whither Infographics" post. This fact is acknowledged by the editors, who labeled this "Art: Fish Pharm".

It's a very pretty picture. And I'm cool to turn a blind eye to:

  • the uneven sizes of the pills
  • the dislocated, non-contiguous areas (diphenhydramine)
  • the dual-colored area (green-yellow), especially as the same green represented a different pill
  • the water bubbles treated as part of the fish

but I'm still debating:

Is it an artistic license taken too far to imply that pharma chemicals have completely stuffed the fish (so much as to also infect the exhaled bubbles) when the text actually said the fish contained "traces of pharmaceuticals and toiletries"?

The footnote apologizes for the percentages not adding up to 100 percent, but 100 percent of what?


And by the way, this is the first time I have seen the word "pharmaceutical" used as a noun to represent medicines manufactured by pharmaceutical companies. As a noun, I understand "pharmaceutical" to mean a company that designs and makes medicines.

Another iPad post

A reader doesn't get why the blogosphere is excited about this infographic on iPad data. He asks: spoof or magnum opus? (This, I suspect, was what drove Phil Gyford over the edge.)


Let me unbox this example of infographics. (I suspect those in this business are not loving this chart either.)

The section with three columns of pie charts is a classic example of an almost-zero data-ink ratio, a la Tufte. This whole section contains a grand total of 15 data points; each pie has only one piece of information in it. The names of the competing tablet products are repeated three times, and take up more space than the data.

The titles of the three columns ("Purchase Intent", "Aided Awareness", "Researched Online") are technical jargon used by the market research community, especially the first two. A footnote explaining what they mean would help a lot.

It also fails my self-sufficiency test. The 15 data points are directly printed on the chart next to the 15 pies. Pick one!

And what a pity, there is an interesting story trying to come out from the clutter:

  • The biggest differentiating factor of the iPad and the Kindle is the iPad's much higher awareness among the survey respondents.
  • By contrast, the gap in purchase intent is much less pronounced.
  • And despite the dispersion in awareness, all of these products are quite heavily researched online (is this a problem with the data? is this reflecting some kind of consumer behavior? the impact of shopping comparison sites?)


Moving on to the hemisphere section. Again, the data-ink ratio is zero (after rounding).

The two data points were: 300,000 units sold in one day (iPad), and 1 million units sold in 70 days (iPhone).

The only thing you can do with this pair of numbers is to figure out the average number of units sold per day. And even this is a silly idea because the rate of sales follows an exponential decay curve, so averaging over 70 days and comparing that to the first-day sales is essentially meaningless.

The puzzle concerns the relative sizes of the hemispheres. The ratio of the radii is approx. 1.46 to 1, which means the ratio of the areas is 2.14 to 1. I couldn't figure out which numbers give a ratio of 2.14 to 1.
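
For anyone who wants to retrace the arithmetic, here is a rough Python sketch. The radius ratio is the approximate measurement quoted above, and the list of candidate ratios is my own guess at what a designer might have encoded, not anything stated on the chart. Squaring 1.46 gives roughly 2.13, in line with the 2.14 quoted above given that the radii were only measured approximately.

```python
import math

# The two data points quoted on the chart.
ipad_units, ipad_days = 300_000, 1
iphone_units, iphone_days = 1_000_000, 70

# Ratio of radii as measured off the chart (approximate).
radius_ratio = 1.46
area_ratio = radius_ratio ** 2  # areas scale with the square of the radius

iphone_per_day = iphone_units / iphone_days

# Ratios one could plausibly form from the two data points (assumed list).
candidates = {
    "iPhone units / iPad units": iphone_units / ipad_units,
    "iPad first-day units / iPhone average per day": ipad_units / iphone_per_day,
    "iPhone days / iPad days": iphone_days / ipad_days,
}


def close_to(value, target, rel_tol=0.05):
    """True if value is within rel_tol (5% by default) of target."""
    return math.isclose(value, target, rel_tol=rel_tol)


print(f"area ratio implied by the radii: {area_ratio:.2f}")
print(f"iPhone average units per day: {iphone_per_day:,.0f}")
for name, ratio in candidates.items():
    verdict = "matches" if close_to(ratio, area_ratio) else "no match"
    print(f"{name}: {ratio:.2f} ({verdict})")
```

None of these candidates lands within 5% of the area ratio, so the puzzle stands.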


As for the paired column chart, it deserves a pass, although whenever I see these, I want to replace them with two lines.

The amusing part of this section is the gigantic number 7 to let us know that survey respondents were asked to rate on a 7-point scale. In the meantime, the data on the chart say "Likely" and "Unlikely", and readers are left to ponder how these categories map to the 7-point scale.

It's also a riddle for readers to figure out how the usage categories were sorted.


One more while the night is young.

In this section on some statistics from April 4, the larger the underlying data, the less attention it receives. It's 3,000 apps in the store, 300,000 devices sold, and 1,000,000 apps sold.


I must say, it is a good intention to try to make these otherwise dull statistics come to life for people. But the graphical constructs must be in the service of the data.

Whither infographics

Reader Aleks found this infographics poster (by Phil Gyford) which is sure to excite (or exercise) some people.

I admire the work of infographic artists in processing and structuring huge amounts of data. I think many such presentations, especially the interactive ones, are terrific in empowering readers (users?) to slice and dice data.

Some infographics are produced by people who probably see themselves as artists first, and the charts as objects of art. (Diagrams of network structure come to mind.) That's fine, too. And I can appreciate them as if I am in MoMA.

Some infographics are daunting works of blood and sweat. They make our jaws drop; we wonder how they did that. They remind us of the Great Wall of China, the pyramids, etc. The layers upon layers of details are there to dazzle us, to prove the point.


But I like my charts to tell me something important about the data. I want charts to be "self-sufficient". If readers must consult the raw data (printed on the chart) in order to get the message, then all the graphical constructs (bars, dots, axes, etc.) are redundant!

In addition, I don't like to make readers do a lot of work. The task of extracting insights from the data should fall to the designer, not the readers.


That said, judging from the circulation of infographics on the Web, these displays clearly are popular so one can't argue with that.

What's your view of the state of infographics? Love it or hate it?

 PS. Robert at EagerEyes made comments on this topic, likening visualization to a "cargo cult". He calls for drawing a line in the sand. My feeling, as outlined above, is that there could be different classes of infographics -- my personal interest is in those that contribute to effective communication but others may be interested in artistic rendering, story-telling and exploration, technical wizardry, etc.

Book news

Over at the Numbers Rule Your World blog, I just announced that the Kindle version of the book is now available from Amazon. Many of you pestered me about the Kindle version, and finally it's here.

For those not aware, I have been publishing statistics-related posts on the sister blog. You can click here or on the tab labeled "Book Blog" above, or on the button on the right, nicely designed by my designer-friend Amanda.

Recent posts have dealt with credit scores used by employers, interpreting the placebo effect as a case of regression to the mean, the collective mis-reporting of retail sales growth by the media, and the intricacies of processing climate data.

If you haven't already, bookmark the sister blog, or subscribe to its RSS feed. I also have a twitter page, and even a Facebook fan page.


At least a few of you have read the book, and even contributed a review or two to Amazon. Thank you very much!

The book reviews have been gratifying, and several reviewers made connections to Freakonomics and Malcolm Gladwell. Comments included "easy-read", "engaging", "surprisingly accessible", "honest", "clear and insightful", "a joy to read", "fun", "entertaining", "extremely insightful".

Five readers -- which turned into six due to a logistics issue -- won free signed copies of the book. (Please be patient, the orders are still being processed.) Congratulations and thanks to McGraw-Hill for supporting this effort.

I continue to place signed copies of the book at the McNally-Jackson bookstore in New York City. They also take delivery orders from anywhere in the U.S.


On April 30, I will be giving a talk at NYU's Stern School on "Five Years of Chart Reading" (Kaufman Management Center, 5-90 at 11:30 am). This is a joint event with Dona Wong, the author of The Wall Street Journal Guide to Information Graphics. Please come and see us.