Here is some interesting reading from other places:

Tag clouds have caught on since we approved them a while ago. One interesting use was at the Life Vicarious blog, which uses them to compare the inclinations of three New York-based restaurant reviewers. What they should have done is remove irrelevant words like "one", "also", "many", "make"/"made", etc. In statistics, this is called removing "noise", which helps bring out the "signal".
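For anyone who wants to try the noise removal at home, here is a minimal sketch in Python. The stop-word list is my own assumption, seeded with the words mentioned above, and the sample sentence is made up; a real tag cloud would need a much longer list.

```python
import re
from collections import Counter

# Hypothetical stop-word list, seeded with the words mentioned above.
STOP_WORDS = {"one", "also", "many", "make", "made"}

def word_counts(text, stop_words=STOP_WORDS):
    """Count the words in text, skipping anything in stop_words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)

# Made-up sentence, for illustration only.
sample = "One dish was also made with many spices; spices made this dish."
counts = word_counts(sample)
print(counts.most_common(2))
```

The counts that survive the filter are what should drive the font sizes in the cloud.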

Andrew Gelman discussed the NYT article that reported the finding of unexpected male bias in the children of Asian American families. He can be counted on to make useful comments on any accompanying graphics. He rightly pointed out that this is one example where the chart should not start at zero: the relevant baseline is 100 since the metric is essentially the over-age of males relative to females. I also agree that a line chart with a longer time series plotting percentages rather than over-age would work better.
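To see why the baseline matters, here is a rough sketch, assuming the quoted figure is the number of boys per 100 girls. The function name and the two ratios are hypothetical, made up for illustration and not taken from the article.

```python
def male_overage(boys_per_100_girls):
    """Excess of boys over girls, per 100 girls (baseline of 100)."""
    return boys_per_100_girls - 100

# Hypothetical ratios, for illustration only (not from the article).
low, high = 105, 150

# Bars drawn from zero compare the raw ratios...
ratio_from_zero = high / low
# ...while bars drawn from 100 compare the over-ages.
ratio_from_baseline = male_overage(high) / male_overage(low)

print(round(ratio_from_zero, 2), ratio_from_baseline)
```

With these made-up numbers, the taller bar is less than one and a half times the shorter one when drawn from zero, but ten times as tall when drawn from 100, which is the comparison the over-age metric is making.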

The racetrack chart made an appearance at Flowing Data. This one is even busier and just as impossible to decipher.

The race track chart was the topic of Jon Peltier's Chart Busters on June 25. Here's the link: http://peltiertech.com/WordPress/chart-busters-calorie-chart/

Posted by: teylyn | Jun 29, 2009 at 05:25 PM

The bar should still start at zero. It's just that the quoted number should be the relevant metric, which is the over-age of males relative to females.

The iron rule is that bars should be rectangles of visible length proportional to the number being depicted. If the graph doesn't show the whole bar, or shows a bar longer than the relevant metric, then the graph is wrong.

This goes for floating bars too. In fact, the concept can be neatly regularized by defining a vanilla bar graph as a "floating" bar graph whose bars happen to float on the base line. You wouldn't conceal parts of a floating bar, or stretch a floating bar down to the bottom of the graph, so why do it to a regular bar?

Posted by: derek | Jun 30, 2009 at 07:29 AM