


I'm no Tufte or "data visualization" expert, but I would go as far as adding empty circles or something similar, so the chart clearly shows that in some years ALL of the matchups were won by the underdogs. This would be very much in the same vein as "non-events (things not happening) contain important information".

Patrick Murphy

1) The left justification in each cell throws me off. Would having the dot(s) centered give a better view? As the eye flows down each column, the column would widen and thin like a river...

2) In the "mildly improved" chart, the Average row is very helpful. But this should have been done using dots, for example, 1 dot plus an eighth slice of a dot. It is a bit jarring to go from visual quantity (the space taken up by dots) to numeric (1.8 takes up as much visual space as 0.0). If the number is needed, it could go flush right in each cell.

3) There are a lot of cells in the background. I see vibrating intersections, just like in one of Tufte's illustrations. In a more detailed improved version, this should be fixed.

PS: Thanks for this blog -- it is very fun and informative!

Tony Kenck

I like your improvements. One piece of information that would be worthwhile for less informed readers is that the maximum possible dots in a square is 4.



I think Patrick's point 1) is unnecessary; left justification ought to be enough. I think his point 3) is the real source of his inability to read down the column. Instead of a grid of cell gaps both horizontal and vertical, maybe eliminate the horizontals? This should preserve the vertical gaps between years, eliminate the visual vibration, and smooth the path of the eye down the column so that center justification is no longer necessary.

Ultimately, what we have here is a simple small multiple of eight bar charts. The NYT presentation, although fun, has obscured that simplicity a little. I don't criticise the dots for not being bars, as I approve of integer bar and column charts emphasising their discrete nature by using separate objects with a 1:1 aspect ratio, or at least, if solid bars, making the bar equal in width to a length of 1.0.

(I haven't seen this idea in the literature, but it's my corollary to Cleveland's thing about line graphs having an average slope of 45°)

Rosie Redfield

The correct reference for Tufte's Challenger analysis is not his first book, The Visual Display of Quantitative Information, but his third book, Visual Explanations (1997).


When deciding whether a dot is an upset based on the number of dots in an average year, would the median dots per column be better than the mean, as means are skewed upward by exceptional years, when we want exceptional years to stand out from the average?

On the other hand, the mean does answer precisely the question "what are the chances of a game upsetting the seed order?". For 9-8 the answer is 50%, so an upset is a total non-surprise. For 10-7 the answer is 45%, so that's an almost total non-surprise. For 11-6 it's about 33%, so now we're getting into surprise territory.
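The arithmetic behind those percentages can be sketched in a few lines of Python. Each seed pairing is played four times a year, so the mean number of upsets per year divided by 4 is the upset rate. The per-year counts below are invented for illustration (they are not the NYT data), chosen only so that the mean comes out at 1.8; `upset_rate` is a hypothetical helper name.

```python
from statistics import mean, median

GAMES_PER_YEAR = 4  # each seed pairing is played four times a year

# Hypothetical upset counts per year for one matchup (illustrative, not real data).
upsets = [1, 2, 1, 4, 1, 2, 1, 1, 2, 3]

def upset_rate(counts, games=GAMES_PER_YEAR):
    """Share of all games in this matchup that were won by the lower seed."""
    return sum(counts) / (games * len(counts))

print("mean upsets per year:  ", mean(upsets))        # 1.8
print("median upsets per year:", median(upsets))      # 1.5
print("upset rate:            ", upset_rate(upsets))  # 0.45
```

With these made-up counts, the single exceptional year (4 upsets) pulls the mean above the median, which is the skew described above, while the mean is still the figure that converts directly into a percentage.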

I usually deprecate percentages, as they obscure variable sums by normalising over the sum, but here the sum (4.0) is not variable: should we therefore go for a percentage? It would also sneak in the meme that the total is always 4.0, as pointed out by Tony.

I agree that dots would be better than digits for the average (it's not like it will bust the column width, as totals might). That's a plus point for medians, as the fractional symbol, if needed, will always be a half, and not some more complicated object.
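To illustrate the point about fractional symbols, here is a rough Python sketch that splits an average into whole dots plus a leftover fraction. The helper names `dot_parts` and `render`, the glyphs, and the sample counts are all assumptions made for this example. For integer counts, the median leaves a remainder of either 0 or 0.5, so a single half-dot glyph is enough; a mean such as 1.8 would need some other partial symbol.

```python
from statistics import median

def dot_parts(value):
    """Split an average into (number of whole dots, leftover fraction)."""
    whole = int(value)
    return whole, value - whole

def render(value):
    """Draw whole dots, plus a half-dot glyph when the leftover is exactly 0.5."""
    whole, frac = dot_parts(value)
    return "●" * whole + ("◐" if frac == 0.5 else "")

# Hypothetical per-year upset counts (illustrative only).
even_years = [1, 1, 2, 4]     # even number of years: median may end in .5
odd_years = [1, 2, 2, 3, 4]   # odd number of years: median is a whole number

print(render(median(even_years)))  # one dot and a half dot
print(render(median(odd_years)))   # two dots
```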

Here's my version incorporating some of the suggestions in this comment thread: uses median; uses dots; removes horizontal gaps; includes empty circles. I think the empty circles make the chart too busy, so I made this version without them.

Andrew Gelman

I'd use the more conventional orientation and put time on the x-axis.


Thanks for making the updated versions. I must rescind my suggestion. I agree it does look too busy.




Jon Peltier

My problem with Derek's charts is that there is insufficient separation between years, and I have to think too much to consider year as a variable. I think I'd remove the gray background, space the rows more vertically, use solid dark circles for upsets and solid faint circles for non-upsets, and perhaps alternate between dark blue and dark gray for adjacent columns.


I agree with Jon about the dark and faint circles as a way of avoiding the cluttering that those loud open circles caused, but my technical skills failed when I thought about how to do that in an Excel table. That's not an excuse, as good graph design shouldn't be dictated by the technology immediately at hand.


Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.

Graphics design by Amanda Lee
