
As good as Bolt

The accomplished graphics team at NYT outdid themselves with this feature on the 100m dash through Olympic history (link). You should really go and check out the full presentation.


They start with a data table like the one shown on the right. It's a boring list of names and winning times by year and by medal type. What can one do to animate this data? The NYT team found many ways.

The presentation consists of a static dot plot plus a short movie.

They found many ways to convey the meaning of the tenths and hundredths of a second that separate the top performers. In the dot plot, for example, they did not draw the actual winning times. Instead, they converted the differences in winning times into distances. Here is the right section of the chart:


We are drawn into compressing time and place, having Usain Bolt race all of the former winners and assuming everyone ran the same race they did in real life. The dot plot tells us how far ahead of each past winner Bolt is.
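The conversion from time differences to distances can be sketched in a few lines. This is only a minimal sketch, not necessarily how the NYT team computed it: it assumes each runner moves at his average speed for the whole race, and the function name `meters_behind` and the 10.0-second comparison time are made up for illustration.

```python
def meters_behind(winner_time, other_time, race_length=100.0):
    """Distance (in meters) the slower runner still has to cover
    at the moment the winner crosses the finish line.

    Assumption: the slower runner holds his average speed throughout.
    """
    if winner_time <= 0 or other_time <= 0:
        raise ValueError("times must be positive")
    average_speed = race_length / other_time   # meters per second
    covered = average_speed * winner_time      # meters run when the winner finishes
    return race_length - covered


# 9.63 s is Bolt's winning time in the 2012 Olympic final;
# 10.0 s is a round number chosen purely for illustration.
gap = meters_behind(9.63, 10.0)
print(f"{gap:.1f} m behind")
```

Under this assumption, a gap of 0.37 seconds turns into a few meters of track, which is far easier to picture than a fraction of a second.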

Some time ago, I wrote about the "audiolization" of duration data, in another piece about a NYT chart (link). They deployed this strategy beautifully at the end of the short film. The runners were aligned like keys on a piano, and the resulting sound is like playing a scale across the keyboard. In a word, lovely.



The authors bring in a number of other data points to create reference points for understanding this data. For example, if you blink, you might miss the national jerseys worn by each winner in the hypothetical competition:


 Later, the dominance of American runners is plainly shown via white lanes:


 The perspective hides the relative impotency of American sprinters in recent Olympics. This view of the surge of Caribbean runners makes up for it:



Next, they compared the times for U.S. age group record holders to Olympic winning times. This is a fun way to look at the data. (Pardon the strutting Play button.)


They play with foreground/background here in an effective way. The 15- and 16-year-old age-group record holder is said to be "good enough for a bronze as recently as 1980".

Fun aside, think twice before you repeat this "insight". It falls into the category of things that sound impressive but are quite meaningless. For one thing, the gap between the two runners is affected by a multitude of factors: the age of the runner (which is elevated here over and above other factors), the nationality of the runner, and the year in which the race was run. This last point is key: if we compare the 15-to-16-year-old 100m record time from 1980 to the winning times of Olympic medalists from that year, the gap would be much wider.

Also, pay attention to the distribution of runners. It gets very crowded very quickly near the top end of the scale. In other words, while the gap as measured in fractions of a second may seem small, the gap as measured in individual athletes would be very wide: we'd find loads of athletes whose times fit into the gap illustrated here.


According to the dot plot, some Games, such as those in the 1950s, had no gold medalist. Looking at the data, I think this is an overplotting effect: two times were so close that the dots sit literally on top of each other. Only the top dot is visible, and which one lands on top depends on the software you're using. Jittering is one common strategy for dealing with this problem, or we can just place the gold, silver and bronze dots on their own levels. The latter strategy would look exactly like the top-down view used in the short film:


(We'll also note that this view has time running left to right, which is perhaps more natural than time running bottom up, as in the dot plot. However, we are used to seeing runners cross the finish line from left to right on a TV screen so this is a case of eight ounces and half a pound.)
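Returning to the overplotting problem, the two remedies mentioned above can be sketched as follows. This is an illustrative sketch only: the helper names `jitter` and `medal_lanes`, the jitter amount, and the tied times are all hypothetical, and the code produces plotting coordinates rather than drawing anything.

```python
import random


def jitter(positions, amount=0.3, seed=0):
    """Nudge each position by a small random offset so coincident dots separate."""
    rng = random.Random(seed)  # fixed seed keeps the layout reproducible
    return [p + rng.uniform(-amount, amount) for p in positions]


# One row per medal type (hypothetical layout).
LANE = {"gold": 0.0, "silver": 1.0, "bronze": 2.0}


def medal_lanes(medals):
    """Give each medal type its own row, so tied times can never overlap."""
    return [LANE[m] for m in medals]


# Hypothetical results for a single Games: gold and silver share the same time.
medals = ["gold", "silver", "bronze"]
times = [10.4, 10.4, 10.5]

# With no fix, all dots share one row, so gold and silver coincide.
same_row = [0.0, 0.0, 0.0]
print(list(zip(times, same_row)))

# Remedy 1: jitter the row position.
print(list(zip(times, jitter(same_row))))

# Remedy 2: one lane per medal type.
print(list(zip(times, medal_lanes(medals))))
```

Jittering keeps the chart compact but adds noise that carries no meaning; separate lanes cost more space but make ties explicit.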

In the short film, I find the gigantic play/pause button at the center of the screen an annoyance, ruining my enjoyment. (I'm using Firefox and a Mac.)


Now, go check out the entire feature (link), and applaud the effort.



derek cotter

Why do you call line charts bumps charts, and scatter plots dot plots? These are all different things.


Derek: turn this chart sideways so the time axis runs horizontally. Would you call that a scatter plot?

In the short film, I find the gigantic play/pause button at the center of the screen an annoyance, ruining my enjoyment. (I'm using Firefox and a Mac.)

That annoyed me as well. I only discovered that it disappears if you move the cursor away from the video after I had watched it, which isn't all that obvious. (Using Safari on a Mac.)


Kaiser, sorry, I meant to reply seriously to your question nearly a year ago, but work got in the way.

Would I call a time series a scatter plot? Absolutely yes. All time series are scatter plots to me, because time is a quantity, not an ordinal series of categories. Years can be presented as bins of time (which is why you can have a bar chart of years), but I'm sure your years scale doesn't belong in that tradition, even though you elided some years. More importantly, the dots are the record of single events, not an aggregate across events. Had they been "the mean of winning times in the year 19xx", that would have been a dot plot, though I would then have kvetched about the year axis looking suspiciously scalar.

This is something that annoys me about some BI apps, or at least the models users build with those apps: they fail to appreciate that time is not a set of bins. That approach leads to "time stamps" that are treated as categories even though they are specified down to the second, and there are no tools for subtracting times from each other to get intervals between two times, without resorting to convoluted text functions (Business Objects 5.0, I'm looking at you). MS Excel pivot tables get it right, funnily enough. Time is a quantity, and you can perform arithmetic on it. To arrange data into intervals of months or years, you Group the times, whereupon they become categories.
