So said a reader, Stephen B., of the following graphic (note: pdf) in the London Times concerning Andy Murray's recent tennis triumphs.
How can we disagree? Shocking? Yes. Failure? Definitely. Failing to communicate? No doubt.
Let's first start with the five tennis balls at the bottom. It fails the self-sufficiency test. It makes no difference whether the balls (bubbles) are the same size, or different sizes. Readers will look at the data and ignore the bubbles.
Amazingly, the caption said that "Murray has one of the best returns of serve in the game." And yet, the graphic showed the five players who were better than Murray, and nobody worse! For those unfamiliar with tennis statistics, it does not provide any helpful statistics like averages, medians, etc. to help us understand the data.
(The color scheme from light to dark: first, second, third, fourth round of tournament)
So we're told: the 75% of first-serve points won in the fourth round was 25.6% of the sum of the percentages of first-serve points won from first to fourth rounds (75%+70%+71%+76%). What does this mean? Why should we care?
The challenge with these two statistics is that they are correlated and have to be interpreted together. If a first-serve is won, then there would be no second serve, etc. Here's one attempt at it, using statistics from the Soderling-Federer match. It's clear that Federer was better on both serves.
Reference: "Murray's march to the last eight", London Times.
Andrew N., a reader from Australia, wasn't too impressed with the way National Nine News presents the Olympic medal table on its home page. To the extent that we want to venture beyond the typical tabular presentation, this bar chart is in fact quite appropriate. Let me explain.
Let's take a tour around the world. It's the battle of the data tables.
The Boston Globe's is the cleanest of the bunch. I especially like the way they set up the USA count at the top; the use of country codes is inferior to spelling out country names, as done in all of the other examples. The New York Times is the only one to utilize colors to set aside gold, silver and bronze, which lets readers easily assess the two dominant metrics, total golds and total medals. A small touch but very nice.
The biggest design issue here is the existence of the two different metrics. In any tabular presentation, the countries can be ranked by only one metric so the designer must make a choice. The American papers present ranking by total medals; the French paper by total golds; the two Canadian ones shown here are split. The American papers also choose to carry the ranking implicitly while the others explicitly provide a numerical rank. Le Monde and Globe and Mail provide ranks that are consistent with ordering of countries, both by total golds. The Star, by contrast, wants it both ways: the order reflects total medals while the "POS" column shows total golds. This extra column does help the readers who prefer ranking by golds but the primacy of the other ranking has not been overcome.
So what about National Nine News? I have not been a fan of stacked bar charts but surprisingly, this is a great application. Stacked bars have the disadvantage that the stacked segments don't share the same base and thus it is difficult to compare their lengths. Here, though, our two metrics are total medals and total golds so readers should be drawn to compare the total lengths, and the lengths of the first segments. Those wanting to compare silvers and bronzes must make a stronger effort but they will be in the minority.
What can be improved are the distracting data labels, especially the gold circles. Instead, one should provide a scale, or use symbols such as one circle per medal. (See this old post.) Here is a version with a scale:
One cannot end this post without mentioning the attempt by NYT editors to insert levity into these proceedings with first a cartogram and then a bubble chart.
Todd B didn't like this chart showing the correlation between baseball team salaries and their win-loss records.
A few problems are in plain sight:
Most importantly, putting a second set of logos next to the salaries column would really help.
It is unclear why the lines should be of varying widths.
Winning percentage is more telling than win-loss record, especially in the middle of a season when there is a slight imbalance in total games played.
The spread of salaries is so wide (10 times) that reducing the numerical scale to a rank scale means a big loss of information.
Each column is sorted by its own metric, while the most important sorting variable should be the slope of the lines (i.e. the cost per win).
The interactive feature of individual plots for each day (control bar at the top) of the baseball season is something of a gimmick. Props though for realizing that the first few days of the season don't tell us anything. There really is little use for investigating this correlation on a day-by-day basis, particularly when the salaries are given in aggregate.
On the diagram, the blue lines represent teams such as the Devil Rays and Arizona that had better winning records than their salaries would suggest. Red lines display those teams spending more money than their records would suggest. The steeper the line, the better (blue) or worse (red) the team's cost efficiency.
With so many long steep lines in both colors (directions), one might posit that a negative correlation may exist between salary level and winning record.
The following scatter plot suggests otherwise:
The correlation between salary and winning is very weak. If one were to fit a linear model, it would show that the higher-salaried teams generally were doing slightly better (black line). The Yankees were sufficiently outside the range in salaries that I didn't include them in estimating the line. (However, as the chart shows, the line in fact estimated the Yankees' winning percentage really well.)
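For readers who want to try this kind of fit themselves, here is a minimal sketch: an ordinary least-squares line estimated with one team held out, then used to predict that team. The team names and numbers below are invented placeholders, not the data behind the chart, and `fit_line` is just a helper name made up for this illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder (payroll in $ millions, winning percentage) pairs.
# These are made up for illustration; they are NOT the real team data.
teams = {
    "Team A": (40, 0.480),
    "Team B": (60, 0.500),
    "Team C": (80, 0.490),
    "Team D": (100, 0.530),
    "Big Spender": (200, 0.600),
}

# Leave the extreme-payroll team out of the fit, as described above.
held_out = "Big Spender"
xs = [pay for name, (pay, _) in teams.items() if name != held_out]
ys = [pct for name, (_, pct) in teams.items() if name != held_out]

slope, intercept = fit_line(xs, ys)

# Then check how well the line extrapolates to the held-out team.
payroll, actual = teams[held_out]
predicted = intercept + slope * payroll
print(f"slope per $1M: {slope:.5f}")
print(f"{held_out}: predicted {predicted:.3f}, actual {actual:.3f}")
```

Holding out the extreme point keeps a single team from dictating the slope; comparing its actual record against the extrapolated line is then a fair check on the fit.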
Teams above the line are performing better than their salaries would lead us to believe.
Junk Charts typically concerns itself with statistical graphics a la Tufte and Cleveland, treating charts as a means to summarize, elucidate and highlight aspects of data. We haven't been too kind on so-called infographics, often finding these cluttered and confusing. Recently, I have had a small change of heart.
I now see infographics as innovative in one way, and a complement to traditional graphics. This is the idea of graphs as catalogs. What many of these graphs try to do is to present a structured way for users to explore massive amounts of data. They don't serve the traditional purpose of summarization and that's why they are innovations.
The following chart from NYT tracing Serena Williams' tennis ranking prompted this post.
As a traditional statistical graphic, this chart leaves much to be desired. The general outline of her career could be described in one sentence without the need for any graphic. The colorful vertical lines serve little purpose, nor do the short line segments on the other side of the axis.
However, as a catalog of data on Serena's career, this graphic is fascinating. Mousing on the vertical lines changes the information on the top right corner, including the tournament being played and the media event she participated in, as well as photos and her rankings. Similarly, the left and right arrows on the top left allow readers to browse through the list of events chronologically. (You need to click on the link to use the interactive features.) Without this chart, it would have been very difficult to learn about Serena's record at a particular tournament or point in time. It acts like a data table but presents the information in a much more accessible way.
Thus, relying on interactivity, this compact graphic enables any of us to browse to a user-defined depth a reservoir of data. Bravo!
Reference: "Serena William's Professional Career", New York Times, June 2008.
We recently showed an example of when data tables worked well to clarify the data. Last week, there was an example from the Times which did the opposite.
The accompanying article boldly claimed that
the 40-yard dash stands above them all as having the strongest correlation to success in the NFL. The three-cone drill, the shuttle run, the bench press -- none correlate to NFL success. The 40 is king.
Further, it cited Bill Barnwell from FootballOutsiders.com who created an "index" using both 40 time and body weight that is "an even better predictor than 40 time alone". In other words, this formula
does the trick.
The data table, shown above, presumably clinched the case.
We were mystified when we put the data to the test, however. Among the set of 15 running backs, the Index did not predict the Yards Per Carry at all! The Index explained only 8% of the variation in Yards Per Carry between the backs.
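"Explained 8% of the variation" is the language of R-squared. For a straight-line fit with one predictor, R-squared is simply the square of the correlation coefficient, so it can be checked by hand. A minimal sketch follows; `r_squared` is a helper name made up for this illustration, and the toy lists are placeholders, not the running-back data.

```python
def r_squared(xs, ys):
    """Share of the variance in ys explained by a straight-line fit on xs.

    With a single predictor this equals the squared Pearson correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


# Toy placeholder values, NOT the running-back data from the table.
perfect = r_squared([1, 2, 3, 4], [2, 4, 6, 8])      # points on a line
unrelated = r_squared([1, 2, 3, 4], [1, -1, -1, 1])  # no linear trend
print(perfect, unrelated)
```

A value near 1 means the predictor tracks the outcome closely; a value near 0, like the 8% found here, means knowing the predictor tells us very little about the outcome.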
The data table obscures this bivariate relationship. As it was sorted by the Index, we would look for the column showing Yards Per Carry to be naturally sorted in the same order. But it is hard to tell the trend from the noise in a table.
What went wrong? It turned out neither 40 Time nor Body Weight had any relationship with Yards Per Carry.
These variables did not explain the range of Yards Per Carry attained by this set of running backs.
Finally, we found a strong correlation between 40 Time and Body Weight. (The heavier you are, the slower you run!) This meant that both variables contained similar information, so any formula involving the two would be unlikely to perform significantly better than either variable alone.
So we are left to turn the table on the table. More pertinent evidence is needed to prove the case.
The entire analysis suffers from survivorship bias as only the top running backs are examined, and no adjustment is made to deal with wide-ranging tenures. Apparently, there is more data available in a book. There is no indication of how the model shown above was validated.
Reference: "The Race of Truth: 40-Yard Times Can Tell the Future", New York Times, April 27, 2008.
Reader Eduardo is unhappy about the embellishments in this Nikeplus chart of miles run by day; "pretty but misleading" he wrote us to say. This is a clear case of more is less.
As a data graphic, it doesn't work. The reflections don't work. Perhaps Nike wants to remind all you super-dedicated Nano-wearing runners what it's like to run in mist or rain! To quote Eduardo: "The bars start at -1! I guess it is motivation." An extra mile for everyone. The rounded corners make it harder to read the level.
Speaking of bar charts, I want to follow up on an exchange from March. In that example, we claimed that not starting bars at zero misrepresented the relative lengths of those bars. The chart showed counts of baseball players implicated in the Mitchell Report by position.
This distortion arises from taking the same length off each bar regardless of the data. As a result, the ratios of the lengths between the bars have been changed drastically.
For example, the ratio of P/3B in the top chart is 31/9 = 3.4 but in the bottom chart, it is 23/1 = 23!
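The arithmetic is easy to verify. The sketch below uses the two counts quoted above (31 pitchers, 9 third basemen) and a cut of 8, which is what the 23 and 1 imply; `bar_length_ratio` is a name made up for this illustration.

```python
def bar_length_ratio(a, b, baseline=0):
    """Ratio of two drawn bar lengths when the axis starts at `baseline`."""
    if b == baseline:
        raise ValueError("second bar has zero drawn length")
    return (a - baseline) / (b - baseline)


pitchers, third_basemen = 31, 9

# Bars starting at zero: drawn lengths match the data.
print(round(bar_length_ratio(pitchers, third_basemen), 1))    # 3.4

# Same data with 8 units cut off every bar.
print(bar_length_ratio(pitchers, third_basemen, baseline=8))  # 23.0
```

Subtracting a constant preserves differences between bars but not ratios, and the damage is worst for the shortest bars.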
Via Social Science Statistics blog, I found this article in the Times about baseball's longest hitting streaks. The authors ran 10,000 simulations of baseball seasons using historical data to come up with a probability distribution of the longest hitting streak in each season. They showed the following chart.
The record was 56 consecutive games with hits in a season, which in some circles is seen as unbeatable. These authors -- "in a fit of scientific skepticism" -- found that in any season, the simulated longest streak ranged from 39 to 109, with the median at 53 games. They concluded that "the unlikely becomes likely".
That is sure to turn some heads. I have a question for them as I can't make sense of these numbers. A median of 53 meant that 50% (or 5000 out of 10,000) simulated seasons ended up with a hitting streak exceeding 53 games. Empirically, according to here, DiMaggio's was the only one to go over 53. Using the authors' time line of 1871 to 2005, that would be 134 seasons. One out of 134 is 0.75% probability. 0.75 versus 50... sounds like something has gone wrong.
The article doesn't give enough details on the simulation so it is hard to understand what is going on. I hope I am not misinterpreting their analysis.
Source: "A Journey to Baseball's Alternate Universe", Samuel Arbesman and Steven Strogatz, Mar 30 2008.
PS. As readers pointed out, each simulation is of all the seasons. So the histogram is saying that the particular sequence of 134 seasons that we lived to see is not a rarity considering all the possibilities. I'm not sure this is telling us much. It doesn't address the question of how likely the 56-game record would be beaten in the future. It can't address this question because the particular sequence is now already set; the alternative universes are irrelevant because we can't jump from one universe to another mid-stream.
Also, readers want to have each hitter's probability be modeled rather than using the historical average; in other words, factor in opposing pitcher, home/away, etc.
I'll throw in another... there must have been an assumption of independence from one game to the next. One would think the pressure would be so much higher on the hitter once he gets to 45, 50, 53 etc. games and it would be inappropriate to assume the hitting probability would remain the same.
Along those lines, why should the hitting probability be treated as fixed, rather than modeled as a probability distribution, which would account for variance as one of the readers suggested?
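To see how much the independence assumption could matter, here is a toy sketch. It is not the authors' simulation (the article does not give enough detail to reproduce that); it follows a single hypothetical batter through one season at a time. Every number in it is an assumption chosen for illustration: the 162-game season, the 0.75 chance of a hit in any game, and a "pressure" rule that lowers the chance to 0.65 once the streak reaches 45.

```python
import random
import statistics


def simulate_season(games, p_hit_given_streak, rng):
    """Longest hitting streak in one simulated season.

    `p_hit_given_streak(k)` returns the chance of a hit in the next game
    when the current streak stands at k games.
    """
    best = current = 0
    for _ in range(games):
        if rng.random() < p_hit_given_streak(current):
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best


def independent(_streak):
    # Assumed: the same chance every game, whatever came before.
    return 0.75


def under_pressure(streak):
    # Assumed: the chance drops once the streak gets long.
    return 0.65 if streak >= 45 else 0.75


rng = random.Random(0)  # fixed seed so reruns give the same draws
for label, rule in [("independent", independent), ("pressure", under_pressure)]:
    longest = [simulate_season(162, rule, rng) for _ in range(1000)]
    print(label, "median:", statistics.median(longest), "max:", max(longest))
```

Passing the hit probability in as a function of the current streak makes it easy to swap in other assumptions. Note that redrawing the probability independently for every game would change nothing on average; to model variance of the kind readers suggested, the draw would need to happen per batter or per season.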
Long-time reader Jon sent in a different view of the QB data. He uses a nifty tool in Excel to generate a parallel coordinates plot (also called profile plot) on which pairs of QBs can be highlighted and compared.
This chart exploits the foreground/background concept very nicely. One way to deal with abundant data is to highlight only those bits that matter to the question at hand, and relegate the rest to the background.
The gray lines in the background provide context without grabbing undue attention. He also converted every metric to a scale between 0 and 1, similar to what we did with our version.
The Eli Manning / Philip Rivers comparison shows that both QBs were below average on most of these metrics, with Manning near the bottom of each.
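Putting every metric on a 0-to-1 scale is what lets very different statistics share one axis in a profile plot. We don't know exactly how Jon rescaled his; min-max scaling is one common way to do it, sketched below under that assumption. The function name and the sample values are made up for illustration.

```python
def min_max_scale(values):
    """Rescale values so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: there is no spread to show.
        # Returning 0.5 for each is an arbitrary choice.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Placeholder values for one metric across three players (not real data).
print(min_max_scale([10, 20, 30]))  # [0.0, 0.5, 1.0]
```

Each metric is scaled separately, so a position on the axis says where a player sits within the league's range for that metric, not how large the raw number is. For metrics where lower is better, one would flip the result (1 minus the scaled value) so that "up" always means "good".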