Beautiful spider loses its way

On Twitter, Andy C. (@AnkoNako) asked me to look at this pretty creation at NFL.com (link).

[Image: Nfl_spiderweb]

There is a reason why you don't read much about spider charts (web charts, radar charts, etc.) here. While this chart is beautifully constructed, and fun to play with, it just doesn't work as a vehicle for communication.

The example above lets us compare four players (here, quarterbacks) on eight metrics. Each white polygon represents one player, and the orange outline represents the league-average quarterback.

What are some of the questions one might have about comparing quarterbacks?

  • Who is the best quarterback, and who is the worst?
  • Who is the better passer? (ignoring other skills, like rushing ability)
  • Is each quarterback better or worse than the average quarterback?

How will you figure these out from the spider chart?

  • Not sure. The relative value of the quarterbacks is definitely not encoded in the shape of the polygon, nor in its area. To really figure this out, you'd need to look at each of the eight spokes independently, and then aggregate the comparisons in your head. Unless... you are willing to ignore seven of the eight metrics, and just look at passer rating (below right).
  • Focusing on passing only means focusing on five of the eight metrics, from pass attempts to interceptions. How you combine five metrics into one evaluation is anyone's guess.
  • One can tell that Joe Flacco is basically the average quarterback as his contour is almost exactly that of the average (orange outline). Are the others better or worse than average? Hard to tell at first glance.
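
To make the point about area concrete, here is a minimal sketch. It assumes a radar chart with equally spaced spokes and scores already scaled to the range 0 to 1; the function name `radar_area` and both score lists are made up for illustration and have nothing to do with the NFL.com data. The two hypothetical players have exactly the same eight scores, just assigned to the spokes in a different order, yet their polygons enclose very different areas.

```python
import math

def radar_area(radii):
    """Area of a radar-chart polygon whose spokes are equally spaced.

    The polygon is a fan of triangles between adjacent spokes, each with
    area 0.5 * r_i * r_next * sin(angle between spokes).
    """
    n = len(radii)
    if n < 3:
        raise ValueError("a polygon needs at least three spokes")
    wedge = 0.5 * math.sin(2 * math.pi / n)
    return wedge * sum(radii[i] * radii[(i + 1) % n] for i in range(n))

# Hypothetical scores: the same eight values, in two different spoke orders.
alternating = [1.0, 0.2] * 4            # strong and weak metrics interleaved
grouped = [1.0] * 4 + [0.2] * 4         # strong metrics side by side

print(round(radar_area(alternating), 3))
print(round(radar_area(grouped), 3))
```

Since the order of the spokes is an arbitrary design choice, a polygon that looks bigger does not by itself mean a better player.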

***

There are a number of statistical points worth noting.

First, the chart invites users to place equal emphasis on each of the eight dimensions. (There is a control to remove dimensions.) But the metrics are clearly not equally important. You certainly should value passing yards more than rushing yards, for example.

Second, the chart ignores the correlation between these eight metrics. The easiest way to see this is "Passer Rating", which is computed by a formula from Passing Attempts, Passing Completions, Interceptions, Touchdown Passes, and Passing Yards. Yes, all five of those components have also been plotted separately. Another easy way to see the problem is that Passing Yards are highly correlated with Passing Attempts and Passing Completions.
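
For readers who have not seen it, here is a sketch of the standard NFL passer rating formula. The post itself does not spell the formula out, so treat the constants below as the commonly published version rather than anything taken from the NFL.com chart; the function name `passer_rating` and its argument names are my own. The point is simply that the rating is a deterministic function of the other five plotted metrics, so it adds no independent information.

```python
def passer_rating(attempts, completions, yards, touchdowns, interceptions):
    """Standard NFL passer rating, built from four capped components."""
    if attempts <= 0:
        raise ValueError("passer rating is undefined without pass attempts")

    def clamp(x):
        # Each component is limited to the range 0 to 2.375.
        return max(0.0, min(x, 2.375))

    completion_part = clamp((completions / attempts - 0.3) * 5)
    yardage_part = clamp((yards / attempts - 3) * 0.25)
    touchdown_part = clamp((touchdowns / attempts) * 20)
    interception_part = clamp(2.375 - (interceptions / attempts) * 25)

    total = completion_part + yardage_part + touchdown_part + interception_part
    return total / 6 * 100

# A hypothetical stat line that maxes out every component.
print(round(passer_rating(10, 10, 200, 2, 0), 1))
```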

Third, the chart fails to account for different types of quarterbacks. I deliberately chose these four because Joe Flacco was a starter, Tyrod Taylor was a backup who almost never played, while at San Francisco, Alex Smith and Colin Kaepernick shared the starting duties. So for Passing Yards, the numbers were 3817, 179, 1737 and 1814 respectively. Those numbers should not be directly compared. Better statistics would be something like yards per minute played, yards per offensive series, yards per play executed, etc. The way the data are used here, all the second- and third-string quarterbacks will be below average and most of the starters will be above average.
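
A quick sketch of the arithmetic, using only the four passing-yard totals quoted above. Note the assumptions: the mean computed here is the mean of these four players, used as a stand-in, and is not the league average shown on the chart; and `per_unit` is a hypothetical helper, since the post does not give minutes, series, or plays for any of the players.

```python
passing_yards = {
    "Joe Flacco": 3817,
    "Tyrod Taylor": 179,
    "Alex Smith": 1737,
    "Colin Kaepernick": 1814,
}

# Mean of raw totals across these four players (not the league average).
mean_yards = sum(passing_yards.values()) / len(passing_yards)
above_mean = [name for name, yds in passing_yards.items() if yds > mean_yards]

print(mean_yards)   # 1886.75
print(above_mean)   # ['Joe Flacco']

def per_unit(total, units):
    """Turn a raw total into a rate, e.g. yards per play or per series.

    `units` is whatever measure of playing time is available; it is not
    provided in the post, so no real denominators are used here.
    """
    if units <= 0:
        raise ValueError("need a positive amount of playing time")
    return total / units
```

On raw totals, only the full-time starter clears the mean; the two quarterbacks who split a season both fall below it, regardless of how well they played when they were on the field.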

***

From a design perspective, there are a small number of misses.

Mysteriously, the legend always has only two colors no matter how many players are being compared. The orange is labeled Average while the white is labeled "Leader". I have no idea why any of the players should be considered the "Leader".

The only way to know which white polygon represents which player is to hover on the polygon itself. You'll notice that in my example, several of those polygons overlap substantially so sometimes, hovering is not a task easily accomplished.

The last issue is scale. It turns out that some of the metrics, like interceptions, touchdown passes, rushing yards, etc., can be zero. Take a look at this subset of the chart where I hovered on Tyrod Taylor.

[Image: Nfl_spider_zeroes]

Do you see the problem? The zero point is definitely not the center of the circle. This problem exists for any circular chart, such as a bubble chart.

Now look at Interceptions. Because the scale is reversed (lower is better), the zero point of this metric lies on the outer edge of the circle. This is a vexing issue because the radius is open-ended on the outside but closed-ended on the inside.
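
Here is a rough sketch of how such a scale might behave. I don't know how NFL.com actually maps values to positions; this assumes a simple linear mapping onto a spoke that starts at some nonzero inner radius, and the function name `spoke_radius` and all the numbers are hypothetical.

```python
def spoke_radius(value, vmin, vmax, r_inner, r_outer, reverse=False):
    """Map a metric value linearly onto a spoke running r_inner..r_outer.

    With reverse=True, lower values are drawn farther from the center.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))      # keep the point on the spoke
    if reverse:
        t = 1.0 - t
    return r_inner + t * (r_outer - r_inner)

# Hypothetical axis: values 0..40 drawn between radius 20 and radius 100.
print(spoke_radius(0, 0, 40, 20, 100))                 # 20.0, not the center
# Hypothetical reversed axis (lower is better): zero lands on the rim.
print(spoke_radius(0, 0, 25, 20, 100, reverse=True))   # 100.0
```

Under these assumptions, a zero on one spoke sits at the inner ring while a zero on a reversed spoke sits on the outer edge, so the same value appears at opposite ends of the chart.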

***

In the next post, I will discuss some alternative presentations of this data.

Comments

Jeff

I don't think you understand why people use these charts. This is multivariate data and the metrics don't have the same scale. Radar charts are good for comparing similarities. So the question you're looking for is not who is better. The question this graph answers is which players have similar styles. And if it was constructed with any other intention, whoever built it didn't know what he was doing either. If you have a multi-dimensional problem, there is no answer to a question like who is better. Because who is better is based on what criteria? When you pick a criterion, then you can do a bar chart on that. Or you use a method to take multiple dimensions and convert them to a single metric. In this case, what conversion factor will you use? Maybe skill X is more important to person A than to person B....

junkcharts

Jeff: Rather than talk in the abstract, can you tell us what specific things you learned from this chart that a casual fan of the game won't already know?

lucha

great article! I really love it,
greets!
