Rush defense is a strong barometer of championship potential. More than three-quarters of Super Bowl teams had a top-10 rush defense, according to the Elias Sports Bureau. Nine of the past 14 participants were in the top five.
A Times reporter thus fell victim to the seductive power of data in analyzing the New York Giants' chance of getting to the Super Bowl of American football. In the past seven years, 64% (9/14) of Super Bowl finalists ranked among the top 5 in run defense; since the Giants currently rank 6th, their chance of getting to the Super Bowl must be about 64% (+/- statistical error). Thus, run defense "is" (not "can be" or "was" or "has been") a strong barometer or predictor.
Sadly, Giants fans, I bear bad news. The reporter has put the cart before the horse. He asked the right question (are the Giants good enough?) but used the wrong data. Consider the following two questions:
- Of those teams that made it to the Super Bowl, how many were ranked top 6 in run defense?
- Of those teams that were ranked top 6 in run defense, how many made it to the Super Bowl?
Since the Giants are currently ranked 6th, I'm using top 6, which is more appropriate than top 5 or top 10. These two questions have different answers, as the following pair of histograms shows.
What is our data and what is our prediction? We want to predict the Giants' chance of becoming a Super Bowl team based on our knowledge of their run defense rank. So, putting the horse before the cart, we should use the second chart rather than the first. In other words, given that their run defense ranks #6, they have about a 24% chance of getting into the Super Bowl (roughly a third of the reported estimate, not a third lower).
To appreciate the difference, one has to realize that in those seven years, 32 teams that were ranked top 6 in run defense did not reach the Super Bowl (as opposed to 10 teams which did).
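The two questions can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in this post (14 Super Bowl teams over seven years, 10 top-6 run defenses that got there, 32 that did not); the variable names are my own, and it assumes those 10 teams are all counted among the 14 participants.

```python
from fractions import Fraction

# Counts taken from the text: seven seasons, two Super Bowl teams per season.
seasons = 7
super_bowl_teams = 2 * seasons  # 14 participants

# Top-6 run defenses over the same seasons, split by outcome (from the text).
top6_reached_sb = 10
top6_missed_sb = 32
top6_teams = top6_reached_sb + top6_missed_sb  # 42 = 6 per season * 7 seasons

# Question 1: of the Super Bowl teams, what share had a top-6 run defense?
p_top6_given_sb = Fraction(top6_reached_sb, super_bowl_teams)

# Question 2: of the top-6 run defenses, what share reached the Super Bowl?
p_sb_given_top6 = Fraction(top6_reached_sb, top6_teams)

print(f"P(top 6 | Super Bowl) = {float(p_top6_given_sb):.0%}")
print(f"P(Super Bowl | top 6) = {float(p_sb_given_top6):.0%}")
```

The same numerator (10) sits over two different denominators (14 versus 42), which is why the first question gives about 71% and the second about 24%.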
The reporter's assertion, however, may still hold. Picked completely at random, a team has a 6% chance (2/32) of getting into the Super Bowl. Even assuming that one team in each division is a bottom-dweller with no chance at all, the remaining teams still only have an 8% chance (2/24) of getting there. Thus, knowing that a team's run defense is in the top 6 roughly triples our estimate of its chance of getting to the Super Bowl (from about 8% to about 24%).
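To see how much the top-6 information is worth, one can divide the conditional chance by each baseline. This is a rough sketch using the figures above; the labels and the `lifts` name are mine, not standard terminology from the article.

```python
# Chance of reaching the Super Bowl given a top-6 run defense (10 of 42 teams).
p_sb_given_top6 = 10 / 42

# Two baselines from the text: all 32 teams, or 24 teams after dropping
# one assumed no-hope team from each of the 8 divisions.
baselines = {
    "any of 32 teams": 2 / 32,
    "24 contenders": 2 / 24,
}

# Ratio of the conditional chance to each baseline chance.
lifts = {label: p_sb_given_top6 / base for label, base in baselines.items()}

for label, value in lifts.items():
    print(f"vs {label}: {value:.1f}x")
```

Against the more generous 8% baseline the ratio is a little under 3; against the 6% baseline it is closer to 4.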
For formula crunchers, the above is Bayes' rule in practice: P(SB | 1<=RD<=6) = P(1<=RD<=6 | SB) * P(SB) / P(1<=RD<=6), where P(SB) = 2/n and P(1<=RD<=6) = 6/n, with n the total number of NFL teams. The n cancels, so the correct estimate is 1/3 of the reported kind of estimate, P(1<=RD<=6 | SB). I think the histograms above demonstrate the intuition a lot better.
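For those who prefer to see the formula run, here is a minimal sketch of the same calculation. The function name `bayes_posterior` is hypothetical, and the likelihood of 10/14 is the top-6 share of Super Bowl teams implied by the counts quoted earlier.

```python
from fractions import Fraction


def bayes_posterior(likelihood, prior, evidence):
    """Bayes' rule: P(A | B) = P(B | A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# P(top 6 | Super Bowl): 10 of the 14 Super Bowl teams had a top-6 run defense.
likelihood = Fraction(10, 14)

# The league size n cancels out, so the answer is the same for any n.
posteriors = {
    n: bayes_posterior(likelihood, Fraction(2, n), Fraction(6, n))
    for n in (28, 30, 32)
}

posterior = posteriors[32]
print(f"P(Super Bowl | top 6) = {posterior} = {float(posterior):.0%}")
```

The result, 5/21, matches the 10-out-of-42 count read directly off the second chart, and it is exactly one third of the 10/14 likelihood.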