I looked at some of the underlying data: the fumbling records of NE's players and the number of fumbles by other players at the running back position. I didn't check receivers.

What I saw is that NE appears to choose lower-quality but more ball-secure running backs. All the better running backs in the league fumbled much more often. Making the comparison just meant getting a list and typing each name into the career stats page. Their last true top-quality rusher, Corey Dillon, fumbled at the typical rate for top running backs. Their lead back for the last few seasons, Stevan Ridley, who was injured for much of this season, actually fumbled at the same rate as other top running backs, but he has played fewer years, so his totals are lower.

I also checked the small number of cases where an NE player played elsewhere. I mention this to show the poor quality of some of the additional analysis I've read about this: more than once I've seen the claim that Danny Woodhead's plays per fumble have dropped since leaving NE. The problem is that he fumbled one time in NE and one time in San Diego as a running back (and one time in each place as a receiver) but has played fewer years in SD. He's played fewer years because he's been there less time and because he's been hurt (and now he'll play less because he's also getting older).

In fact, if you go through the numbers, what jumps out is that the analysis basically comes down to BenJarvus Green-Ellis not fumbling as a Patriot. Replace this one guy with a better running back who fumbles more often and you get no meaningful difference in plays per fumble. Interestingly, Green-Ellis was signed as a free agent by Cincinnati, was asked to take the top running back role, and started to fumble. Not a lot, but some, and this turned out to be the end of his career: he didn't play in 2014, which suggests that a guy without top-back quality who loses his special ball security ability is toast. (There may be a difference in roles that's hard to quantify: in NE, he mostly ran the ball on safe plays when the team was ahead or when ball security was a major emphasis, but checking that would require figuring out how many plays of each type he ran, etc.) If the entire argument is about this one guy, that's not an argument at all.
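The "replace this one guy" check above can be sketched as a leave-one-out calculation. This is only an illustration: the player labels and the (plays, fumbles) totals below are invented, not real NFL statistics, and the function names are hypothetical.

```python
# Hypothetical (plays, fumbles) totals, invented purely for illustration.
# These are NOT real statistics for any team or player.
backs = {
    "Back A": (500, 0),  # the one back who never fumbles
    "Back B": (300, 3),
    "Back C": (200, 2),
}

def plays_per_fumble(stats):
    """Pooled plays per fumble across every back in `stats`."""
    plays = sum(p for p, _ in stats.values())
    fumbles = sum(f for _, f in stats.values())
    return float("inf") if fumbles == 0 else plays / fumbles

def leave_one_out(stats):
    """Recompute the pooled ratio with each back removed in turn."""
    return {
        name: plays_per_fumble({k: v for k, v in stats.items() if k != name})
        for name in stats
    }

overall = plays_per_fumble(backs)
without = leave_one_out(backs)

print("all backs:", overall)
for name, value in without.items():
    print("without", name + ":", round(value, 1))
```

If dropping a single back moves the pooled figure far more than dropping any other, the team-level number is mostly a statement about that one player.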


jonathan: That's an interesting angle for investigating causes other than cheating. It is often more helpful to ask the more general question: is there a continuum of risk taking among running backs? (Or are there types of running backs that result in more or less risk as measured by total fumbles?)

I also wonder about types of play calls. Another possibility is that NE calls less risky running plays; this is analogous to QBs who are not asked to take any risk (e.g. Alex Smith) and so rack up plays with few turnovers.

Finally, it's always about the one guy. The question is whether that one guy is special, or that one guy could be any guy.

That's why this is a great example of the challenges of doing reverse causation analysis.


Can we revisit one point? "...the extreme value of New England's plays per fumble performance is not random fluctuation."

It needs to be emphasized that there is no such thing as "random fluctuation". We have done a great disservice in our mathematics education by telling people that there is some magical "chance" that makes things happen. Random events are simply events whose causes are unknown (as you describe when discussing what *caused* this measure).

Briggs says it very well: "It is to assume murky, occult causes are at work, pushing variables this way and that so that they behave properly. To say about a proposition X that “X is normal” is to ascribe to X a hidden power to be “normal” (or “uniform” or whatever). It is to say that dark forces exist which cause X to be normal, that X somehow knows the values it can take and with what frequency ... This is all incoherent. Each and every grade Sally received was caused, almost surely by a myriad of things, probably too many for us to track. But suppose each grade was caused by one thing and the same thing. If we knew this cause, we would know the value of x; x would be deduced from our knowledge of the cause. "


Nate: I didn't get into that issue because in this case, the plays per fumble statistic is an average and thus, by the Central Limit Theorem, it will be approximately normally distributed. Briggs is talking about assuming normality on the population. That said, I'm ok with Briggs's statements: (a) residual error is often called random error, but could be better called unexplained variance; and (b) nothing is distributed exactly as modeled. But I consider (a) not a common fallacy among people who practice statistics; and (b) true but inconsequential when the probability model is appropriate.
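A quick simulation can illustrate the point about averages. This is a rough sketch under stated assumptions, not the analysis from the post: it simulates fumbles per play (the reciprocal of plays per fumble) because that is a plain average, and the 2% fumble probability, the 1,000 plays per season, and the function names are all made up for illustration.

```python
import random
import statistics

def simulate_fumble_rates(n_seasons, n_plays, p_fumble, seed=0):
    """Simulate fumbles per play for many hypothetical team-seasons.

    Each play is an independent yes/no fumble with probability p_fumble,
    so the underlying data are nowhere near normally distributed.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_seasons):
        fumbles = sum(1 for _ in range(n_plays) if rng.random() < p_fumble)
        rates.append(fumbles / n_plays)
    return rates

def share_within_one_sd(values):
    """Fraction of values lying within one standard deviation of their mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    inside = sum(1 for v in values if abs(v - mean) <= sd)
    return inside / len(values)

# Assumed, illustrative inputs: 2,000 seasons of 1,000 plays at a 2% fumble rate.
rates = simulate_fumble_rates(n_seasons=2000, n_plays=1000, p_fumble=0.02)
print("mean rate:", round(statistics.fmean(rates), 4))
print("share within 1 SD:", round(share_within_one_sd(rates), 3))
```

A normal distribution puts roughly 68% of values within one standard deviation of the mean, so a simulated share in that neighborhood is consistent with the averages being approximately normal even though each individual play is binary.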

Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.