A reader of my blog, Joran E., pointed me to this great article (by Ross Tucker) that covers one of the newer anti-doping measures (the biological passport), which links to this recent NYT article on two Italian cyclists found guilty of doping. While researching this latest development, I came across the latest legal maneuvers in the case of Alberto Contador, the Spanish cyclist and multiple winner of the Tour de France who tested positive after last year's victory, and subsequently blamed a contaminated steak (I mentioned his case here last year).
Anti-doping provides a perfect backdrop to revisit all five statistical concepts that form the spine of my book, Numbers Rule Your World.
The most potent forms of doping these days involve human growth hormone (HGH), EPO and similar compounds that occur naturally in the body, so labs must separate dopers from people who have "natural highs". By contrast, for compounds that don't occur naturally, such as clenbuterol, which ensnared Contador, even minute amounts can be proof of wrongdoing.
In order to know what level of a compound is "unnatural", statisticians need to establish what is natural. This is the concept behind Chapter 1: we calculate the "average" (natural) value, but focus on examining variations around the average.
Admitting that the natural value is not uniform across all people, statisticians determine different averages for different "types" of people; the simplest such subgroups would be male/female, and age groups. The biological passport takes this idea to the extreme: each individual athlete is tracked over time to establish his or her own average. We just put into practice the concept behind Chapter 3, which is to avoid lumping together things that are different.
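To make the contrast between a group average and a personal average concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the marker readings, the population limit, the z-score rule and the function name `flag_against_baseline` are all made up, and this is not the model the biological passport actually uses.

```python
from statistics import mean, stdev

def flag_against_baseline(history, new_value, z_cutoff=3.0):
    """Compare a new reading with the athlete's own past readings.

    Returns (z, flagged), where z is the distance from the athlete's own
    average measured in units of his or her own standard deviation.
    The z-score rule and the cutoff of 3 are illustrative assumptions.
    """
    baseline = mean(history)
    spread = stdev(history)
    z = (new_value - baseline) / spread
    return z, abs(z) > z_cutoff

# Hypothetical past readings of some blood marker (invented numbers).
athlete_a = [42.0, 43.0, 41.0, 42.5, 41.5]  # naturally low
athlete_b = [48.0, 49.0, 47.5, 48.5, 47.0]  # naturally high

population_limit = 50.0  # hypothetical one-size-fits-all limit
new_reading = 48.0       # the same new reading for both athletes

z_a, flagged_a = flag_against_baseline(athlete_a, new_reading)
z_b, flagged_b = flag_against_baseline(athlete_b, new_reading)

print("Over the population limit?", new_reading > population_limit)
print("Athlete A: z = %.1f, flagged = %s" % (z_a, flagged_a))
print("Athlete B: z = %.1f, flagged = %s" % (z_b, flagged_b))
```

With these invented numbers, the reading stays under the single population limit for both athletes, yet it is far outside Athlete A's own history and entirely ordinary for Athlete B. That is the sense in which lumping the two together hides information.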
The cases of the two Italian cyclists are the first two in which athletes have been punished based on evidence from the biological passport. Previously, the enforcers needed a failed drug test or a police bust to convict dopers.
The developments in the Contador case are very discouraging: the Spanish cycling federation showed an unwillingness to expose the biggest star of the sport, first by assessing a one-year ban (when the norm is two years), and most recently by overturning that shortened punishment. USADA, the US anti-doping body, expressed its concern here, and the reversal of the ban is under appeal.
The Spanish authority accepted the Contador camp's explanation of unintentional consumption of tainted beef as the reason for testing positive. Statisticians who believe in the logic of hypothesis testing will find such a conclusion absurd.
Let's walk through how we apply the logic as described in Chapter 5 of Numbers Rule Your World to this situation. Assuming that Contador did not dope, what is the chance that minute amounts of clenbuterol would be found in his body? Unfortunately for Contador and other athletes failing this drug test, the chance is vanishingly small.
Like most accused dopers, his camp did not challenge the presence of clenbuterol; they merely offered an alternative theory for why it was there. A large number of coincidences had to occur in order for their theory to be believed: beef had to be taken from Spain into France to serve Contador, and only Contador (not any of his teammates); a different source of beef must have been used on other days during the Tour on which Contador ate beef (since he tested negative on most other days of the tour); he was one unlucky fellow since anti-doping tests have high false negative rates in general, and he managed to test positive on that one time he ate the contaminated beef; he was also extremely unlucky since Europe banned the use of clenbuterol to raise cows in the 1990s, and the beef he ate on that one occasion had to have come from an unscrupulous farmer violating the ban.
Statisticians would politely listen to all that, and declare "rare is impossible". It's much easier to believe that he was doping. (We would admit that there is a minuscule chance that the conclusion is incorrect -- the chance is precisely that of those coincidences occurring.)
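The arithmetic behind "rare is impossible" can be sketched in a few lines. The probabilities below are pure inventions chosen for illustration, not estimates from any data, and the sketch assumes the coincidences are independent so that their chances simply multiply.

```python
from math import prod

# Hypothetical probabilities, invented for illustration only.
# Assumption: the events are independent, so their chances multiply.
coincidences = {
    "beef brought from Spain and served only to Contador": 0.1,
    "that beef came from a farmer violating the clenbuterol ban": 0.01,
    "contaminated beef eaten on one tested day and no other": 0.1,
    "the test caught the trace despite high false negative rates": 0.5,
}

# Chance that every coincidence in the innocent story happens together.
p_innocent_story = prod(coincidences.values())

for event, p in coincidences.items():
    print("%-62s %.2f" % (event, p))
print("Chance of the whole story: %.6f" % p_innocent_story)
```

Even when each step is given a generous made-up chance, the product shrinks quickly. The exact figure means nothing here; the point is only that a story requiring many coincidences at once is far less likely than any single one of them.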
Why does the scientific process disintegrate into this sort of he-said-she-said argument?
The concept behind Chapter 2 proves useful here. The statistical model that links the biological passport and/or the drug test to doping is one based on correlation, not causation. The passport or drug test does not provide direct evidence of doping (unlike a police bust). But as I point out in the book, correlational evidence can be powerful, and has been profitably used in all kinds of decisions. Because clenbuterol is not produced naturally in the human body, this test result is very close to causal evidence; it's less secure for things like EPO and HGH.
It's just more complicated when causal evidence is unavailable because people can now advance all sorts of hypotheses to explain the correlation. We then get story time, a phenomenon I frequently discuss on my blog. I'm happy to hear the stories but one must seek evidence to support these stories.
In the Contador case, for example, I'd like to see evidence that the steak was eaten, receipts from the vendor who imported the beef, documentation of which farm raised the cow, inspection of the farm to confirm that it used clenbuterol, traceback of beef from that farm to find the presence of clenbuterol, etc. In none of the reports on this case have I seen any of this evidence, and more disturbingly, the supporters of Contador don't appear to be asking any such questions. (See, for instance, Christian Josi on Huffington Post.)
Why would statisticians accept the chance of falsely accusing a clean athlete, however small that chance is? This is because we know that there is no such thing as a perfect test. The only test that will never yield a false positive is the test that never issues any positive results!
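Here is a small sketch of that trade-off, assuming a test that calls a sample positive when a marker exceeds a threshold. The marker values for clean and doped athletes are invented, and `error_rates` is a hypothetical helper, not part of any real testing protocol.

```python
def error_rates(clean, doped, threshold):
    """Return (false positive rate, false negative rate) for a test that
    calls a sample positive when its value is above the threshold."""
    false_pos = sum(v > threshold for v in clean) / len(clean)
    false_neg = sum(v <= threshold for v in doped) / len(doped)
    return false_pos, false_neg

# Invented marker values. One clean athlete has a "natural high" of 61.
clean = [40, 42, 43, 44, 45, 46, 47, 48, 50, 61]
doped = [46, 49, 51, 53, 54, 55, 56, 57, 58, 60]

for threshold in (45, 50, 55, 61):
    fp, fn = error_rates(clean, doped, threshold)
    print("threshold %3d: false positives %.0f%%, false negatives %.0f%%"
          % (threshold, 100 * fp, 100 * fn))
```

In this made-up data, raising the threshold high enough to clear the clean athlete with the natural high also clears every doper: the false positive rate reaches zero only when the test stops issuing positives at all.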
We already accept this type of situation in the Western legal system. A criterion of "beyond reasonable doubt" in the courts does not guarantee that there will be no wrongful convictions. In fact, thanks to the work of groups such as the Innocence Project, we know that some unfortunate people are wrongfully convicted, sometimes receiving long sentences for grievous crimes they did not commit.
As explained in my book (which I won't repeat here), the real issue in anti-doping is not about false positives but about false negatives. I fear that the entire system is so lenient toward dopers that they would take the (small) risk of detection. I'll make a case for this in a future post.