Amos Tversky and Daniel Kahneman coined the phrase "the law of small numbers" (1971) to describe the fallacy of "asserting that the law of large numbers applies to small numbers as well." Statistical thinking is based on the belief that a random sample is sufficiently representative of the underlying population, which is justified only when the sample size is large enough.

***

In recent weeks, some researchers and their media allies have touted the ability of vaccines to combat "severe disease". This is based on statistics from the vaccine trials, for which they have "small numbers" indeed: literally, fewer than a handful of severe cases in the vaccinated arm of each trial.

None of those trials were sized for measuring severe disease. To measure rare events requires huge samples, for the simple fact that if the number of participants is not high enough, the most likely scenario is zero severe cases. When you've observed no events, it's very hard to give a precise estimate of the probability of the event occurring.
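To make the point about rare events concrete, here is a minimal sketch. The event rate and arm size below are hypothetical values chosen purely for illustration, not figures from any trial. It computes the chance of seeing zero events in a sample, and the exact one-sided 95% upper bound on the event probability when zero events are observed (often approximated by the "rule of three", 3/n).

```python
def prob_zero_events(n, p):
    """Probability of observing no events among n independent
    participants, each with event probability p."""
    return (1.0 - p) ** n


def upper_bound_zero_events(n, confidence=0.95):
    """Exact one-sided upper confidence bound on the event probability
    when 0 events are seen in n participants: solve (1 - p)^n = 1 - confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    n = 5_000   # hypothetical arm size (assumption, for illustration only)
    p = 1e-4    # hypothetical event rate of 1 in 10,000 (assumption)
    print(f"P(zero events) = {prob_zero_events(n, p):.3f}")
    print(f"95% upper bound after zero events = {upper_bound_zero_events(n):.6f}")
    print(f"rule-of-three approximation       = {3 / n:.6f}")
```

Under these made-up inputs, zero events is the single most likely outcome, and all that observing zero tells you is that the true rate is probably below roughly 3/n.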

We already have evidence that the claim of 100% protection against severe disease is shaky. Of the trial results that I have studied in detail (Pfizer, Moderna and Astrazeneca), Pfizer's boasted the largest sample size of 42,000 participants across placebo and vaccine arms. In the Pfizer trial, the vaccine efficacy when counting only severe cases was 75% (1 case among vaccinated versus 4 in the placebo group). On the surface, this is already worse than what Astrazeneca is touting (100%).

What is the range estimate around that 75% in the Pfizer trial? This is not for the faint of heart, so only keep reading if you feel strong.

With 42,000 participants, we have enough data to arrive at a precision of... -153% to 100%. You read that correctly: the data are consistent with any VE between *negative* 153% and 100%. Simply put, the estimate of 75% is useless, meaningless, junk science.
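For readers who want to see where a range this wide can come from, here is a rough sketch of one common approach. It is an assumption on my part, not necessarily the exact method behind the quoted figure. It treats the 5 severe cases as coin flips (vaccinated arm or placebo arm), builds an exact Clopper-Pearson interval for the share of cases falling in the vaccinated arm, and converts that share to VE under the assumption that both arms have equal follow-up time. All function names are made up for this illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, lo=0.0, hi=1.0, iters=100):
    """Bisection for f(p) = target, where f is decreasing in p on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too high, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided interval for a binomial proportion, x successes in n."""
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


def ve_from_share(share):
    """VE = 1 - rate ratio, assuming equal follow-up in both arms (assumption),
    so that rate ratio = share / (1 - share)."""
    return 1 - share / (1 - share)


if __name__ == "__main__":
    vaccinated_cases, placebo_cases = 1, 4  # severe-case counts quoted in the post
    total = vaccinated_cases + placebo_cases
    share_lo, share_hi = clopper_pearson(vaccinated_cases, total)
    # A larger share of cases among the vaccinated means a lower VE,
    # so the interval endpoints swap when converting.
    print(f"VE point estimate: {ve_from_share(vaccinated_cases / total):.0%}")
    print(f"VE 95% interval:   {ve_from_share(share_hi):.0%} to {ve_from_share(share_lo):.1%}")
```

Under these assumptions the interval comes out at approximately -153% to just under 100%. Notice that in this conditional calculation only the 5 case counts enter; the tens of thousands of participants who never developed severe disease contribute nothing to the width.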

Astrazeneca? Their analysis is based on 11,000 participants, pooled across two separate trials. This is one quarter of the sample size of the Pfizer trial. A back-of-the-envelope calculation shows that if we eventually observed 1 vs 4 cases in the Astrazeneca-Oxford trial (same as for Pfizer), the width of the range estimate would be double Pfizer's because of the smaller sample size. If the Pfizer number is junk science, then what is the right word for the Astrazeneca number?

***

Now that the vaccine journey has moved from clinical trials to roll-out, the nature of the data has shifted from carefully collected experimental data to conveniently scavenged observational data, and as a consequence, the media have been flooded with bad takes. I'll have a few more posts on this problem soon.
