My review of the Moderna trial protocol (parts 1, 2) shows that by and large, the design follows established standards, with only a few places where the researchers have selected less than the gold standard, which is understandable given the urgency of finding a vaccine.

That said, a rushed effort in which everyone is hoping for a working vaccine will raise the otherwise small odds of a false-positive result. This may come about in unconventional ways.

***

Within each trial, as I pointed out in part 2 of the Moderna protocol review, the threshold for statistical significance has been made much stricter to account for the interim analysis, which is based on a much smaller sample of data. That's the proper thing to do. The design is set up so that the overall probability of a false positive across all attempted analyses and endpoints is kept under control.
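
To see why the threshold has to be tightened, here is a minimal simulation sketch. It is not the Moderna design: it assumes a single interim look at exactly half the data, a two-sided z-test, and a vaccine with no effect at all. The function name `false_positive_rate`, the number of simulations, and the seed are my own illustrative choices. The value 2.178 is the standard Pocock boundary for two equally spaced looks, used here only as one example of a stricter threshold.

```python
import math
import random


def false_positive_rate(threshold, n_sims=20000, seed=0):
    """Share of simulated null trials that cross `threshold` at either look.

    Assumptions (illustrative only): one interim look at half the data,
    one final look at all the data, and no true vaccine effect.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Standardized evidence from the first and second halves of the data.
        z_first_half = rng.gauss(0.0, 1.0)
        z_second_half = rng.gauss(0.0, 1.0)
        z_interim = z_first_half
        # The final statistic pools both halves with equal weight.
        z_final = (z_first_half + z_second_half) / math.sqrt(2.0)
        if abs(z_interim) > threshold or abs(z_final) > threshold:
            hits += 1
    return hits / n_sims


naive = false_positive_rate(1.96)      # usual 5% threshold applied at both looks
stricter = false_positive_rate(2.178)  # Pocock boundary for two looks
print(f"unadjusted threshold: {naive:.3f}")
print(f"stricter threshold:   {stricter:.3f}")
```

Under these assumptions, reusing the usual 1.96 cutoff at both looks pushes the overall false-positive rate above the nominal 5 percent, while the stricter cutoff brings it back to about 5 percent.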

Of note, the world is running a dozen or more trials of competing vaccines at the same time. Even when each trial controls its own chance of a false positive, the probability that at least one trial in the slate produces a false positive is not under control.

Think about elevator maintenance. For any single elevator, the manufacturer might promise a breakdown rate of below one percent. Now, across all elevators of all brands in a single town, the chance that at least one elevator breaks down in a given period of time is much higher than one percent. That's because each elevator carries its own chance of breaking down, up to one percent, and any of them may break down at any moment. The more elevators there are in the town, the higher the chance of at least one breakdown.

The more vaccine trials are started, the higher the chance that at least one of them reports a positive result that is a false alarm. No individual clinical trial is controlling for this kind of risk; only the FDA can address this.
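
The elevator arithmetic can be sketched in a few lines. This assumes the units (elevators or trials) are independent and share the same per-unit probability, which is a simplification; the helper name `prob_at_least_one` and the example numbers are hypothetical. For the trials, the calculation describes the worst case in which none of the vaccines actually works.

```python
def prob_at_least_one(p_single, n_units):
    """Chance that at least one of `n_units` independent units has the event.

    `p_single` is the per-unit probability (a breakdown, or a false positive
    for an ineffective vaccine). Independence is an assumption.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability between 0 and 1")
    if n_units < 0:
        raise ValueError("n_units must be non-negative")
    # Complement rule: 1 minus the chance that every unit avoids the event.
    return 1.0 - (1.0 - p_single) ** n_units


# Elevators at a 1% breakdown rate each.
for n in (1, 10, 100):
    print(f"{n:>3} elevators: {prob_at_least_one(0.01, n):.3f}")

# A dozen trials, each with a hypothetical 2.5% false-positive rate.
print(f" 12 trials:    {prob_at_least_one(0.025, 12):.3f}")
```

The per-unit rate never changes; only the number of chances does, and that alone is enough to drive up the overall probability.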

***

Another rarely addressed issue is **statistical validity** (textbooks often call this external validity), that is to say, how representative the trial participants are of the people who are likely to receive the vaccine once approved.

The vaccine manufacturers are engaged in a complex competition. The first vaccine to get approved has a special status because it becomes the standard of care. Other trials may then have to use the first vaccine, rather than a placebo, as the comparator. Because of the provision for interim analysis, the sooner a trial completes enrollment of participants, the sooner it can begin analysis and file the approval application. Unlike in most clinical trials, interim analyses here are triggered not by total enrollment but by the total number of infections.

Thus, the incentive is to get as many people enrolled as quickly as possible, and to dip into higher-risk areas as much as possible. In the extreme situation, if all enrollment is focused on only a few of the highest-risk areas, the trial may become the first to reach its case targets. (A possible dent in this strategy is that people's willingness to participate may be inversely proportional to how much risk they face.)
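
A back-of-envelope sketch shows how strongly the infection trigger rewards enrolling in high-risk areas. Every number below is hypothetical, and the model is deliberately crude: everyone is enrolled on day one, the two arms are equal in size, the monthly attack rate is constant, and I ignore the shrinking pool of uninfected participants. The function name `months_to_case_target` is my own.

```python
def months_to_case_target(n_enrolled, monthly_attack_rate, case_target,
                          efficacy=0.0):
    """Expected months until `case_target` infections accrue across both arms.

    Crude assumptions: 1:1 randomization, instant enrollment, constant
    attack rate, and no depletion of the at-risk pool.
    """
    if n_enrolled <= 0 or monthly_attack_rate <= 0:
        raise ValueError("enrollment and attack rate must be positive")
    per_arm = n_enrolled / 2
    placebo_cases_per_month = per_arm * monthly_attack_rate
    # The vaccine arm accrues fewer cases if the vaccine works.
    vaccine_cases_per_month = per_arm * monthly_attack_rate * (1.0 - efficacy)
    return case_target / (placebo_cases_per_month + vaccine_cases_per_month)


# Same trial size and case target, different (hypothetical) regions.
low_risk = months_to_case_target(30000, 0.002, 50, efficacy=0.6)
high_risk = months_to_case_target(30000, 0.010, 50, efficacy=0.6)
print(f"low-risk regions:  {low_risk:.2f} months")
print(f"high-risk regions: {high_risk:.2f} months")
```

In this simple model the waiting time scales inversely with the attack rate, so a fivefold riskier population reaches the trigger five times sooner.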

While this competition plays out, the trials that have the most biased samples, concentrating on the regions with the highest "yields", will likely win the race against time. This creates a problem for the FDA because the results would score lower on statistical validity, and would have to be de-biased. (Such a sample would, however, provide better data on higher-risk subgroups. The trade-off is that the vaccine is supposed to be for everyone.)
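
One common way to de-bias such a sample is post-stratification: estimate the quantity of interest within each subgroup, then reweight by each subgroup's share of the target population rather than its share of the trial. I am not suggesting this is what the FDA does; it is a minimal sketch of the idea, with made-up strata, estimates, and shares. Averaging an effect estimate across strata this way is itself a simplification.

```python
def sample_weighted_estimate(stratum_estimates, sample_counts):
    """Overall estimate as the trial sees it: weighted by who enrolled."""
    total = sum(sample_counts.values())
    return sum(stratum_estimates[s] * sample_counts[s] / total
               for s in stratum_estimates)


def poststratified_estimate(stratum_estimates, population_shares):
    """Overall estimate reweighted to the target population's composition."""
    if set(stratum_estimates) != set(population_shares):
        raise ValueError("strata must match between estimates and shares")
    if abs(sum(population_shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(stratum_estimates[s] * population_shares[s]
               for s in stratum_estimates)


# Hypothetical effect estimates by risk group.
estimates = {"high_risk": 0.70, "low_risk": 0.50}
# The trial over-enrolled high-risk participants...
trial_counts = {"high_risk": 8000, "low_risk": 2000}
# ...but the eventual recipients are mostly low-risk.
population = {"high_risk": 0.2, "low_risk": 0.8}

raw = sample_weighted_estimate(estimates, trial_counts)
adjusted = poststratified_estimate(estimates, population)
print(f"as enrolled: {raw:.2f}")
print(f"reweighted:  {adjusted:.2f}")
```

The catch is that reweighting leans heavily on the thinly sampled strata, so the adjusted estimate is noisier than the raw one.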

Perhaps the enrollment can be skewed towards higher-risk groups at the start, and then broadened over time to encompass the general population. This would mimic the likely plan for rolling out the vaccine.

***

I should underline that these are not show-stopping issues. They are interesting statistical side-notes that may seem theoretical when you read about them in a textbook -- until the pandemic happens.
