To fans of professional sports, a professional foul is an expected part of the game. In football (soccer), a defender is excused for dragging down - even expected to drag down - an opposing striker who's about to score a goal, deliberately committing a yellow- or even red-card foul. Other players don't complain, since they accept that they would have the same duty if placed in the same circumstance. An analogous phenomenon happens at the end of NBA games. The acceptance of professional fouls turns the offense of fouling from an absolute to a conditional.
(Credit: jpellgen, Flickr)
In my years of reading medical studies during the pandemic, I discovered many professional fouls committed by analysts of clinical trial and observational data.
Today, I'm beginning a series of posts on the final analysis of J&J's vaccine trial (link), which recently appeared in NEJM. I'm going straight to an example of a professional foul.
***
As background, let's review how all vaccine trials have been analyzed. Trial participants are randomly assigned to either the vaccine or the placebo arm. Randomization of treatment ensures that the two arms are balanced on average along any variable, which allows simple all-else-being-equal analyses. The randomized population is typically called the "full analysis population" (technically, it is the "intent to treat", or ITT, population). The most rigorous studies should base their endpoints on the ITT population, or at least report outcomes based on ITT analysis.
The vaccine trials do not go by these rules. All the trials select as primary endpoints results from what is known as "per protocol" (PP) analysis. The per-protocol population is a subset of the full analysis population: participants are excluded for various reasons, primarily seropositivity. (According to J&J's appendix, the per-protocol set was about 90% of the full analysis population.)
ITT analysis is preferred to PP analysis because the various exclusions may introduce biases. Covariate balance, which randomization delivers for the full analysis population, may not hold in the PP population.
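To make the concern concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the covariate (age), the sample size, and especially the exclusion rule, which I made depend on both arm and age purely for illustration. It is not a model of J&J's trial or of its actual exclusion reasons.

```python
import random
import statistics

def simulate_trial(n=40000, seed=1):
    """Randomize n hypothetical participants, then apply a post-randomization exclusion."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.gauss(50, 15)  # made-up baseline covariate
        arm = "vaccine" if rng.random() < 0.5 else "placebo"
        # Assumption for illustration only: older vaccine recipients are
        # excluded more often than everyone else.
        p_exclude = 0.5 if (arm == "vaccine" and age > 60) else 0.1
        excluded = rng.random() < p_exclude
        rows.append((arm, age, excluded))
    return rows

def mean_age(rows, arm, per_protocol):
    """Mean age in one arm, for the full (ITT) set or the per-protocol subset."""
    ages = [age for a, age, excluded in rows
            if a == arm and not (per_protocol and excluded)]
    return statistics.mean(ages)

rows = simulate_trial()
itt_gap = mean_age(rows, "vaccine", False) - mean_age(rows, "placebo", False)
pp_gap = mean_age(rows, "vaccine", True) - mean_age(rows, "placebo", True)
print(f"ITT age gap (vaccine - placebo): {itt_gap:+.2f} years")
print(f"PP  age gap (vaccine - placebo): {pp_gap:+.2f} years")
```

Under these made-up settings, the two arms should have nearly the same mean age at randomization, while the vaccine arm should come out noticeably younger in the per-protocol subset. Whether anything like this happens in a real trial depends entirely on how the exclusions relate to arm and covariates, which is exactly why the per-protocol comparison needs to be shown.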
***
On page 3 of the NEJM article, the authors stated:
"Trial enrollment began on September... Table S2 shows case numbers in each country..."
Flipping to the supplementary appendix, readers find Table S2 which classifies Covid-19 cases in the "per protocol set". Reasonable, given that the primary endpoints in the paper correspond to this subset of participants.
The authors continued:
"The characteristics of the participants at baseline were balanced between trial groups (Table S3) and were generally representative of the population at risk for Covid-19 in the United States."
The purpose of this sentence is to assure readers that the two arms of the trial were balanced, and therefore we do not have to worry about biases introduced by exclusions. Certainly, anyone who does not flip to the supplement would presume so. If they do open the PDF, they will find the following table:
Surprise. This table describes the "full analysis set", that is to say, the randomized population prior to those exclusions. The two arms at randomization (not to be confused with at baseline) are theoretically balanced: checking covariate balance in this population is in essence checking whether the random number generation worked as expected or, put differently, that random chance did not generate an outlier event.
What data analysts care about - and expect to see in Table S3 - is the comparison of the vaccine and placebo arms in the per-protocol population, which is the basis of all of the key results in the paper. Balance in the full analysis population tells us nothing about balance in the per-protocol population.
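As a rough sketch of the comparison one would hope to see, the snippet below computes a standardized mean difference (a common balance metric) for one covariate, first in the full analysis set and then in the per-protocol subset. The ten participants, their ages, and the `in_pp` flags are invented for illustration; none of this comes from the paper or its appendix.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

# Hypothetical participants: (arm, age, in_pp), where in_pp marks
# membership in the per-protocol set.
participants = [
    ("vaccine", 30, True), ("vaccine", 40, True), ("vaccine", 50, True),
    ("vaccine", 60, True), ("vaccine", 70, False),
    ("placebo", 30, True), ("placebo", 40, True), ("placebo", 50, True),
    ("placebo", 60, True), ("placebo", 70, True),
]

def ages(arm, per_protocol):
    """Ages in one arm, for the full set or the per-protocol subset."""
    return [age for a, age, in_pp in participants
            if a == arm and (in_pp or not per_protocol)]

smd_full = standardized_mean_difference(ages("vaccine", False), ages("placebo", False))
smd_pp = standardized_mean_difference(ages("vaccine", True), ages("placebo", True))
print(f"SMD, full analysis set: {smd_full:+.3f}")
print(f"SMD, per-protocol set:  {smd_pp:+.3f}")
```

In this toy data, the arms are identical at randomization, and a single exclusion is enough to move the per-protocol figure well past the 0.1 mark that analysts often treat as a sign of imbalance. A real Table S3 for the per-protocol set would report this kind of comparison for every baseline covariate.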
As readers, we are left with two disconnected pieces. We have covariate balance in the full analysis population but no outcomes to evaluate. We have primary endpoints for the per-protocol population but no evidence of covariate balance.
This is what I call a professional foul.
Surely the scientists - who all know how to run RCTs - understand this point but they decided - for whatever reason - not to print the proper comparison. Other analysts may look the other way, because if they were in a similar situation, they may do the same thing.
***
There is one difference from the sports context. Here, the referees - the journal reviewers - are not calling the professional foul; it's not an automatic yellow or red card.
[P.S. 3-1-22 Edited the section about the definition of per-protocol population.]