I'll be writing a number of posts about the observational studies of the Pfizer booster published by the Clalit health insurer in Israel (back in December 2021), which are the primary source supporting the effectiveness of the booster shot (3rd shot). These studies will provide fodder for classes on observational studies for years to come because (a) there are many nooks and crannies in which residual biases hide, and (b) the researchers made many assertions (wrong, of course) about having corrected all biases using the simplest, most conventional methods.
This first post points out the biggest problem with such studies, which is perhaps impossible to correct.
Let's first review the basic setup of such studies. (The devil is of course in the details, and I'll dive into those in future posts.)
Each study has a well-defined start and end date. This is not a clinical trial, so no one proactively enrolls in this study. Instead, the researchers had access to databases with relevant data and extracted the study population from it.
The start date is chosen close to the launch of the booster shot in Israel; therefore, at the start, almost everyone was in the "no booster" group. Over the course of the study, most of these same people migrated to the "booster" group when they took the booster. Even though the study spanned fewer than 2 months, almost everyone (90%) had received the booster shot by the end date.
Here is the first thing one must recognize. This is not a typical clinical trial in which one person belongs to either the treatment group or the control group but not both. In this study, nine out of ten people featured in both the "no booster" and the "booster" groups.
So when researchers subsequently compare the "no booster" group and the "booster" group, they talk about exposure time rather than the number of people exposed. The exposure time of the "no booster" group is incurred mostly by people who eventually got the booster within the study period. This is key observation #1.
If we now home in on an individual who got the booster shot during the study period, the entire observation time is divided into two parts: from the start of the study to the day of the booster ("no booster" exposure), and from the day of the booster to the end of the study ("booster" exposure). It is 100% certain that the exposure time in the "no booster" group precedes the exposure time in the "booster" group. This is key observation #2.
Moreover, observation #2 does not depend on the date of the booster, which varies by person. Since each individual is exposed first to "no booster" and then to "booster", collectively the exposure time of the "no booster" group is skewed towards the start of the study period, while the exposure time of the "booster" group is skewed towards the end of the study period.
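To make the skew concrete, here is a minimal Python sketch. Everything in it is a made-up illustration, not data from the Clalit study: a hypothetical cohort of 100 people followed for 50 days, of whom 90 receive the booster on staggered days. The function names `exposure_by_day` and `mean_day` are my own. The sketch counts person-days per calendar day in each group and then computes the person-time-weighted average day for each.

```python
def exposure_by_day(booster_days, study_days):
    """Return (no_booster, booster) person-day counts for each calendar day.

    booster_days holds one entry per person: the 0-based day on which the
    booster was received, or None if the person was never boosted.
    A person counts as "no booster" before that day and "booster" from it on.
    """
    no_booster = [0] * study_days
    booster = [0] * study_days
    for b in booster_days:
        for day in range(study_days):
            if b is None or day < b:
                no_booster[day] += 1
            else:
                booster[day] += 1
    return no_booster, booster


def mean_day(person_days):
    """Person-time-weighted average calendar day of a group's exposure."""
    total = sum(person_days)
    return sum(day * n for day, n in enumerate(person_days)) / total


# Hypothetical cohort (assumption): 50-day study, 90 of 100 people boosted
# on days 1..45, two people per day; 10 people never boosted.
STUDY_DAYS = 50
booster_days = [1 + (i % 45) for i in range(90)] + [None] * 10

no_booster, booster = exposure_by_day(booster_days, STUDY_DAYS)
print("mean day of 'no booster' exposure:", round(mean_day(no_booster), 1))
print("mean day of 'booster' exposure:   ", round(mean_day(booster), 1))
```

Whatever staggering of booster dates you plug in, the "no booster" average day comes out earlier than the "booster" average day, because each boosted person contributes their early days to one group and their late days to the other.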
What else do we know about the booster study (or, the majority of Covid vaccine studies)? These studies almost always begin during a peak infection period, which means the infection rate dropped during the study. This is key observation #3.
Together, our three observations mean that the exposure time associated with the non-booster group is skewed towards the front end of the study period, which also happens to be a peak infection period, while the exposure time associated with the booster group is skewed towards the back end, when infections are much lower.
For the Israel Booster study, the authors essentially confirmed the existence of such bias with the following sentences:
"During this time [ed: study period],the incidence of Covid-19 in Israel was one of the highest in the world."
"The incidence of Covid-19 and thus exposure to SARS-CoV-2 changed during the study period."
Given that at the start of the study period, Israel's infection rate was "one of the highest in the world" and that the incidence and exposure subsequently "changed," this change represents a decrease in incidence and exposure. (The data can be found here. One of the charts shows the 60 and over age group.)
Reflecting the current state of medical "science," the authors didn't make the above statements in order to explain a source of bias. Instead, they used the first statement to explain why their study could achieve statistical significance on mortality due to Covid, which is a rare outcome, and they cited the second statement in order to make the false claim: "we assume that after adjustment for all covariates, including socioeconomic status, these changes had a similar effect in the booster group and the non-booster group."
As I just showed, their "assumption" does not hold.
***
This is another example of the "background infection rate" bias that we described in our bias paper (link). In that paper, we used the example of a Danish study, which also started when the infection rate peaked; by the end of that study period, the infection rate had dropped to almost zero.
If the booster were completely useless, the average case rate in the no booster group would still be higher than the average case rate in the booster group because of differential exposure time against a decreasing background infection rate. This produces a seemingly high vaccine effectiveness even though the true effectiveness is zero by assumption!
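Here is a minimal Python sketch of that thought experiment. All inputs are hypothetical assumptions of mine, not figures from the Israeli or Danish studies: 100 people, a 50-day study, 90 boosted on staggered days, and a background daily infection risk that falls linearly over the study. The booster is given no effect at all, since the same daily risk applies to everyone. For simplicity the sketch works with expected case counts and does not remove people from follow-up after infection.

```python
def crude_rates(booster_days, daily_risk):
    """Expected cases per person-day in each group when the booster does nothing.

    daily_risk[d] is the background infection risk on day d, applied equally
    to every person regardless of booster status (true effectiveness = 0).
    """
    pd_no = pd_yes = 0
    cases_no = cases_yes = 0.0
    for b in booster_days:
        for day, risk in enumerate(daily_risk):
            if b is None or day < b:
                pd_no += 1
                cases_no += risk
            else:
                pd_yes += 1
                cases_yes += risk
    return cases_no / pd_no, cases_yes / pd_yes


STUDY_DAYS = 50
# Hypothetical cohort (assumption): 90 of 100 boosted on days 1..45.
booster_days = [1 + (i % 45) for i in range(90)] + [None] * 10
# Hypothetical background risk (assumption): falls from 0.002 to 0.0004 per day.
falling_risk = [0.002 - 0.0016 * d / (STUDY_DAYS - 1) for d in range(STUDY_DAYS)]
# Comparison case: constant background risk.
flat_risk = [0.001] * STUDY_DAYS

rate_no, rate_yes = crude_rates(booster_days, falling_risk)
apparent_ve = 1 - rate_yes / rate_no
flat_no, flat_yes = crude_rates(booster_days, flat_risk)
flat_ve = 1 - flat_yes / flat_no

print(f"apparent VE, falling background rate: {apparent_ve:.0%}")
print(f"apparent VE, flat background rate:    {flat_ve:.0%}")
```

With a flat background rate the useless booster correctly shows an effectiveness of zero; with a falling rate it shows a positive one. The only thing that differs between the two runs is the trend in the background infection rate.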
Adjusting for "all covariates" does not cure this bias whatsoever.
***
What can be done about such bias?
For one thing, better data disclosure. All these studies should show the distribution of exposure time, by treatment vs control group, over the study period. On top of that, we can superimpose the background infection rate. Such an analysis discloses the source of bias.
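As a rough idea of what such a disclosure could look like, here is a short Python sketch that tabulates person-days by group for each study week next to the average background incidence for that week. The cohort, the incidence series, and the function name `weekly_exposure_table` are hypothetical placeholders of mine; a real analysis would substitute the study's own booster dates and the published incidence data.

```python
def weekly_exposure_table(booster_days, daily_incidence):
    """One row per study week: (week, no-booster person-days,
    booster person-days, mean background incidence in that week)."""
    rows = []
    n_days = len(daily_incidence)
    for start in range(0, n_days, 7):
        days = range(start, min(start + 7, n_days))
        # Person-days spent unboosted in this week
        no_b = sum(1 for b in booster_days for d in days if b is None or d < b)
        # Everyone else's person-days in this week are boosted
        yes_b = len(booster_days) * len(days) - no_b
        incidence = sum(daily_incidence[d] for d in days) / len(days)
        rows.append((start // 7 + 1, no_b, yes_b, incidence))
    return rows


# Hypothetical inputs (assumptions): 50-day study, 90 of 100 people boosted on
# days 1..45, background incidence falling from 100 to 26.5 cases per 100,000.
booster_days = [1 + (i % 45) for i in range(90)] + [None] * 10
daily_incidence = [100 - 1.5 * d for d in range(50)]

table = weekly_exposure_table(booster_days, daily_incidence)
for week, no_b, yes_b, inc in table:
    print(f"week {week}: no-booster {no_b:5d}  booster {yes_b:5d}  incidence {inc:6.1f}")
```

Reading down such a table, a reader can see at a glance whether the weeks that dominate one group's exposure time are also the weeks with the highest background incidence.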
Next, attempt to do the same analysis for a different time period, especially one in which the infection rate trends in the opposite direction. In practice, this may be hard to accomplish if the public health department decided to launch vaccines around an infection peak.