In a prior post, I noted that in a Danish study of real-world effectiveness of the Pfizer vaccine among nursing home residents and healthcare workers, the researchers applied what they call a "**calendar time adjustment**". This adjustment is necessary because most unvaccinated person-hours were accumulated in the early weeks of the study and most vaccinated person-hours in the later weeks, while the overall rate of infection was dropping precipitously throughout the study period. As a result, even if the vaccine were 0% effective, a simple analysis would falsely conclude that it is highly effective (in that post, I showed how the VE can be computed as 67%). This is a real-life example of a statistical paradox known as **Simpson's paradox**.
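To make the mechanism concrete, here is a minimal Python sketch. Every number in it is invented for illustration (two periods, made-up person-time and infection rates); it is not the Danish data and not the calculation from the prior post. The vaccine is given zero effect by construction: within each period, both groups face the same infection rate.

```python
# Hypothetical illustration: all numbers below are invented, not the Danish data.
# The vaccine has zero effect: within each period both groups share one infection rate.
periods = {
    "early": {"rate": 0.010, "pt_unvax": 9000, "pt_vax": 1000},
    "late": {"rate": 0.002, "pt_unvax": 1000, "pt_vax": 9000},
}


def ve(cases_vax, pt_vax, cases_unvax, pt_unvax):
    """VE = 1 - (vaccinated infection rate / unvaccinated infection rate)."""
    return 1 - (cases_vax / pt_vax) / (cases_unvax / pt_unvax)


def crude_ve(periods):
    """Pool all periods together, ignoring calendar time."""
    cases_vax = sum(p["rate"] * p["pt_vax"] for p in periods.values())
    cases_unvax = sum(p["rate"] * p["pt_unvax"] for p in periods.values())
    pt_vax = sum(p["pt_vax"] for p in periods.values())
    pt_unvax = sum(p["pt_unvax"] for p in periods.values())
    return ve(cases_vax, pt_vax, cases_unvax, pt_unvax)


def ve_by_period(periods):
    """Compute the VE separately within each calendar period."""
    return {
        name: ve(
            p["rate"] * p["pt_vax"], p["pt_vax"],
            p["rate"] * p["pt_unvax"], p["pt_unvax"],
        )
        for name, p in periods.items()
    }


print(f"Crude VE: {crude_ve(periods):.0%}")
for name, value in ve_by_period(periods).items():
    # abs() avoids printing a negative zero caused by floating-point rounding
    print(f"VE within the {name} period: {abs(value):.0%}")
```

With these made-up inputs, the pooled VE comes out at roughly 70% even though the VE within each period is zero. The entire "effect" comes from where the person-time sits on the calendar.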

Does "calendar time adjustment" cure the bias? It's hard to say because the researchers did not publish their model. It's not clear how they incorporated the adjustment. Is it a simple linear term? Is it an interaction term? Is time measured in days, weeks, or months? There are many possibilities. Regardless, the calendar time adjustment dramatically shifted the estimated VE from 96% (counting only cases from 2D+7) to 64% (bottom row of the table excerpt shown below).

But how is the "adjusted" VE to be interpreted?

In reality, this number can't be interpreted without knowing the structure of the model they used. So, I will describe a common type of adjustment, found frequently in the typical social science journal. This adjustment involves adding a linear week-of-study main effect to the regression model. It's one of the simplest possible tweaks.

In the regression model with adjustment, we assume two factors affect the ratio of infection rates: vaccination status, and week-of-study. (By contrast, in the unadjusted model, infection ratios are affected only by vaccination status. This works for a clinical trial in which other factors are balanced by randomization. It assumes no other factors are biased.) These two factors are themselves correlated: the later the week, the more likely someone is vaccinated. Using this model, the researchers found that the VE is 64% *in the first week of the study period*. The regression model also outputs an effect of week-of-study (not disclosed in the paper). For argument's sake, let's say this is -4% per week (expressed in VE units). This means in the second week of the study period, the VE is 60%, then 56% in the third week, and so on.
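The arithmetic of this illustration can be sketched in a few lines of Python. This is only the simplified "VE units" version described above, with the assumed slope of 4 percentage points per week; the paper does not disclose its actual model, and the names `VE_WEEK_1`, `WEEKLY_CHANGE` and `implied_ve` are mine.

```python
# Hypothetical sketch of the illustration above; the paper does not disclose its model.
VE_WEEK_1 = 64      # percent: the adjusted VE, taken as the first-week value
WEEKLY_CHANGE = -4  # percentage points per week, assumed "for argument's sake"


def implied_ve(week):
    """VE (in percent) implied by a linear week-of-study main effect."""
    return VE_WEEK_1 + WEEKLY_CHANGE * (week - 1)


weeks = range(1, 6)
ve_values = [implied_ve(w) for w in weeks]
# Difference in VE between each pair of consecutive weeks
drops = [earlier - later for earlier, later in zip(ve_values, ve_values[1:])]

print(ve_values)  # [64, 60, 56, 52, 48]
print(drops)      # [4, 4, 4, 4]
```

Whatever the data look like, a model of this form can only return the same week-to-week change.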

Notice that the drop in VE is 4% between any two consecutive weeks of the study. This is obviously false based on the non-linear drop in overall infection rates from week to week. But if the analyst specifies a linear model, the result is a linear effect. Linearity is a consequence of the structural assumption, not of the data. The above discussion builds in several other structural assumptions*. The success of the adjustment depends on the correctness of these assumptions.

Not adjusting for a clear time bias is definitely bad - and leads to a vastly overstated VE. It's just not clear whether the adjustment is good enough, since the authors did not disclose the structure of their model.

* For example, the reference week does not have to be the first week of the study; it could be the last week, or any week of the study. The effect of week-of-study can vary depending on the effect of vaccination status.

**The Burden of Case-Counting Windows**

I know that the 64% VE estimate is still too high for a different reason. This brings me back to a topic you've seen on my blog a zillion times - the infamous **case-counting window**. Take a look at the results again:

Just like every other study, these researchers compute four different VEs based on four different case-counting windows. As discussed in this previous blog post, the VE increases as you run down the table because more and more confirmed cases are removed from the case count, effectively turning cases into non-cases.

[The red highlight is my own. This is a significant negative VE observed in the two weeks after the first dose of Pfizer, the period that is carved out of most studies. I see this as another real-world confirmation that one dose does not give sufficient immunity.]

Below is a new graphic showing what researchers are actually doing to the case curve as they apply different case counting windows. (I'm taking the definitions of case-counting windows used in the Danish study and superimposing them on the Johnson & Johnson case curve).

In the clinical trials, the same case-counting rules are applied to both vaccine and placebo groups. It's simple enough because the placebo group got two placebo shots.

In a real-world study, the unvaccinated group are those who reach the end of the study period *without having taken any shots*. Therefore, a case-counting window such as "14 days after 1st dose until 2nd dose" has no meaning for the unvaccinated group. As a result, in real-world studies, **cases are nullified only for the vaccinated group**. For each of the four calculations in the Danish report, the same number of cases was used to compute the infection rate of the unvaccinated group while fewer and fewer cases were counted in the vaccinated group as you run down the rows. This issue alone vastly overstates the VE when compared to the same case-counting rules applied from clinical trials. [This problem is avoided in matching studies in which the unvaccinated people are assigned a "day 0". Each methodology has both pros and cons.]
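Here is a rough Python sketch of that mechanic. Every number is invented, the window start days are assumptions (I place the second dose on day 28), and follow-up is simplified to a fixed 60 days per person. The vaccinated group loses both the cases and the person-days that fall before the window starts, while the unvaccinated rate is computed once and never changes.

```python
# Hypothetical illustration: every number here is invented, not taken from any study.
FOLLOW_UP_DAYS = 60
N_VAX = 1000
N_UNVAX = 1000

# Day (counted from the first dose) on which each vaccinated case was confirmed.
vax_case_days = [2, 3, 5, 6, 8, 9, 11, 12, 13, 16, 20, 25, 30, 40, 50]

# The unvaccinated group has no "day 0", so all of its cases and person-days count.
unvax_cases = 30
unvax_rate = unvax_cases / (N_UNVAX * FOLLOW_UP_DAYS)


def ve_for_window(start_day):
    """VE when vaccinated cases and person-days before start_day are excluded."""
    cases = sum(1 for day in vax_case_days if day >= start_day)
    person_days = N_VAX * (FOLLOW_UP_DAYS - start_day)
    return 1 - (cases / person_days) / unvax_rate


# Window start days are assumptions: the second dose falls on day 28 in this sketch.
windows = {"all cases": 0, "1D+14": 14, "2D": 28, "2D+7": 35}
ve_by_window = {label: ve_for_window(start) for label, start in windows.items()}

for label, value in ve_by_window.items():
    print(f"{label:>9}: VE = {value:.0%}")
```

With these made-up inputs, the VE climbs from 50% to 84% as the window starts later. The size of the climb depends on my invented case distribution, in which vaccinated cases cluster in the early days; the point is only that the unvaccinated side of the ratio stays fixed throughout.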

The use of case-counting windows means that **there is zero chance a real-world study can replicate the findings of a clinical trial**. Another problem looms for future studies: Johnson & Johnson is a one-dose vaccine, so the frequently cited window of 2D+14 (14 days after the second dose) is also undefined. I'm sure J&J will cry foul if researchers count all cases after 1D+14 for J&J and only cases after 2D+14 for Pfizer! Alas, the unvaccinated people are not represented by a pharma company although, as far as we know, they do pay their taxes.

Nevertheless, I don't believe that the primary purpose of real-world studies should be to replicate the findings of a clinical trial - as I said before, this is like asking a C student to grade the work of an A student. Instead, real-world studies should be used to fill in the gaps of our knowledge.

**On to Today's Hot Potato**

Yesterday, the media breathlessly recited the headlines from a new CDC study of healthcare workers and essential workers, saying that the mRNA vaccines are 90% effective, and 80% effective for "partial vaccinations".

It turns out that this CDC study uses a methodology very similar to the Danish study. So we can apply what we learned from the Danish study directly.

What do those numbers mean?

- Let's start with the positive. Participants in the CDC study submitted weekly swabs for testing, and therefore this study measures infections, including asymptomatic cases. (This complicates any comparison with clinical trial results which do not count asymptomatic cases. We should expect the VE to be lower than in trials if we believe asymptomatic transmission to be an important factor.)
- The 90% vaccine effectiveness is computed for a specific case-counting window that starts 14 days after the 2nd dose, and is adjusted for study-site bias.
- In the VE computation, the case-counting window can only be applied to the vaccinated group. Therefore, the infection rate of the unvaccinated group counts all cases in the entire study period. This makes the estimate over-optimistic.
- During the study period (mid December to mid March), the overall infection rate was dropping precipitously in the U.S. as it did in Denmark. However, the CDC analysis does not adjust for calendar time at all. This is another source of over-estimation.
- The unvaccinated group was heavily biased toward several states with much worse infection rates than the vaccinated group. The published VE estimate incorporates a study-site adjustment. (This is similar in concept to the calendar-time adjustment described above.)
- The unvaccinated group comprises people who are different from those in the vaccinated group on numerous demographic variables. While they adjusted for study site (i.e. residence) bias, they did not correct the gender, race, and occupation biases.

I will be back with a longer discussion of the CDC study soon. The "partial vaccination" analysis requires more space than I have here.

The obsession with 90% is perplexing because a vaccine does not need to be 90% effective to be a useful tool to fight the pandemic. This fixation is creating the wrong impression that anything less than 90% is unworthy.
