You have heard it in the media: real-world studies of high-risk people such as nursing home residents or healthcare workers offer definitive proof that the vaccines' true efficacy is <insert a percentage above 90%>!!

These were the people who got the vaccines first, they are at greatest exposure to the virus, more of them are fully vaccinated (two shots), they have the highest risk of death, they are easier to track down. Ergo, the result is rock-solid, beyond criticism!

I wish someone had had the foresight to run a quick randomized trial while delivering vaccines to these groups; then we would have a good dataset from which to draw some useful conclusions.

Alas, no one did (as far as we know), which means all these studies are real-world observational studies. I have discussed two examples of these studies (Mayo Clinic, Israel's Clalit health fund), both analyzing everyone in their customer database using the matching technique. Matching is used to create a synthetic control group (after the fact) in the hope of emulating the control group of a randomized controlled trial (RCT). After matching, we apply methods commonly used to analyze RCTs, as if the conditions of an RCT have been satisfied.

Today, I'm examining a Danish study of the Pfizer vaccine in two high-risk groups: nursing home residents and healthcare workers (link). This is an example of those real-world studies of high-risk groups that have so excited the punditry.

**Counting People No Longer Works**

Let's start with a simple calculation, and I'll explain why it fails. The study period is from December 27, 2020 to February 18, 2021, just under 2 months starting from the day Denmark initiated vaccinations. Based on Figure 1 at the back of the paper, I estimated that during this period, there were about 1,675 confirmed cases in Denmark among the nursing home population. The researchers said 382 cases were discovered among the vaccinated nursing-home residents, which leaves roughly 1,294 cases confirmed for unvaccinated residents. (The actual numbers used in their calculations are slightly lower because of exclusions but this doesn't matter for what I'm about to say.)

During the study period, 37,172 nursing home residents got at least one Pfizer shot, so the case rate for the vaccinated is 382/37,172 = 1%. For the unvaccinated group, I use the total number of nursing home residents who had not been infected as of December 27, which was 39,050. So the case rate for the unvaccinated is 1,294/39,050 = 3.3%. The ratio of those case rates is about 0.3, and vaccine effectiveness (VE) is one minus that ratio, or about 70%.
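Here is a minimal Python sketch of that naive per-person calculation, using only the counts quoted above. The function name `vaccine_effectiveness` is my own label for illustration. Note that with unrounded case rates the result comes out closer to 69%; the 70% figure follows from using the rounded rates of 1% and 3.3%.

```python
# Naive per-person calculation, using the counts quoted in the text.
cases_vax, people_vax = 382, 37_172        # vaccinated nursing home residents
cases_unvax, people_unvax = 1_294, 39_050  # residents not infected as of Dec 27


def vaccine_effectiveness(rate_vax: float, rate_unvax: float) -> float:
    """VE = 1 - (case rate among vaccinated / case rate among unvaccinated)."""
    return 1.0 - rate_vax / rate_unvax


rate_vax = cases_vax / people_vax
rate_unvax = cases_unvax / people_unvax
ve_naive = vaccine_effectiveness(rate_vax, rate_unvax)

print(f"case rate, vaccinated:   {rate_vax:.1%}")
print(f"case rate, unvaccinated: {rate_unvax:.1%}")
print(f"naive VE:                {ve_naive:.0%}")
```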

This calculation sounds vanilla but has a serious problem. Each nursing home resident who is counted as vaccinated is also counted as unvaccinated, because on December 27, 2020, the first day of vaccinations, everyone is unvaccinated. Then, over the course of time, those who have not been infected are eligible to get vaccinated. Upon vaccination, the nursing home resident transitions from being counted as unvaccinated to being counted as vaccinated. By the end of the study period, only about 5% of the residents are still unvaccinated. So about 95% of this population is counted twice, and in each case, the person spends the first part of the timeline as unvaccinated and the second part as vaccinated.

Moreover, the proportion of time spent as unvaccinated varies depending on the day of vaccination. To deal with this issue, we change the denominator of the case rate. Instead of cases per person, we use **cases per person-week**. Each person is still counted twice, but in any given week, a specific resident is counted just once, either on the vaccinated arm or the unvaccinated arm but not both.

There were roughly 148K person-weeks accumulated on the vaccinated arm, and 164K on the unvaccinated arm, leading to new case rate estimates of 0.26 and 0.79 cases per 100 person-weeks, respectively. This leads to a VE of 67%.
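The same arithmetic with person-weeks as the denominator can be sketched as follows. The person-week totals are the rounded figures quoted above (148K and 164K), so treat the output as approximate.

```python
# Person-week calculation, using the rounded totals quoted in the text.
cases_vax, person_weeks_vax = 382, 148_000
cases_unvax, person_weeks_unvax = 1_294, 164_000

# Case rates expressed per 100 person-weeks.
rate_vax_per_100 = 100 * cases_vax / person_weeks_vax
rate_unvax_per_100 = 100 * cases_unvax / person_weeks_unvax

# VE is one minus the ratio of the two rates.
ve_person_weeks = 1 - rate_vax_per_100 / rate_unvax_per_100

print(f"vaccinated:   {rate_vax_per_100:.2f} cases per 100 person-weeks")
print(f"unvaccinated: {rate_unvax_per_100:.2f} cases per 100 person-weeks")
print(f"VE:           {ve_person_weeks:.0%}")
```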

(In the second half of this post, I'll show why this 67% number is compatible with a completely useless vaccine. Stay tuned.)

**Matching No Longer Works**

For any real-world study, we ask why we can believe that the group labeled unvaccinated and the group labeled vaccinated are identical except for their vaccination status. In prior studies I featured on this blog, the researchers relied on the matching technique to assert this claim. Matching is infeasible here because the entire study is centered on two high-risk groups, for whom the progress of vaccination was rapid. Among the nursing home residents, about 5 percent were vaccinated during the first week, reaching almost 95 percent by the end of the study period.

The Danish researchers treated this issue differently from the Clalit researchers.

In the Israeli study, the investigators matched each vaccinated person with an unvaccinated one on a rolling basis through the study period, and they ended follow-up on both members of each pair when the unvaccinated person received the first shot. This matching and censoring scheme creates many undesirable side effects (for details, see my prior post):

- the follow-up time is severely limited because most matched, unvaccinated patients quickly became vaccinated, stopping the count on both members

- cases on the vaccinated arm are not counted because of the early stopping described above

- the number available for matching shrinks rapidly throughout the study period, leaving the majority of high-risk subgroups unmatched, and dropped from the study.

By contrast, in the Danish study, few are dropped (only those who had been infected prior to the first day of vaccination). Each person is followed through the entire study period (until they get infected, die, or become untrackable). Anyone who eventually got vaccinated contributes to unvaccinated person-weeks until the day of vaccination, and to vaccinated person-weeks after that. The researchers don't have extra early-stopping rules triggered by something that happens to an unvaccinated person, so every case is counted (except when they play around with case-counting windows, which I'll cover in a future post).
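To make the bookkeeping concrete, here is a rough sketch of how one person's follow-up might be split between the two arms. This is my own illustration, not the researchers' code: the function `split_person_time`, its arguments, and the choice to count the vaccination day itself as vaccinated time are all assumptions.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_START = date(2020, 12, 27)
STUDY_END = date(2021, 2, 18)


def split_person_time(
    vaccinated_on: Optional[date],
    exit_on: Optional[date] = None,
) -> Tuple[int, int]:
    """Return (unvaccinated_days, vaccinated_days) for one person.

    exit_on is the date follow-up stops early (infection, death, or loss
    to follow-up); None means the person is followed to the study end.
    Assumption: the day of vaccination counts as vaccinated time.
    """
    end = min(exit_on, STUDY_END) if exit_on is not None else STUDY_END
    if vaccinated_on is None or vaccinated_on >= end:
        # Never vaccinated while under follow-up: all time is unvaccinated.
        return (end - STUDY_START).days, 0
    switch = max(vaccinated_on, STUDY_START)
    return (switch - STUDY_START).days, (end - switch).days


# Hypothetical residents, for illustration only.
print(split_person_time(date(2021, 1, 10)))                     # vaccinated Jan 10
print(split_person_time(date(2021, 1, 10), date(2021, 1, 20)))  # infected Jan 20
print(split_person_time(None))                                  # never vaccinated
```

Dividing the day counts by 7 gives person-weeks. The point to notice is that unvaccinated time always comes first on the calendar for any person who contributes to both arms.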

But there is one big problem with this design, which the researchers recognized and tried to resolve.

**Vaccination Timing Bias**

Because of the design described above, cases counted as unvaccinated typically occur earlier in the study period than cases counted as vaccinated. If you think about any given individual, the time progression is always from unvaccinated to vaccinated. So earlier infection dates are correlated with being unvaccinated, and later dates with being vaccinated.

This is particularly problematic because vaccinations were happening against the backdrop of a general decline in cases post-holiday, at least partially related to a national lockdown that started in Denmark a week before vaccinations began. To deal with this bias, the Danish researchers added a "**calendar-time adjustment**" to their **regression model**.

Such an adjustment is better than nothing, but unfortunately it isn't sufficient. In the rest of this post, I'll walk you through a numerical example to show how this timing bias is a real-world example of what statisticians call a "**Simpson's paradox**". Then, in the next post, I'll discuss what a regression adjustment does and why it doesn't get rid of the problem, and I'll also address other residual biases.

**Simpson's Paradox**

For this demo, I start with an assumption that the vaccine is useless (VE = 0). Then, I show that using the usual way that we compute vaccine effectiveness, we will conclude that the vaccine is extremely effective. This "paradox" is of the Simpson's type, well known to statistics students.

As mentioned already, the Danish study attempts to correct for this issue. This demo explains why they must do something about it. It also provides an estimate of the magnitude of this bias (huge).

As background, we start with some real data taken from Figure 1 of the paper: the progress of vaccinations during the study period, and the weekly trend of cases in nursing homes.

Note in particular that the case rate (gray) dropped markedly over the course of the study.

Assuming VE = 0 means each week, the split of cases between the vaccinated group and the unvaccinated group is equal to the split of people between the two groups. This assumption makes the group-level case rates equal each week. The VE in any week is zero since VE is based on the ratio of group-level case rates. Like this:
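A small sketch of what the VE = 0 assumption does within a single week: cases are split in proportion to group size, so both groups end up with the same case rate. The weekly counts below are made up for illustration and are not taken from the paper; `allocate_cases` is a hypothetical helper.

```python
def allocate_cases(total_cases: float, people_vax: float, people_unvax: float):
    """Split a week's cases between the groups in proportion to group size.

    This is what a useless vaccine (VE = 0) implies: being vaccinated
    changes nothing about a person's chance of becoming a case.
    """
    share_vax = people_vax / (people_vax + people_unvax)
    cases_vax = total_cases * share_vax
    return cases_vax, total_cases - cases_vax


# HYPOTHETICAL week: 300 cases, 10,000 vaccinated, 30,000 unvaccinated.
people_vax, people_unvax = 10_000, 30_000
cases_vax, cases_unvax = allocate_cases(300, people_vax, people_unvax)

rate_vax = cases_vax / people_vax
rate_unvax = cases_unvax / people_unvax
weekly_ve = 1 - rate_vax / rate_unvax

print(f"cases: {cases_vax:.0f} vaccinated, {cases_unvax:.0f} unvaccinated")
print(f"rates: {rate_vax:.2%} vs {rate_unvax:.2%}, weekly VE = {weekly_ve:.0%}")
```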

So far, so good. Now, we compute a cumulative VE using the standard method. We sum the cases and the person-weeks over all weeks, and then divide the two to get an overall VE for the entire study period. This is exactly the calculation I did in a previous section, which led to a VE of 67%.

I hope your head is where mine is: WTF!

We forced VE to be 0% every week of the study period (by assumption), and yet the cumulative VE for the entire period turns into 67%. This is known as a "**Simpson's paradox**". The aggregated statistic leads to a startlingly different conclusion than the disaggregated (weekly) statistics.
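If you want to see the paradox appear with your own eyes, here is a toy simulation. To be clear about what is assumed: the weekly case rates, the population size, and the eight-week horizon are hypothetical stand-ins, not the numbers read off Figure 1. The vaccinated share rises in a straight line from 5% to 95%, and I ignore the small number of people who leave the at-risk population after infection. So it will not reproduce the 67% exactly, but it shows the same effect.

```python
# HYPOTHETICAL inputs (not the paper's data).
N_RESIDENTS = 40_000
WEEKS = 8
# Overall weekly case rate, high early and then declining.
weekly_case_rate = [0.012, 0.011, 0.008, 0.005, 0.003, 0.002, 0.001, 0.0005]
# Share of residents vaccinated each week: straight line from 5% to 95%.
share_vax = [0.05 + 0.90 * w / (WEEKS - 1) for w in range(WEEKS)]

sum_cases_vax = sum_cases_unvax = 0.0
sum_pw_vax = sum_pw_unvax = 0.0
weekly_ve = []

for rate, share in zip(weekly_case_rate, share_vax):
    pw_vax = N_RESIDENTS * share          # person-weeks, vaccinated arm
    pw_unvax = N_RESIDENTS * (1 - share)  # person-weeks, unvaccinated arm
    # VE = 0 by construction: both arms get the same case rate this week.
    cases_vax = rate * pw_vax
    cases_unvax = rate * pw_unvax
    weekly_ve.append(1 - (cases_vax / pw_vax) / (cases_unvax / pw_unvax))

    sum_cases_vax += cases_vax
    sum_cases_unvax += cases_unvax
    sum_pw_vax += pw_vax
    sum_pw_unvax += pw_unvax

# Standard cumulative VE: pool cases and person-weeks, then take the ratio.
cumulative_ve = 1 - (sum_cases_vax / sum_pw_vax) / (sum_cases_unvax / sum_pw_unvax)

print("weekly VE:    ", [f"{abs(v):.0%}" for v in weekly_ve])
print(f"cumulative VE: {cumulative_ve:.0%}")
```

With these made-up inputs, every weekly VE is zero and yet the cumulative VE lands above 60%.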

This is the reason why the Danish researchers introduced a "calendar time adjustment" to their analysis. They realized that they must do something to overcome this paradox.

**What's Lurking Behind the Scenes**

We expected the cumulative VE to be 0% since it's just the average of a bunch of zeroes. The problem is it's not a simple average but a weighted average. The weights reflect the progression of vaccination through the study period.

The unvaccinated group is front-loaded and shrinks over time, while the vaccinated group is back-loaded and expands. So the aggregate case rate for the unvaccinated group is weighted toward the first few weeks, while the aggregate case rate for the vaccinated group is weighted toward the last few weeks. Remember that the overall weekly case rate was high in the first two weeks, and then started declining. This explains why the unvaccinated group ends up with a much higher cumulative case rate than the vaccinated group despite our assumption that the vaccine is useless.
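The weighting can be spelled out in a few lines. Each arm's aggregate case rate is the weekly case rates averaged with weights equal to that arm's share of its own person-weeks in each week. As before, the weekly rates are hypothetical and the vaccinated share is assumed to rise linearly from 5% to 95%; the population size cancels out, so I work with shares directly.

```python
# HYPOTHETICAL weekly case rates, shared by both arms because VE = 0.
weekly_case_rate = [0.012, 0.011, 0.008, 0.005, 0.003, 0.002, 0.001, 0.0005]
weeks = len(weekly_case_rate)
share_vax = [0.05 + 0.90 * w / (weeks - 1) for w in range(weeks)]
share_unvax = [1 - s for s in share_vax]

# Weight of each week within an arm = that week's fraction of the arm's
# total person-weeks.
weights_vax = [s / sum(share_vax) for s in share_vax]
weights_unvax = [s / sum(share_unvax) for s in share_unvax]

# Aggregate rate per arm = weighted average of the SAME weekly rates.
agg_rate_vax = sum(w * r for w, r in zip(weights_vax, weekly_case_rate))
agg_rate_unvax = sum(w * r for w, r in zip(weights_unvax, weekly_case_rate))

print("week  case rate  weight(unvax)  weight(vax)")
for i, (r, wu, wv) in enumerate(
    zip(weekly_case_rate, weights_unvax, weights_vax), start=1
):
    print(f"{i:>4}  {r:>9.2%}  {wu:>13.1%}  {wv:>11.1%}")
print(f"aggregate rate, unvaccinated: {agg_rate_unvax:.3%}")
print(f"aggregate rate, vaccinated:   {agg_rate_vax:.3%}")
```

The two arms average identical weekly rates; only the weights differ, and that alone opens the gap between the aggregate rates.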

In the above diagram, you see that for the unvaccinated group (pink), the largest weights coincide with the several weeks with very high case rates while for the vaccinated group (blue), the largest weights fall on the most recent weeks with close to zero case rates. Simpson's paradox is a situation in which aggregating the data removes too much information, and distorts the interpretation.

The magnitude of this distortion is extreme! What should have been 0% shows up as 67%.

(In the preprint, I only have the proportion vaccinated at the start and end of the study. In the demo, I drew a straight line, which means I assumed an even pace through those weeks. The reality is that the earlier weeks were probably slower than the later weeks, which makes the distortion even worse.)

***

The combination of rapid vaccination and a declining overall case rate is prevalent everywhere, so every real-world study, particularly those focused on high-risk groups, faces this problem.

This analysis does not show the vaccine is useless. It shows that if the researcher does not correct for this timing bias, the estimate of vaccine effectiveness is extremely optimistic.

***

My second book, **Numbersense**, opens with an example of Simpson's paradox, and then it keeps getting better. Get your copy here.

In the next post, I explore how the Danish researchers attempted to fix this bias.
