The recent surge in Covid-19 cases in Israel and the U.K. has seriously dented the conventional wisdom that vaccines, and only mRNA vaccines, spell the end of the pandemic. If you've been following my posts on the real-world studies, you should not be surprised that vaccine effectiveness (VE) is not close to the 90% that has been claimed. I outlined many reasons why these real-world estimates contain numerous pro-vaccine biases. In this post, I present estimates of the magnitude of several of these biases. (And they are huge.)

I was able to quantify these biases because Public Health England released some real-world data in a recent study (link), which I mentioned in this previous post (link). (Note that I am not replicating the PHE analysis, but using their data to perform the typical VE analysis.)

Here is the relevant table from the study's supplement:

The column labelled "Total" contains the reported cases of Covid-19 during the study period. I assume that no cases were excluded. Also of interest is the column labelled "Admitted". These are cases resulting in hospitalizations, which serve as a proxy for severe cases.

The table breaks up cases into two age groups (16-49 and 50 and above) and by variants. For my analysis, I combine the variants to get total counts of reported cases of Covid-19.

The following table contains the data used in my analysis.

In addition to the numbers lifted from the UK study, I also estimated the number of people who fall into each vaccination cohort. A vaccination cohort consists of people who received their shots on the same day. The OpenSafely.org project (link) provides estimates of the progress of vaccinations in England, broken up by age group. Starting with those statistics, I derived the number of people in each vaccination cohort. The population estimates are transformed to an index with the unvaccinated cohort set to 100. For example, for every 100 unvaccinated people 50 and older, there are 70 vaccinated people whose dates of vaccination mean they will fall into the (1D to 1D+20) group if they should fall sick.

***

Data analysis should be as simple as possible but not too simple.

The baseline, "first order" analysis is to count all cases that are reported after someone gets the first shot and divide that by the number of people who have taken one or more shots. For the unvaccinated group, divide the cases reported during the study period by the number of people who were unvaccinated.

For the 50+ age group in that UK study, this formula yields a **vaccine effectiveness of 57%**, which means the case rate of vaccinated people is 57% lower than that of unvaccinated people. This shows the vaccines are definitely effective, but one should not expect they will end most infections.
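For readers who want the first-order formula spelled out, here is a minimal sketch. The function name and the counts are hypothetical (the counts are picked only so that the result matches the 57% above; they are not the PHE or OpenSafely figures).

```python
def vaccine_effectiveness(cases_vax, people_vax, cases_unvax, people_unvax):
    """VE = 1 - (case rate among vaccinated) / (case rate among unvaccinated)."""
    rate_vax = cases_vax / people_vax
    rate_unvax = cases_unvax / people_unvax
    return 1 - rate_vax / rate_unvax


# Hypothetical counts, for illustration only: 43 cases per 1,000 vaccinated
# people versus 100 cases per 1,000 unvaccinated people.
ve = vaccine_effectiveness(43, 1000, 100, 1000)
print(f"VE = {ve:.0%}")
```

Everything that follows in this post amounts to changing one of the four inputs to this ratio.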

57% isn't the VE number you've read about. One of the reasons is the "case counting window". Some experts opined that vaccines are not supposed to help us until we reach 14 days after the second shot, and so all headline VE values are computed **after nullifying cases outside the case-counting window**. In the UK 50+ dataset, the number of cases in the vaccinated group plunged from 3,647 to 577. That's right, about 84% of the cases got stripped from the numerator. This single act upped the VE from 57% to 93%, a value with which you're more familiar.
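The jump from 57% to 93% can be checked with nothing more than the numbers quoted above. This sketch assumes that only the vaccinated numerator changes (3,647 cases down to 577) while both denominators and the unvaccinated case count stay fixed; the variable names are mine.

```python
baseline_ve = 0.57       # VE counting all cases after the first shot
cases_all = 3647         # vaccinated cases, any time after the first shot
cases_in_window = 577    # vaccinated cases at 2D+14 or later

# VE = 1 - rate ratio, so the rate ratio behind the baseline is 0.43.
rate_ratio = 1 - baseline_ve

# Dropping cases from the vaccinated numerator scales the rate ratio
# by the share of cases that survive the window.
windowed_ratio = rate_ratio * cases_in_window / cases_all
windowed_ve = 1 - windowed_ratio

print(f"VE with case-counting window = {windowed_ve:.0%}")
```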

Nothing changes for the unvaccinated group. That's because unvaccinated people do not get placebo shots, so the 2D+14 time point has no meaning for them. The case-counting window is thus **asymmetrically applied** to remove cases from the vaccinated group but not the unvaccinated group. This remarkable difference between real-world studies and clinical trials has not been addressed.

***

The least complicated way to correct this pro-vaccine bias is to cancel the case-counting window. Doing so results in the 57% vaccine effectiveness calculation which I mentioned already.

A more complicated (and not superior) method is to shrink the denominator in computing the case rate for the vaccinated. We thus subtract cases from the numerator as well as people from the denominator. This adjustment changes the 2D+14 VE from 93% to 74%.

The drastic change results from removing 75% of the at-risk population. It turns out that three out of four vaccinated people are not really at risk of getting infected. Let me explain this counter-intuitive observation. It goes back to the case counting window, which introduces a type of "immortal time bias". Any vaccinated person is immortal after taking the first shot until 14 days after taking the second shot. If this person catches Covid-19 during this time, the analyst removes the case when computing the case rate for vaccinated. In effect, this person is immortal until 2D+14. Bias arises because the same counting rule cannot be applied to the unvaccinated group.

The UK dataset contains only cases between 4/12 and 6/4/2021. Anyone who took a first shot after 3/4/2021 arrives at 2D+14 after June 4th and so is immortal in the UK analysis. By my estimation, only a quarter of the vaccinated English people aged 50 and over can contribute to the case count in the 2D+14 case counting window.

The case rate for any group of people should be the number of reported cases divided by the number of people who can become sick. Someone who hasn't reached the start of the case-counting window has zero chance of becoming a case so a principled way to remedy this problem is to exclude the "immortals". (I, however, prefer the simple approach of canceling the case-counting window.)
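Here is a rough sketch of what excluding the "immortals" does to the arithmetic. It uses only rounded figures from this post (93% and "a quarter"), so it lands near, not exactly on, the 74% quoted earlier; I assume the small gap comes from that rounding.

```python
windowed_ve = 0.93     # headline 2D+14 VE before any denominator fix
at_risk_share = 0.25   # share of vaccinated people who can reach 2D+14 in the study

# Keeping only a quarter of the vaccinated denominator multiplies the
# vaccinated case rate, and hence the rate ratio, by 1 / 0.25 = 4.
adjusted_ratio = (1 - windowed_ve) / at_risk_share
adjusted_ve = 1 - adjusted_ratio

print(f"VE after removing immortals = {adjusted_ve:.0%}")
```

The point of the sketch is the direction and size of the shift: a headline number in the 90s falls into the low 70s once the denominator only includes people who could have been counted.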

***

Let's turn attention to the at-risk population for the unvaccinated group. In computing the case rate for the unvaccinated, analysts often use the number of people who are still unvaccinated by the end of the study period. This calculation underestimates who can contribute to the case count, artificially increasing the case rate for the unvaccinated group.

Some of the cases attributed to the unvaccinated group came from vaccinated people before they got inoculated. For example, someone was infected in early May and received the vaccine in June. This case is counted as unvaccinated because the date of infection precedes the date of vaccination but at the end of the study period, this person is vaccinated.

To deal with this bias, I added 37% more people to the denominator of the case rate for the unvaccinated in the 50+ age group. This shifts the VE from 74% down to 64%.
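The same style of check works for the unvaccinated denominator. This sketch assumes the adjustment is purely a 37% increase in the number of unvaccinated people at risk, with the case counts untouched.

```python
ve_before = 0.74             # 2D+14 VE after removing the immortals
denominator_growth = 1.37    # 37% more people in the unvaccinated denominator

# A larger unvaccinated denominator lowers the unvaccinated case rate,
# which raises the rate ratio by the same factor.
ratio_after = (1 - ve_before) * denominator_growth
ve_after = 1 - ratio_after

print(f"VE after fixing the unvaccinated denominator = {ve_after:.0%}")
```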

***

For the 16-49 age group, the baseline, simple VE counting all cases is 82%. The headline VE using 2D+14 case window without debiasing is 98%. This number drops to 69% with the aforementioned adjustments to the at-risk populations.

(Remember the VE number is a difference expressed as a ratio. An increase from 10 to 15 and an increase from 100K to 150K are both 50% increases.)

***

In the U.S., the CDC decided to discount mild and asymptomatic cases from the "fully vaccinated" population. Mild cases occurring at any time are nullified for the vaccinated group but not for the unvaccinated group.

The UK dataset contains data on hospitalizations so I can estimate the impact of this change in definition - on top of the case counting window.

The change pushes the 2D+14 VE from 93% to effectively 100%. It also elevates the debiased 2D+14 estimate from 64% to 99% for the 50-plus age group. That's because the number of cases counted for the vaccinated group falls from 577 to 12, a drop of 98%, while those for the unvaccinated group did not change.
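A quick sketch of this last step, again using only the figures quoted in this post. It assumes the vaccinated numerator shrinks from 577 cases to the 12 hospitalized cases while the unvaccinated side is left alone.

```python
cases_in_window = 577      # vaccinated cases at 2D+14 or later
admitted_in_window = 12    # of those, cases admitted to hospital

share_kept = admitted_in_window / cases_in_window
drop = 1 - share_kept      # share of vaccinated cases nullified

# Apply the same scaling to the headline and the debiased estimates.
starting_ve = {"headline": 0.93, "debiased": 0.64}
results = {
    label: 1 - (1 - ve) * share_kept
    for label, ve in starting_ve.items()
}

print(f"Drop in vaccinated cases: {drop:.0%}")
for label, ve in results.items():
    print(f"{label} 2D+14 VE, severe cases only: {ve:.1%}")
```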

***

To summarize, the data from the UK study offer a glimpse at how sensitive the real-world vaccine effectiveness metric is to certain case-counting rules. Specifically, the case counting window and the change in definition of "breakthrough infection" each, by itself, cause VE to jump above 90%.

Those are by no means the only biases embedded in the real-world VE formula (we have ignored selection bias for example). A realistic estimate of VE is probably closer to 60%, which is an excellent number.

***

P.S. For those interested in the arithmetic, a key concept in this analysis is the vaccination cohort - people who received the first shots on the same day.

For example, someone who got the first jab on 1/1/2021 on average received the second shot 80 days later, which is on 3/22/2021. If this person gets infected prior to 1/22/2021, the case is counted in the 1D to 1D+20 group. However, this case will not matter to the UK study because only cases reported between April 12 and June 4 are included.

If the case shows up between 1/22/2021 and 4/5/2021, this infection is counted in the 1D+14 to 2D+13 group but it happened before the start of the study window. If someone vaccinated on 1/1/2021 becomes sick after 4/12/2021, this case falls within the study period, as April 12th is the 21st day after this person's second shot.
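The date arithmetic for this example cohort can be sketched in a few lines. The 80-day gap is the average mentioned above; the variable names are mine.

```python
from datetime import date, timedelta

first_dose = date(2021, 1, 1)
dose_gap = timedelta(days=80)          # average gap between shots, as above

second_dose = first_dose + dose_gap
end_of_first_group = first_dose + timedelta(days=21)   # 1D+21
window_start = second_dose + timedelta(days=14)        # 2D+14

study_start, study_end = date(2021, 4, 12), date(2021, 6, 4)

print("Second dose:", second_dose)
print("1D+21:", end_of_first_group)
print("2D+14:", window_start)
print("Days from second dose to study start:", (study_start - second_dose).days)
```

For this cohort, 2D+14 falls before the study starts, so every case it contributes during the study period lands inside the case-counting window.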

Typo:

For example, someone who got the first jab on 1/1/2021 on average received the second shot 80 days later, which is on *3*/22/2021.

Posted by: Antonio | 07/28/2021 at 06:45 PM

Antonio: Thanks for the note. Corrected!

Posted by: Kaiser | 08/01/2021 at 02:00 PM