Today and tomorrow, I’m taking you through another real-world study, one that has become a favorite of the mainstream media: the study by Israel’s largest health fund (insurer) that confirms the effectiveness of the Pfizer and Moderna vaccines. The report has appeared in the prestigious New England Journal of Medicine (NEJM) here (with supplements).
It has come to my attention that my readership is split between those who see my commentary on scientific papers as an anti-science provocation, and those who value reading my notes and have requested more. The goal of my blogs has always been to promote critical thinking, to encourage chewing before you swallow information. My target readers are people who want to stay for the entire movie – even to visit the sets – after watching the trailer. If you’re here, that probably describes you.
What is a Real-World Study?
The Israel study by Clalit is an example of a real-world observational study. I covered the basics of such studies here, and also described a similar study by the Mayo Clinic – nference team here.
In brief, real-world studies analyze data that are collected by healthcare systems, which are much more challenging than experimental data tracked during randomized clinical trials (RCTs). In real-world studies, the difference between the vaccinated and unvaccinated individuals is not just caused by vaccination, but it may be explained by a myriad of other factors, such as age bias, geographical bias, and socio-economic bias. In an RCT, because we use coin tosses to determine someone’s vaccine eligibility, all other factors are equalized, avoiding this mess.
The Clalit study adopts the same high-level strategy as the Mayo Clinic’s, manufacturing a synthetic control group that mimics the placebo group in an RCT. All vaccinated individuals during the study period are considered. Each vaccinated customer (who survives the exclusions) meets a matching algorithm that attempts to locate an unvaccinated individual who is similar on a list of factors, such as age and sex. Unmatched people are dropped from the analysis set. The vaccinated and unvaccinated subgroups can now be treated as if randomized. The researchers then compute vaccine effectiveness (VE) using various case-counting windows and outcome metrics. They select one of these analyses to highlight on the front page.
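For readers who like to see the mechanics, here is a minimal sketch of 1:1 exact matching. It is my own toy illustration, not the Clalit team's code: the records, the field names (age_band, sex) and the function exact_match are all hypothetical, and the sketch ignores calendar time and every other complication of the real study.

```python
from collections import defaultdict

def exact_match(vaccinated, unvaccinated, keys):
    """Pair each vaccinated record with one unused unvaccinated record
    that has identical values on every matching key (1:1, no reuse)."""
    # Index the unvaccinated pool by its combination of matching factors.
    pool = defaultdict(list)
    for person in unvaccinated:
        pool[tuple(person[k] for k in keys)].append(person)

    pairs, unmatched = [], []
    for person in vaccinated:
        candidates = pool[tuple(person[k] for k in keys)]
        if candidates:
            pairs.append((person, candidates.pop()))
        else:
            unmatched.append(person)  # dropped from the analysis set
    return pairs, unmatched

# Hypothetical toy records, not data from the study.
vaccinated = [
    {"id": "v1", "age_band": "80+", "sex": "F"},
    {"id": "v2", "age_band": "16-39", "sex": "M"},
    {"id": "v3", "age_band": "80+", "sex": "F"},
]
unvaccinated = [
    {"id": "u1", "age_band": "16-39", "sex": "M"},
    {"id": "u2", "age_band": "80+", "sex": "F"},
]

pairs, unmatched = exact_match(vaccinated, unvaccinated, ["age_band", "sex"])
```

In this toy pool there is only one unvaccinated woman aged 80 or older, so the second vaccinated woman in that age band has no partner and falls out of the analysis set.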
For Clalit, they conclude that the real-world vaccine effectiveness was 57% during the case-counting window of 14 to 20 days after the first dose (1D+14 to 1D+20), under the criterion of “symptomatic Covid-19” which is closest to what the vaccine trials were measuring. Surprisingly, they did not mention until later the effectiveness from 7 days after the second dose, which was 94%. This number looks like the VE reported by Pfizer during the vaccine trial, even though the study populations are different.
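As a reminder of what a VE number means, here is a simplified sketch of the usual definition: one minus the ratio of the risk among the vaccinated to the risk among the unvaccinated. The function name and the counts are hypothetical, picked only to keep the arithmetic easy; a real analysis would estimate the risks within a case-counting window with proper survival methods rather than raw proportions.

```python
def vaccine_effectiveness(cases_vax, n_vax, cases_unvax, n_unvax):
    """VE = 1 - (risk among vaccinated / risk among unvaccinated)."""
    risk_vax = cases_vax / n_vax
    risk_unvax = cases_unvax / n_unvax
    return 1 - risk_vax / risk_unvax

# Hypothetical counts, not taken from the Clalit or Mayo Clinic studies.
ve = vaccine_effectiveness(cases_vax=20, n_vax=100_000,
                           cases_unvax=100, n_unvax=100_000)
print(f"VE = {ve:.0%}")
```

With equal group sizes, the risk ratio is just the ratio of case counts, which is why a matched 1:1 design makes the comparison so convenient.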
Footnote: Yet another real-world study contradicts the lobby that has been pushing for single doses or delaying second doses after “re-analyzing” the Pfizer case curve to argue that its vaccine yielded 90% efficacy after a single dose. For what it’s worth, the Mayo Clinic study reported 70% effectiveness in a time frame comparable to Clalit’s; neither number is close to 90%.
In both studies, the day of the first shot determines Day 0 for a vaccinated person, and sets the day of “study enrollment” (“recruitment”) for the matched unvaccinated individual. As I explained before, fixing a “Day 0” for the unvaccinated is a necessary hassle in these observational studies, a problem that does not arise in a randomized clinical trial in which people in the control group are given two placebo injections. As with the Mayo Clinic study, the date of infection of a matched control depends on the identity (and thus Day 0) of the vaccinated individual, when the algorithm selects one out of multiple possible pairings.
Matching Mechanics
The details of the matching process vary between the Clalit and Mayo Clinic studies. The Israeli insurer requires a stricter form of matching (“exact”) while the Mayo Clinic uses the propensity scoring approach. The propensity score distills a set of matching variables into a single dimension, so that the matches are approximate. For example, all matched pairs in the Clalit study must be the same sex. In propensity scoring models, it’s permissible to pair a male with a female – so long as the matched pair is sufficiently similar on other factors.
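The contrast between the two styles of matching can be sketched in a few lines. This is a generic greedy nearest-neighbor match on a single score, which I am using as a stand-in; it is not the Mayo Clinic's actual procedure. The "score" values, the caliper of 0.05 and the function name are assumptions for illustration, and no propensity model is fitted here.

```python
def nearest_score_match(vaccinated, unvaccinated, caliper):
    """Greedy 1:1 matching on a single score; sex is not forced to agree."""
    available = list(unvaccinated)
    pairs = []
    for person in vaccinated:
        if not available:
            break
        # Closest remaining control on the one-dimensional score.
        best = min(available, key=lambda c: abs(c["score"] - person["score"]))
        if abs(best["score"] - person["score"]) <= caliper:
            pairs.append((person["id"], best["id"]))
            available.remove(best)
    return pairs

# Hypothetical records; "score" stands in for a fitted propensity score.
vaccinated = [{"id": "v1", "sex": "M", "score": 0.62}]
unvaccinated = [
    {"id": "u1", "sex": "M", "score": 0.30},
    {"id": "u2", "sex": "F", "score": 0.60},
]

score_pairs = nearest_score_match(vaccinated, unvaccinated, caliper=0.05)

# Exact matching on sex alone, for comparison.
exact_pairs = [
    (v["id"], u["id"])
    for v in vaccinated
    for u in unvaccinated
    if v["sex"] == u["sex"]
]
```

In this toy case, score matching pairs the vaccinated man with the woman whose score is closest, while exact matching on sex picks the man whose score is far away.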
Stricter matching offers better assurance that the matched pairs are sufficiently similar to each other. However, statistical adjustments come with trade-offs. The yield of Clalit’s stricter matching process is lower than for the Mayo Clinic. As far as I can tell, every vaccinated individual at Mayo (excepting exclusions) is successfully matched. By contrast, only four out of ten vaccinated individuals remain in Clalit’s analysis set after the exclusion and matching process. The high drop-off rate presents formidable challenges to generalizing the study’s findings.
“All persons who were newly vaccinated during the period from December 20, 2020 to February 1, 2021, were matched to unvaccinated controls in a 1:1 ratio according to demographic and clinical characteristics.”
I sighed when I read this sentence, which opens the second paragraph of the NEJM paper.
If I am harsh, I’d say the quoted sentence is false. But such careless language routinely appears in journals in this field, including the Lancet, NEJM, BMJ, etc. The sentence implies (wrongly) that the findings of the real-world study apply to all Clalit clients who were vaccinated before February 1, 2021. The media have taken the liberty of extending this result to all vaccinated Israelis since half of Israelis are members of Clalit.
The key data are to be found in the Supplement (Table S2), the top part of which is shown below.
The first column concerns the “full” vaccinated population while the third column describes the “matched” vaccinated population. In the table header, we learn of a drastic drop-off from 1.5 million to 597K. This 60% deletion is in large part due to the incurable imbalance problem I mentioned before (see here). Only two out of every five vaccinated individuals remain in the analysis set.
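The drop-off arithmetic is easy to check. The sketch below uses only the rounded figures quoted above (1.5 million and 597K), so the percentages are approximate.

```python
full_vaccinated = 1_500_000   # "1.5 million", rounded as quoted above
matched_vaccinated = 597_000  # "597K", rounded as quoted above

retained = matched_vaccinated / full_vaccinated
dropped = 1 - retained
print(f"retained {retained:.1%}, dropped {dropped:.1%}")
```

The retained share comes out just under 40 percent, which is where "two out of every five" comes from.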
When reading real-world studies, we must always guard against our natural instinct to assume randomness (see this prior post). Many of the assumptions we make about randomized clinical trials (RCTs) cannot apply. If the analysis set were a random sample of 40 percent of the “full” data, the study’s findings could be generalized to the full dataset. But in an observational study, the analysis subset is decidedly not a random sample of the full population – remember, the point of matching cases to controls is to get rid of unwanted biases that exist in the full population! Therefore, this study’s finding concerns the matchable sub-population (the third column), and not all vaccinated Israelis.
Eventful Exclusions
The research team behind the Clalit study faces an insurmountable problem. The pace of vaccination in Israel is too fast for this type of study. They quickly run out of unvaccinated individuals for matching, especially people in groups prioritized for vaccinations. These include health care workers, adults over 60 years old, and adults with coexisting conditions.
Look back at Table S2. Nine percent of the vaccinated people are 80 years or older, and this age group got trimmed down to only 4% in the analysis set. Meanwhile, a quarter of those vaccinated were under 40 (which is surprising given the priority rules), and this age group was over-weighted to 36% after the matching process. The matching process has introduced a strong youth bias into the real-world study. This must inform our understanding of the VE estimates.
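One way to size up the age shift is to divide each group's share in the matched set by its share in the full vaccinated population. The sketch below uses only the four rounded percentages quoted above; the variable names are mine.

```python
# Shares quoted above for two age groups (full vs. matched vaccinated).
shares = {
    "80+": {"full": 0.09, "matched": 0.04},
    "under 40": {"full": 0.25, "matched": 0.36},
}

# A ratio above 1 means the group is over-represented after matching.
representation = {
    group: s["matched"] / s["full"] for group, s in shares.items()
}

for group, ratio in representation.items():
    print(f"{group}: {ratio:.2f}x")
```

By this measure, the oldest group keeps less than half of its original weight, while the under-40 group gains more than 40 percent.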
Exclusions also cause the analysis set to deviate from the full set of vaccinated customers. Figure 1 in the NEJM paper provides partial information. Some highly consequential exclusions include people living in nursing homes, people confined to their homes, and health care workers. Because these high-risk segments have been omitted, the estimates of vaccine effectiveness are exaggerated if extrapolated to the general population. These exclusions also make any findings on hospitalizations, severe cases and deaths hard to stomach.
Of note, the original study protocol did not contain those specific exclusions. I’m not convinced by the justification of “high internal variability in the probability of exposure or the outcomes.” Another influential exclusion that was subsequently appended removes people who “had a health care interaction within 3 days before the vaccination date”. This one exclusion rule knocks out 13% of the vaccinated individuals (202K), or one-fifth of all exclusions. The justification for this rule is that those interactions “may indicate the start of symptomatic disease and may preclude vaccination.” I don’t know what to make of this since the rule was applied to the vaccinated group (Figure 1), and it appears that the excluded individuals have a high chance of getting infected after vaccination. We don’t know how they define a “health care interaction” or decide on 3 days, which was not pre-specified.
The full list of exclusions is: under 16, not a customer of Clalit, not a “continuous” customer of Clalit, a positive PCR test prior to December 20, 2020, residents of nursing homes, people confined to the home, health care workers, missing home address, missing BMI, missing smoking status, people who had a health care interaction within 3 days before vaccination.
The matching factors include: age, sex, sector (ethnicity), neighborhood, number of influenza vaccines in past 5 years, pregnancy, and number of coexisting conditions.
I’m curious about matching coexisting conditions on the count, which ignores severity or type of condition. This is an "exact" matching criterion that is inexact. An obese individual with healthy lungs can be paired with an individual with normal weight and chronic pulmonary disease, if both have just one coexisting condition (actually one or two but let’s not dwell on that here). I can’t comment further without access to the data; I just want to point this out for any methodology nerds out there.
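The point about counts can be made concrete with a toy sketch. I assume here that the matching key is the raw number of conditions, which is simpler than the binning the study actually uses; the two records and the function name are hypothetical.

```python
def condition_bucket(conditions):
    """Collapse a list of conditions into the count used for matching."""
    return len(conditions)

# Hypothetical individuals, not records from the study.
person_a = {"conditions": ["obesity"]}
person_b = {"conditions": ["chronic pulmonary disease"]}

# Same count, so the two are "exactly" matched on this factor...
same_bucket = (
    condition_bucket(person_a["conditions"])
    == condition_bucket(person_b["conditions"])
)
# ...even though they share no condition at all.
same_conditions = set(person_a["conditions"]) == set(person_b["conditions"])
```

The match succeeds on the count while the underlying conditions differ completely, which is the sense in which this "exact" criterion is inexact.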
***
So far, we've learned that the exclusion and matching process used by the Clalit team generated an analysis set that appears biased toward younger, lower-risk individuals, so we expect the vaccine effectiveness estimates to be overstated. The precise impact of such bias is hard to gauge in real-world studies, which is why researchers prefer randomized trials. The team in Israel faces the unusual challenge of the fast pace of vaccination, leaving them with incurable imbalance. In the next post, I address how they attempted to work around this problem, and the limits to what scientists can do about it. (It's the look behind the movie sets that I promised you.)
These studies demonstrate that the vaccines are effective enough (and also safe enough). As with all real-world observational studies, we must be careful about generalizing the results to larger populations. The point of matching is to define two comparable groups by deleting individuals without appropriate comparators but in doing so, neither group represents the universe from which it is drawn.
Finally, it's important for anyone reading real-world studies to pull up the latest real-world Covid-19 statistics on Israel. Here are the cases and deaths charts from OurWorldinData.org.
The decline in cases has stalled for two weeks, and the daily total is currently 6 times higher than at the start of the current wave. Deaths may also be stalling, and are currently almost three times higher than at the start of the current wave.
Of course, one must not jump to conclusions. The curves are not purely driven by vaccinations.