The Washington Post devoted space to more sloppy medical research last week when it printed an article about vaccinations for various ailments providing protection against Alzheimer's disease/dementia (link).
Without even reading the details, I already knew this was not a randomized clinical trial (RCT) but another observational study based on analyzing databases. Having seen many sloppy observational studies during the Covid-19 pandemic, I was not surprised to find that this study is also fraught with fundamental problems.
The bar for proving cause-effect in observational studies is extremely high for the simple reason that there are too many possible explanations for any observed effects, other than the treatment under study. The RCT setup helps eliminate alternative explanations.
The key study being cited for this connection between vaccines and Alzheimer's is this one. The study uses the propensity score matching methodology to construct pseudo treatment and control groups. The pseudo groups are then compared based on their Alzheimer's case counts. I previously reviewed studies using propensity scoring here.
The researchers performed three separate analyses for three different vaccines, and reported the following results:
- Tdap/Td vaccination (tetanus and diphtheria, with or without pertussis): ~70% reduction in Alzheimer's risk, 7% of vaccinated vs 10% of unvaccinated
- HZ (herpes zoster/Shingles): ~75% reduction in risk, 8% of vaccinated vs 11% of unvaccinated
- pneumococcus: ~73% reduction in risk, 8% of vaccinated vs 11% of unvaccinated
[This research team has another study with similar results for flu vaccines.]
***
The setup of the observational study analysis is flawed, as it treats the three vaccinations as independent events. In reality, these three treatments are connected in two ways: first, people who get one vaccine are more likely to have gotten the others; second, if each vaccine conferred benefits, what is the aggregate benefit of having had all three vaccines?
Think about that for a moment. Let's go from top to bottom down the list of benefits. For someone without Tdap/Td vaccination, the chance of developing Alzheimer's was 10%; if the person had the Tdap/Td vaccine, the risk drops to 7%. What if the person also had the HZ vaccine?
The problem we run into is that the baseline risk for the HZ unvaccinated is shown as 11%. Now, this group includes Tdap/Td-vaccinated people, whom we expected to have a risk of 7% (not 11%). What is the risk reduction of HZ vaccination on top of Tdap/Td? The second listed benefit does not tell us.
If we simply apply the 75% risk reduction, then the new risk is 25%*7% = 1.8%. What did I just do there? I just made an assumption - that the risk reduction is 75% whether or not the person got the Tdap/Td vaccine. In other words, I assumed that the effects of Tdap/Td vaccination and of HZ vaccination on Alzheimer's are independent. Is this assumption justified? Absolutely not.
But let's persist. We now arrive at the third listed item. Again, the baseline risk is said to be 11%, obviously not for the group of people who had both the Tdap/Td and the HZ vaccines. We again assert independence. Now, the aggregate Alzheimer's risk of having had all three vaccines is 27%*1.8% = 0.5%.
Merely by invoking magical independence assumptions, I have shown the preposterous result that vaccinations would effectively wipe out Alzheimer's.
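The chained arithmetic can be sketched in a few lines of Python. This takes the figures exactly as quoted in this post (7% risk with Tdap/Td, then ~75% and ~73% reductions stacked on top) and applies them multiplicatively, which is precisely the independence assumption being criticized. The function name `chain_risk` is made up for illustration.

```python
def chain_risk(start_risk, reductions):
    """Apply each relative risk reduction in turn.

    Multiplying the factors together assumes the effects are independent,
    which is the unjustified assumption discussed in the post.
    """
    risk = start_risk
    for reduction in reductions:
        risk *= 1.0 - reduction
    return risk


# Figures as quoted in the post, not re-derived from the study's data.
risk_tdap = 0.07          # risk for the Tdap/Td vaccinated
hz_reduction = 0.75       # quoted reduction for HZ
pneumo_reduction = 0.73   # quoted reduction for pneumococcus

after_hz = chain_risk(risk_tdap, [hz_reduction])
after_all = chain_risk(risk_tdap, [hz_reduction, pneumo_reduction])

print(f"Tdap/Td + HZ:          {after_hz:.2%}")
print(f"Tdap/Td + HZ + pneumo: {after_all:.2%}")
```

Without intermediate rounding, the two results come to about 1.75% and 0.47%, in line with the rounded 1.8% and 0.5% above.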
***
Can other researchers verify the results? Most certainly not. Under "data availability", the authors said they "cannot make data and study materials available to other investigators due to licensing restriction". They even cheekily added "interested parties can license the CDM by contacting Optum." So they are saying if you want to check our work, you have to pay up.
The remainder of this post is necessarily speculative because I'm not paying up.
***
The researchers made a false start in trying to deal with this issue of multiple, correlated treatments. In constructing the comparison groups, they not only used the typical matching variables such as demographics and comorbidities, but they also included vaccination statuses. For example, for the analysis of the Tdap/Td vaccination, they included vaccination status of HZ and of pneumococcus as matching variables.
For propensity matching, the goal is to ensure that the two groups being compared were as close to identical as possible before follow-up. Therefore, I'm going to assume that the vaccination statuses being measured are those prior to the follow-up period of the study.
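To make the matching step concrete, here is a minimal sketch of one common variant: greedy 1:1 nearest-neighbour matching on a propensity score, with a caliper. I do not know which matching algorithm the researchers actually used, so treat this as an assumption. The scores are made up, and in a real analysis they would come from a model (such as a logistic regression) fitted to the pre-follow-up covariates.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on a precomputed propensity score.

    treated, controls: dicts mapping subject id -> propensity score.
    Each control is used at most once; a treated subject with no control
    within the caliper is left unmatched and dropped from the comparison.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores, for illustration only.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
controls = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}

pairs = greedy_match(treated, controls)
print(pairs)
```

In this toy input, `t3` has no control within the caliper and is left out. The point to notice is that matching only balances whatever went into the score, measured at the time the score was computed.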
After matching, one group is labeled with Tdap/Td vaccination (during follow-up), and the control group is without Tdap/Td vaccination (during follow-up). These two groups are statistically identical on all the matching variables. Then, the researchers counted the number of Alzheimer's cases in each group during the follow-up period. That's how they got 7% and 10%, leading to an effectiveness of ~70%.
What about people getting vaccinated for HZ and/or pneumococcus during the follow-up period? The fact that the two groups started out with similar proportions of vaccinations for those two conditions does not imply that they would continue to maintain balance on such vaccinations! In fact, the balance is most likely lost during follow-up. Why? One key difference between the two groups during follow-up is that one group got the Tdap/Td vaccine while the other didn't. To the extent that there are correlations between vaccinations of different types, we expect the Tdap/Td vaccinated group to over-represent those who got one or both of the other vaccinations.
Given that those other vaccinations are also claimed to reduce Alzheimer's risk, now you have a hopelessly confounded situation. This research study has certainly not proven its thesis. We simply do not know whether the reported effect of Tdap/Td vaccination included the effects of the other two vaccinations.
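A toy calculation shows how the imbalance arises. Every number here is hypothetical. Suppose half the matched population is "vaccine-prone" (an unobserved trait), and that within each type, getting Tdap/Td and getting HZ during follow-up are independent. The two groups can start out perfectly balanced and still end up with very different HZ uptake.

```python
# All probabilities are hypothetical, chosen only to illustrate the mechanism.
P_PRONE = 0.5                              # share of "vaccine-prone" people
P_TDAP = {"prone": 0.30, "other": 0.05}    # P(Tdap/Td during follow-up | type)
P_HZ = {"prone": 0.40, "other": 0.10}      # P(HZ during follow-up | type)


def hz_rate_given_tdap(got_tdap):
    """P(HZ during follow-up | Tdap/Td status), by Bayes' rule over the latent type."""
    weights = {}
    for kind, share in (("prone", P_PRONE), ("other", 1 - P_PRONE)):
        p_tdap = P_TDAP[kind] if got_tdap else 1 - P_TDAP[kind]
        weights[kind] = share * p_tdap
    total = sum(weights.values())
    # Mix the per-type HZ rates by the posterior share of each type.
    return sum(weights[kind] / total * P_HZ[kind] for kind in weights)


hz_in_treated = hz_rate_given_tdap(True)
hz_in_control = hz_rate_given_tdap(False)
print(f"HZ uptake during follow-up: {hz_in_treated:.1%} (Tdap/Td group) "
      f"vs {hz_in_control:.1%} (control group)")
```

Under these made-up inputs, roughly 36% of the Tdap/Td group also gets the HZ vaccine during follow-up, against roughly 23% of the control group. Any benefit of HZ vaccination would then be credited to Tdap/Td.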
***
Can we use the vaccination status for the entire study period during propensity matching, rather than just the vaccination status prior to the follow-up period?
This gets us into a different kind of methodological hell. The design of the comparison groups now requires data from the "future". In addition, these two variables from the future are believed to be highly correlated with the study's outcomes. So effectively, we would have used future knowledge of the study's outcome in selecting subjects into comparison groups. Definitely not a good practice.
***
Table 1 and Supplementary Table 2 provide some more details about the counts of people in the study. It appears that 1.6 million people were under consideration for matching. Under Table 1, prior to matching, the researchers reported 123K people who had the Tdap/Td vaccination during the follow-up period. In addition, they said 20K+3K = 23K people had HZ vaccination (aggregate of both Tdap/Td vaccinated and unvaccinated).
But then in Supplementary Table 2, the number of people with HZ vaccination during follow-up was 212K. This number is way larger than the 23K from Table 1. This is why I interpreted the 23K as those who got the HZ vaccination prior to the follow-up period.
Similar patterns hold for all three tables.
***
TL;DR
Causal inference is very hard. I have not seen evidence that medical researchers have the right skills to do it well.