The unexpected turns in the pandemic have exposed which data analysts are serious and which aren't. At the current moment, anyone who proclaims that vaccines are the cause of the dramatic receding of the coronavirus has proven not to be serious. These are analysts who embrace every nugget of confirming data while ignoring inconvenient observations.
What are some of these inconvenient data points?
- The number of cases has come down drastically in many places, including many that have not yet rolled out their vaccination campaigns (e.g. India).
- The level of decline in many places far exceeds what one could reasonably expect from the data observed in the vaccine trials. In almost every place, the vast majority of the population has not been vaccinated.
- Some organizations have issued deceptive analyses which the media should have flagged as potential misinformation. The much-reported analysis by the largest healthcare provider in Israel took what they defined as effectiveness, measured from the second shot (2D) to 2D+7 days, and compared it to the VE number from the Pfizer trial, which counted cases from 2D+7 days to a median of two months later. In other words, the case-counting windows of those two numbers do not overlap for even a single day! Talk to us in a month or so.
- If these providers are tracking all those they have vaccinated over time, they should publish all the data (from the first shot) as opposed to selecting a few days. That way, we can assess how different the real-world experience is from what happened during the trial. (To be fair, I expect those two to differ because we're not measuring the same thing. The idea that those two should match perfectly is bonkers.)
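To make the non-overlap in the Israeli comparison concrete, here is a minimal sketch. It assumes each case-counting window is a half-open range of days relative to the second shot, and it uses 60 days as a stand-in for "a median of two months"; both choices are my own simplifications for illustration, not details from either analysis.

```python
def overlap_days(a, b):
    """Number of whole days shared by two half-open windows [start, end)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start)

# Days are counted relative to the second shot (2D = day 0).
provider_window = (0, 7)      # 2D up to, but not including, 2D+7
trial_window = (7, 7 + 60)    # 2D+7 onward; 60 days is an assumed stand-in

print(overlap_days(provider_window, trial_window))  # 0
```

Under these assumptions the two windows share no days at all, so the two effectiveness numbers describe different periods.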
It may be hard to imagine how complicated it is to get an estimate of real-world vaccine effectiveness. In a clinical trial, we have access to a cohort of people who are given two placebo shots and are being followed up alongside the vaccinated cohort. We don't have a placebo cohort in real life.
An unvaccinated person is not the same as someone who's been given saline shots. It's long been known that the placebo effect is real - you can give people sugar pills and some of them will get better. This means any real-world measurement is over-optimistic.
***
Now I'm taking you with me as I jump into the deep end.
Say we've identified a group of people who are unvaccinated. We even go one step further and match these people's characteristics to those who have been vaccinated. This means the age-group distribution is statistically the same across both vaccinated and unvaccinated groups, and so on. Remember that the Israeli healthcare provider counted cases between 2D (second shot) and 2D+7.
Here's the catch. When is 2D in the unvaccinated group? You know what, we cannot define 2D because none of these people got any shots! In fact, only those who are symptomatic and get tested will show up at a facility - the rest are just people in databases with a "vaccination = No" flag. Take the unvaccinated person who just tested positive. Where in this timeline did the person test positive? How do we know the timing of this case is within that 2D to 2D+7 window?
The answer is we don't. It is not even possible to define this. Depending on what we define as Day 0, the same case can be counted in one analysis and left out of another. This is yet another reason why we need to look at longer time windows. Any analysis based on short time windows is junk.
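A small sketch shows how arbitrary this is. The dates below are hypothetical, and `in_window` is a helper I made up for illustration: it takes one unvaccinated person's positive test and asks whether it lands in the 2D to 2D+7 window under three different guesses for when "2D" would have been.

```python
from datetime import date, timedelta

def in_window(case_date, second_dose_date, window_days=7):
    """True if the case falls in [2D, 2D + window_days)."""
    start = second_dose_date
    end = second_dose_date + timedelta(days=window_days)
    return start <= case_date < end

case = date(2021, 2, 10)  # hypothetical positive test of an unvaccinated person

# Three equally arbitrary choices of a pretend "2D" for this person.
for pretend_2d in [date(2021, 1, 25), date(2021, 2, 5), date(2021, 2, 12)]:
    print(pretend_2d, in_window(case, pretend_2d))
```

The same case is counted under the second choice and dropped under the other two, even though nothing about the person has changed.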
***
One way to get around this issue is to match the timing. Take each vaccinated individual in the analysis population, find a "double" via demographic matching in the unvaccinated group, exclude anyone who has reported symptoms by Day 0 as defined by the vaccinated group, and then mark off the corresponding 2D to 2D+7 window for each matched individual. As per protocol, if the unvaccinated double gets infected before 2D, that case cannot count. (If you don't narrow the case-counting window for the unvaccinated, then the analysis is hopelessly favorable to the vaccinated group!)
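Here is a rough sketch of that matching idea, under heavy simplifying assumptions: matching is on age group only, pairing is greedy (first available double wins), and all names, fields and dates are hypothetical. A real analysis would match on many more characteristics and would need to justify every one of these choices.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class Vaccinated:
    pid: str
    age_group: str
    day0: date          # start of follow-up for this person (assumed field)
    second_dose: date   # 2D

@dataclass
class Unvaccinated:
    pid: str
    age_group: str
    case_date: Optional[date] = None  # date of reported case, if any

def match_and_count(vaccinated, unvaccinated, window_days=7):
    """Pair each vaccinated person with an unused unvaccinated double and
    count the double's case only if it falls in the borrowed 2D..2D+7 window."""
    used = set()
    results = []
    for v in vaccinated:
        for u in unvaccinated:
            if u.pid in used or u.age_group != v.age_group:
                continue
            # Exclude doubles who already reported a case by the vaccinated person's Day 0.
            if u.case_date is not None and u.case_date <= v.day0:
                continue
            used.add(u.pid)
            start = v.second_dose
            end = v.second_dose + timedelta(days=window_days)
            counted = u.case_date is not None and start <= u.case_date < end
            results.append((v.pid, u.pid, counted))
            break
    return results

vaccinated = [
    Vaccinated("v1", "60+", date(2021, 1, 1), date(2021, 1, 22)),
    Vaccinated("v2", "60+", date(2021, 1, 5), date(2021, 1, 26)),
]
unvaccinated = [
    Unvaccinated("u1", "60+", date(2020, 12, 30)),  # case before Day 0: excluded
    Unvaccinated("u2", "60+", date(2021, 1, 10)),   # case before 2D: matched, not counted
    Unvaccinated("u3", "60+", date(2021, 1, 28)),   # case inside v2's window: counted
]

for row in match_and_count(vaccinated, unvaccinated):
    print(row)
```

In this toy data, `u2` is infected before the borrowed 2D date, so the case is left out, just as an early case would be left out for the vaccinated person.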
The key lesson is that if you don't have experimental data, the data analysis gets a lot more complex (more stimulating, but also fraught with risk). The analyst must release full details of the analysis; otherwise, it's impossible to evaluate its merit.