We're drowning in a quicksand of bad-quality data during this pandemic, and frustratingly, the situation is getting worse as we enter the second year. Measuring the real-world impact of vaccines was always going to be a challenge because of the mass vaccination initiatives. We then made our lives much harder by (a) destroying the placebo groups in the original randomized clinical trials, fossilizing them at 4 months of follow-up (see this post); and (b) applying different rules of masking, testing, hospitalizing, attributing causes of death, etc. to vaccinated and unvaccinated subgroups, literally injecting massive biases into observational data.
It comes as an unexpected gift to find a study that takes on these challenges seriously. Today, I review the preprint of the REACT-1 Study from the U.K. There is a large team of people involved, many of whom are from Imperial College.
***
In this study, the researchers collect swabs from randomly selected people from all over England. The study is ongoing, with monthly cohorts. The report I'm reading covers Round 13, which was conducted from mid-June to mid-July. This cohort is monumental because Round 13 spanned a critical point in the case curve in the UK.

In the period leading up to Round 13, the U.K. government was congratulating itself on the hugely successful vaccination campaign, about to declare mission accomplished. But by July, the virus had returned with renewed vigor. In Round 13, the researchers analyzed about 100K swabs, with a positivity rate of 0.6%. That is more than four times the positivity rate in Round 12.
The positivity rate found in the REACT-1 Study - especially after adjustments and weighting - can be generalized to the population of England. That's because the study utilizes a random sample. The swab tests find not just symptomatic but also asymptomatic Covid-19 cases. This study gives much better information than most real-world studies - which usually tap into a database of positive test results. A severe problem with those studies is selection bias. Vaccinated people, people with mild symptoms or disease, and people who live in lower-exposure areas are less likely to get tested, so their cases are under-counted. (That said, the "good" study still has to deal with biases, which I'll get to later in this post.)
***
There are many things we can learn from the REACT-1 study. The first thing is that the chance of anyone catching the virus - regardless of vaccination status - is low. Only 6 out of 1,000 people in England were actively sick with Covid-19 during July. The chance of getting severe Covid-19, being admitted to hospital, or dying from Covid-19 is a fraction of that number. The danger comes from the spread of the virus, so that a large base of the infected produces a large number of deaths. (Even at its peak in England, in Round 8, the prevalence was 1.6%.)
What about the 994 people out of 1,000 who did not get sick during Round 13? We are frequently misled into thinking that the fully vaccinated subset were protected by the vaccine. The fallacy is made clear when you ask about those who are unvaccinated. One can't get infected if one isn't exposed. A lot of these people - vaccinated or not - did not get exposed to the virus during the study period. Most of the calculations you see out there implicitly assume that everyone is exposed every month, month after month.
I am not saying none of those 994 people got exposed. Some proportion of those were not infected because they did not get exposed. This is a missing data problem that we haven't grappled with yet. If they do get exposed, it's likely that the vaccinated ones have better outcomes; better may mean 30-50% reduction in chance of getting sick; we now know it's not anywhere close to 90%.
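To make the exposure point concrete, here is a minimal sketch of the bookkeeping. The 6-per-1,000 figure comes from the study; the exposure fraction is a made-up input (nobody has measured it), and the function name is mine, not anything from the preprint.

```python
def split_uninfected(infected_per_1000: float, exposure_fraction: float,
                     cohort: int = 1000) -> dict:
    """Split a cohort into infected, exposed-but-uninfected, and never-exposed.

    exposure_fraction is a HYPOTHETICAL input: the share of the cohort that
    actually encountered the virus during the study window.
    """
    infected = infected_per_1000 * cohort / 1000
    exposed = exposure_fraction * cohort
    if exposed < infected:
        raise ValueError("exposure fraction too small to account for the infections")
    return {
        "infected": infected,
        "exposed_not_infected": exposed - infected,
        "never_exposed": cohort - exposed,
    }


# 6 per 1,000 is from the study; the 5% exposure is an assumption for illustration only.
print(split_uninfected(infected_per_1000=6, exposure_fraction=0.05))
```

Under that assumed 5% exposure, 950 of the 994 uninfected people were never in a position to be protected by anything, and only 44 tell us something about protection. Change the assumed fraction and the split changes with it, which is exactly why the missing exposure data matters.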
***
In "explaining" the surge in cases in England in July, the study's authors pointed the finger at unvaccinated 13-17 year olds. They drew this conclusion by comparing the percentage increase in infection rate between Rounds 12 and 13 by age group. See this chart (with my edits):

The study estimates that the prevalence of Covid-19 in the 13-17 age group jumped from 0.2% to 1.6%, which is a nine-fold surge, compared to an aggregate surge of about 4 times.
Are the teenagers responsible for the entire surge? To answer this question, I looked up the age distribution of England's population (shown on the chart in the green boxes).
From this analysis, I found that 13-17 year olds accounted for a 15% share of cases in this period of time. The bulk (60%) of cases affected those between 18 and 64.
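The share-of-cases arithmetic is simple enough to sketch: a group's share of cases is its prevalence times its share of the population, divided by the overall prevalence. The snippet below is a back-of-envelope version using only the rounded figures quoted in this post (1.6%, 0.6%, 15%); it is not the exact calculation behind the chart, and the function names are my own.

```python
def case_share(group_prevalence: float, group_pop_share: float,
               overall_prevalence: float) -> float:
    """Fraction of all cases that fall in one group."""
    return group_prevalence * group_pop_share / overall_prevalence


def implied_pop_share(share_of_cases: float, group_prevalence: float,
                      overall_prevalence: float) -> float:
    """Invert case_share: the population share consistent with a given case share."""
    return share_of_cases * overall_prevalence / group_prevalence


# Rounded figures from the post: 13-17 prevalence 1.6%, overall 0.6%, case share 15%.
teen_pop_share = implied_pop_share(0.15, 0.016, 0.006)
print(f"implied population share of 13-17 year olds: {teen_pop_share:.1%}")
```

In other words, a group making up somewhere between 5 and 6 percent of the population produced about 15 percent of the cases: over-represented, but nowhere near the whole surge.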

The above analysis contains one hopeful insight. The 65+ age group, which enjoys the highest vaccination rates, is getting some measure of protection, as indicated by the lower share of infection. This goes a long way to explaining why deaths in England have not increased as much as infections - younger people are getting infected, and they tend to recover more readily.
(Statements about deaths must be interpreted with great caution. This study is typical: the cutoff date of Round 13 was in the middle of the surge of cases, which means the analysis covered a period of time when the deaths arising from the surge of cases had not occurred yet. Researchers never return to update such analyses, and so we are always dealing with over-optimistic, premature estimates of effectiveness against deaths.)
***
Next, I made a similar analysis breaking the study population down by vaccination status: fully vaccinated, partly vaccinated and unvaccinated.
This leads to a surprising finding that the surge in infections was even stronger among those who were vaccinated (at least one dose) than those who were unvaccinated. This problem illustrates the difficulty of observational datasets. There are so many possible explanations for this. My own favorite is that many vaccinated people have stopped taking precautions.

***
Another nice feature of the REACT-1 preprint is the careful description of how they interpreted test results. It's not a straightforward science - something I explained in Chapter 4 of Numbers Rule Your World (link) using testing for steroids and HGH as a case study. Someone has to draw a line on some continuous scale to decide who's positive and who's negative.
These UK researchers described their positivity criteria as "both E and N gene targets were detected, or if N gene was detected with CT value < 37". CT is cycle threshold: the lower the CT value, the higher the viral load.
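Written as a decision rule, the quoted criterion looks something like the sketch below. The data representation is my assumption (a CT value of None stands for "target not detected"), and the function is an illustration of the quoted sentence, not the lab's actual software.

```python
from typing import Optional

CT_THRESHOLD = 37.0  # from the quoted criterion: N gene alone counts only if CT < 37


def is_positive(e_ct: Optional[float], n_ct: Optional[float]) -> bool:
    """Apply the quoted rule; None means the gene target was not detected."""
    e_detected = e_ct is not None
    n_detected = n_ct is not None
    if e_detected and n_detected:
        # Both targets detected: positive regardless of CT value.
        return True
    # Otherwise only an N-gene detection below the CT threshold counts.
    return n_detected and n_ct < CT_THRESHOLD


print(is_positive(e_ct=30.0, n_ct=38.5))  # both detected
print(is_positive(e_ct=None, n_ct=36.0))  # N only, below threshold
print(is_positive(e_ct=None, n_ct=37.0))  # N only, at threshold
```

Notice that a sample with only the E gene detected is never positive under this rule, and that moving the threshold from 37 to some other number would reclassify every N-only sample in between.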
I highly recommend reading this section (pp. 10-11) of the paper. You will learn that to figure out what positivity threshold to use, the scientists need to procure positive and negative controls (samples known with 100% certainty to be positive or negative), these controls may be synthetically created, experimenters may be blinded or unblinded, testing analysis may be done externally or internally, multiple experiments by multiple labs may be required, these results may not agree, etc. In other words, scientists don't wave their hands and write down CT < 37; there is method to the madness.
Unfortunately, there does not seem to be a global standard for what test to use and what positivity threshold to apply. This is yet another aspect of the data quality problem I mentioned up top.
***
While the selected participants represent the whole of England, the study cannot escape selection bias in the form of non-response. The researchers reported that response rates (of sending back the swabs) have declined steadily and were at 12% in Round 13. That's a very respectable number.
What's concerning is that the responders were clearly different from the non-responders. Response rates for younger people and people living in deprived neighborhoods were only in the 5% range. Those, of course, are also the least vaccinated subgroups.
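Here is a toy illustration of why uneven response rates matter and what reweighting does about it. Every number in the example is invented for illustration (two strata, their population shares, response rates and prevalences); none of it comes from the preprint, and the researchers' actual weighting scheme is more elaborate than this.

```python
def naive_and_weighted(strata: list) -> tuple:
    """Return (prevalence among responders, population-weighted prevalence).

    Each stratum is a dict with pop_share, response_rate and prevalence.
    """
    # Responders contributed by each stratum, per unit of population.
    responders = [s["pop_share"] * s["response_rate"] for s in strata]
    total = sum(responders)
    naive = sum(r * s["prevalence"] for r, s in zip(responders, strata)) / total
    # Reweight each stratum back to its true share of the population.
    weighted = sum(s["pop_share"] * s["prevalence"] for s in strata)
    return naive, weighted


# HYPOTHETICAL strata, made up for this example.
strata = [
    {"name": "low-response group", "pop_share": 0.30, "response_rate": 0.05, "prevalence": 0.012},
    {"name": "everyone else", "pop_share": 0.70, "response_rate": 0.15, "prevalence": 0.004},
]
naive, weighted = naive_and_weighted(strata)
print(f"naive: {naive:.2%}  weighted: {weighted:.2%}")
```

With these made-up inputs, the raw responder prevalence comes out around 0.50% while the reweighted figure is 0.64%, because the group that rarely responds is also the group with more infections. Reweighting only repairs this if responders within each group resemble the non-responders in that group, which is an assumption, not a fact.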
As if to prove the point that biases pervade every corner of real-world data, the researchers also reported finding selection bias among those who gave them permission to link their NHS records. Ex ante, it is not obvious why there should be a difference in infection rates between those who gave consent and those who didn't.
It turns out vaccinated people who gave consent are more likely to get infected than vaccinated people who didn't give consent. The reverse is true for unvaccinated people.
(Linking is used to conduct off-book accounting as described in this post. This process disappears vaccinated and infected people because of some arbitrary case-counting window - which cannot be applied to the unvaccinated. In the case of England, the 2D+14 window means that no cases are counted on the vaccinated side until at least 14 weeks after the first shot because of a long delay between doses, based on embarrassingly wrong analyses by advisors to the UK government. This gimmick puts our most creative corporate accountants to shame.)
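The 14-week figure is just date arithmetic, sketched below. The 12-week gap between doses is my assumption (it is the gap that makes 2D+14 add up to 14 weeks), and the first-dose date is a hypothetical example.

```python
from datetime import date, timedelta


def first_counted_day(first_dose: date, dose_gap_weeks: int,
                      window_days: int = 14) -> date:
    """First day on which a case is attributed to the fully vaccinated group
    under a "second dose + window_days" counting rule."""
    second_dose = first_dose + timedelta(weeks=dose_gap_weeks)
    return second_dose + timedelta(days=window_days)


first_dose = date(2021, 3, 1)  # hypothetical first-dose date
counted_from = first_counted_day(first_dose, dose_gap_weeks=12)  # assumed 12-week gap
weeks_uncounted = (counted_from - first_dose).days / 7
print(counted_from, weeks_uncounted)
```

Any infection between the first shot and that date does not land in the fully vaccinated column, while an unvaccinated person has no equivalent window.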
***
We are drowning in bad-quality data. Don't trust a study just because it got published. Be discriminating, and value those researchers who are making an honest effort at doing good analyses. It takes a lot of work to properly handle real-world data, so let's take a moment to appreciate those who tackle this task seriously.