The vaccine trials are slipping into parody, with scientific results being dangled before a hungry press. The news is clearly great, but let's not whip up irrational exuberance over magical precision.
The most responsible thing to say at this point is that we have multiple highly promising vaccine candidates with self-reported short-term efficacy of up to about 90 percent; if a vast majority of the world gets vaccinated, we may successfully defeat the coronavirus.
***
When reading vaccine trial results, bear in mind the following:
- No clinical trial can yield an efficacy estimate as precise as 90% or 95% or 94.5%. When a statistician says 90%, we mean 90% plus or minus a margin of error. Highly precise estimates require tiny margins of error. (See also the end note.)
- The precision of a statistical estimate depends on the sample size: the larger the sample size, the more precise the estimate. In these accelerated vaccine trials, decisions are being made based on only 100-150 cases, so the margins of error are not tiny (see the sketch after this list).
- The difference between 90% and 95% is meaningless. The difference between 94.5% and 95% is especially pointless; it may win PR points, but it suggests a lack of seriousness to statisticians.
- Participants in any clinical trial do not constitute a random selection from the entire population - there are self-selection biases and exclusion criteria. The trial results are then extrapolated to the entire population, and that's another reason the efficacy estimates are imprecise.
- Subgroup analysis is shaky at this stage. When filtered to subgroups, the sample sizes get even smaller, and the error bars around the reported numbers get wider. For example, Pfizer stated that efficacy for people over 65 was 94%, essentially the same as the average efficacy for all age groups. But the error bar for the over-65 group is much wider than the error bar for the average participant.
- Press releases may highlight particular effectiveness for some subgroup; remember that this also means the vaccine is less effective for some other subgroup. AstraZeneca, for example, disclosed 90% efficacy for the 1.5-dose treatment. Since the average efficacy was 70%, the efficacy for the 2-dose treatment must be lower than 70% (with the 1.5-dose group making up roughly a quarter of the vaccine arm, 0.25 × 90% + 0.75 × x ≈ 70% implies x ≈ 63%).
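To make the margin-of-error point concrete, here is a minimal sketch - my own back-of-the-envelope calculation, not any trial's actual analysis. With 1:1 randomization, the share of cases landing in the vaccine arm behaves like a binomial proportion, and efficacy is VE = 1 - theta/(1 - theta), where theta is that share. A flat-prior Beta interval on theta, mapped to the VE scale, shows how the interval widens as the case count shrinks:

```python
from scipy.stats import beta

def ve_interval(vax_cases, total_cases, level=0.95):
    """Flat-prior Beta interval for the vaccine share of cases,
    mapped to the vaccine-efficacy (VE) scale."""
    pbo_cases = total_cases - vax_cases
    post = beta(vax_cases + 1, pbo_cases + 1)  # posterior for theta
    lo, hi = post.interval(level)              # interval on theta
    to_ve = lambda t: 1 - t / (1 - t)          # VE = 1 - theta/(1-theta)
    return to_ve(hi), to_ve(lo)                # high theta -> low VE

# Roughly the same ~90% point estimate at three very different case counts:
for v, t in [(4, 47), (8, 94), (80, 940)]:
    theta = v / t
    lo, hi = ve_interval(v, t)
    print(f"{v}/{t} cases: VE estimate {1 - theta/(1-theta):.0%}, "
          f"95% interval ({lo:.0%}, {hi:.0%})")
```

The point estimate barely moves, but the interval around a subgroup-sized case count is dramatically wider - which is the trouble with reading a number like 94.5% as precise.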
All three vaccines - Pfizer, Moderna, AstraZeneca - look promising from the short-term efficacy angle (even 60% isn't a bad result). It's hard to say more until the scientific reports are published. Efficacy and other health metrics are not the only considerations: cost, ease of transportation and storage, schedules of availability, quantities, and other very practical matters must also be evaluated.
***
The AstraZeneca result that just came out contains a mystery. The design protocol does not include a 1.5-dose treatment, so where did that come from? What the researchers appear to be saying is that an execution error - affecting a quarter of participants receiving the vaccine shots - resulted in a lower-than-expected first dose, so the vaccine arm was split into two subgroups for analysis.
Most reporters have not noticed this aberration, or if they did, they didn't think it was material. The accident is immaterial only if the execution error affected a random subset of participants. In a clinical trial, we use randomization to ensure the two arms differ only by whether participants receive the vaccine or the placebo. If a third arm is added, it too must be randomly assigned. In the AstraZeneca/Oxford trial, we know the vaccine vs. placebo assignment was explicitly randomized, but not so the split between 1.5 and 2 doses within the vaccine arm.
Execution errors sometimes have systematic causes, and may not be random. It's always risky to assume randomness, as I explained in this post; the toy simulation below shows how a non-random error can distort a subgroup comparison.
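All numbers in this simulation are invented for illustration: suppose the dosing error disproportionately hit younger (hence lower-risk) participants. Both doses have identical true efficacy, yet the half-dose subgroup looks better:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000  # participants per arm (hypothetical)

# Randomization balances age across the vaccine and placebo arms.
age_vax = rng.uniform(18, 80, n)
age_pbo = rng.uniform(18, 80, n)

def risk(age):
    """Assumed baseline infection risk, rising with age."""
    return 0.01 + 0.02 * (age - 18) / 62

true_ve = 0.70  # same true risk reduction for both doses

# Non-random execution error: the quarter of the vaccine arm that
# got the lower first dose happens to be the youngest participants.
order = np.argsort(age_vax)
half_dose = np.zeros(n, dtype=bool)
half_dose[order[: n // 4]] = True

infected_vax = rng.random(n) < risk(age_vax) * (1 - true_ve)
infected_pbo = rng.random(n) < risk(age_pbo)

pbo_rate = infected_pbo.mean()
for label, mask in [("1.5 doses", half_dose), ("2 doses", ~half_dose)]:
    apparent_ve = 1 - infected_vax[mask].mean() / pbo_rate
    print(f"{label}: apparent efficacy {apparent_ve:.0%}")
# Both doses truly cut risk by 70%, but the 1.5-dose subgroup appears
# better because its members were lower-risk to begin with.
```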
What will happen is that the researchers will tell a story in the scientific report explaining the execution error, and argue why they can assume the errors were randomly scattered within the vaccine arm. This will be hard to prove. One common analysis is to show that the demographic and prior-health characteristics of the group that received 1.5 doses are statistically indistinguishable from those of the group that received 2 doses. This is necessary but not sufficient; nevertheless, it will be considered good enough.
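For what it's worth, here is a sketch of that kind of balance check - hypothetical data and column names, not AstraZeneca's actual analysis - using standardized mean differences (SMD), with |SMD| < 0.1 a common rule of thumb for balance:

```python
import numpy as np
import pandas as pd

def smd(x, y):
    """Standardized mean difference between two groups."""
    pooled_sd = np.sqrt((x.var(ddof=1) + y.var(ddof=1)) / 2)
    return (x.mean() - y.mean()) / pooled_sd

# Invented baseline characteristics for the two dose subgroups.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "dose": rng.choice(["1.5", "2"], size=1000, p=[0.25, 0.75]),
    "age": rng.normal(45, 12, 1000),
    "female": rng.random(1000) < 0.5,
    "comorbidity": rng.random(1000) < 0.2,
})
groups = df.groupby("dose")
for col in ["age", "female", "comorbidity"]:
    x = groups.get_group("1.5")[col].astype(float)
    y = groups.get_group("2")[col].astype(float)
    print(f"{col}: SMD = {smd(x, y):+.3f}")
```

Small SMDs on measured traits would support the randomness story, but unmeasured differences can still lurk - which is exactly why the check is necessary but not sufficient.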
For those of us who run lots of tests, execution errors are our worst nightmare. What happens now is that you have to trust the people instead of trusting the (pre-registered) process: you have to trust their explanations of the errors, and the analyses justifying the assumption of random error.
I'm also curious how cutting the sample size by 75% - the 1.5-dose subgroup is only a quarter of the vaccine arm - has not affected the statistical significance of the result. That will come out in the scientific report.
That said, the average efficacy of about 70 percent is still a good result. I don't want you to think AstraZeneca's vaccine is bad. In fact, it has several advantages in terms of cost, storage, and so on.
P.S. The relationship between uncertainty and precision may not feel intuitive. The following probability curve is reproduced from my prior post. It shows the probability of different values of vaccine efficacy (the "posterior," in the sense of after looking at the data from the clinical trial).
The peak of this curve shows that the most likely value of vaccine efficacy is about 91%, based on 8 cases in the vaccine group out of a total of 94 cases.
The curve also shows that we believe the vaccine efficacy lies between (roughly) 70% and 100%. Since the total probability of all possible values is 1, the total area under the entire curve is exactly 1 - and virtually all of that area sits between 70% and 100%.
If you draw a vertical line under the peak at 91%, the area of that line is zero (height × 0 = 0). In other words, the probability of any single precise value is zero: a statement such as "efficacy is exactly 91%" has zero probability. But we can make other statements, e.g. "efficacy is between 90% and 92%"; the probability of this statement is the area under the curve between 90% and 92%, which is positive.
The wider the margin of error, the larger the area under the curve, and the higher the probability.
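For the curious, here is a minimal sketch of those area calculations, assuming the curve is the flat-prior Beta posterior implied by 8 vaccine cases out of 94 total (my reconstruction, not the exact code behind the figure):

```python
from scipy.stats import beta

# theta = share of cases in the vaccine arm; 8 of 94 cases with a
# flat prior gives theta ~ Beta(9, 87), and VE = 1 - theta/(1-theta).
post = beta(9, 87)
to_theta = lambda ve: (1 - ve) / (2 - ve)  # invert VE -> theta

mode_theta = 8 / 94  # mode of Beta(9, 87): (9-1)/(9+87-2)
print(f"Most likely VE: {1 - mode_theta / (1 - mode_theta):.1%}")  # ~90.7%

# A single value has zero width, hence zero area and zero probability.
# Intervals have positive area; note that higher VE maps to lower theta.
for lo, hi in [(0.90, 0.92), (0.85, 0.95)]:
    p = post.cdf(to_theta(lo)) - post.cdf(to_theta(hi))
    print(f"P({lo:.0%} <= VE <= {hi:.0%}) = {p:.2f}")
```

Widening the interval from (90%, 92%) to (85%, 95%) sweeps in more area, hence more probability: that is the trade-off between precision and certainty.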
We can debate whether Covid-19 is 1 or 2 or 3 times as dangerous as the flu.
But a vaccine is not a big deal in terms of saving lives, unlike, say, getting hospitals to do as many early cancer tests as before.
What it is essential for is allowing Governments to walk back all the daft and damaging restrictions of the past year, getting Economies back to normal, and getting lifestyles back to normal.
It is the excuse Governments (and media) need to undo all that damage without having to say Sorry.
Posted by: Michael Droy | 11/24/2020 at 09:31 AM
The difference between the vaccine and placebo is so great that the p value is going to be very small, whether they include or exclude some patients. The standard analysis in a clinical trial is intention to treat, so AstraZeneca should be analysing as randomised. It seems that it will be very difficult to determine any difference between the vaccines, with usage coming down to cost, side effects, delivery costs and availability.
Posted by: Ken | 11/28/2020 at 04:08 AM
Ken: I have not checked the AstraZeneca protocol, but the Moderna protocol specifically states that the per-protocol analysis is considered primary, with the ITT analysis done to check a box. The language of the Pfizer protocol is less clear, but I believe they will also primarily use per-protocol. Analyzing as randomized is a good thing, but not as good when it is known that errors affected the treatment dosage for a subset of the participants. Exclusions are important to look at because the numbers here are such that the difference between stopping for efficacy and stopping for futility is about 10 cases. Unfortunately, outsiders will probably not know whether any of the exclusions are infections. I hope they report whether anyone got infected between doses, because those participants are excluded from the second dose.
Posted by: junkcharts | 11/30/2020 at 11:38 AM