Over the weekend, Buzzfeed News dropped a monumental bombshell, reporting on a whistleblower complaint filed with Stanford University about the infamous "Stanford study" - which signaled the start of the antibody testing movement in the United States. I previously wrote about that study here and here. I also explained why antibody testing isn't very useful.
The press coverage of the original result was distasteful from the start. The narrative pushed by the study's co-authors, hitherto respected scholars, was that the number of infections was "50 to 85 times higher" than the number of reported cases, and thus the risk of death from Covid-19 was drastically lower than previously estimated - "about on a par with the flu," said one of the co-authors at a press conference. Some politicians and businesspeople immediately seized upon this finding to advance their "herd immunity" program, which is a nice way of saying let the old (but not the rich) folks die.
Even beyond the questionable science of the study, the narrative was misguided. Little do people realize that only 50 of the more than 3,000 participants tested positive for antibodies in that experiment. Instead of reporting that only 1 to 3 percent of the participants tested positive, the journalists went with the researchers' framing of the number as 50 to 85 times higher than reported cases. This is a common trick. It's like saying a 20-percent discount offer resulted in a 50-percent increase in sales, when sales went up from 4 to 6 and 100 sales are needed to break even on the cost of the offer. The result is statistically significant but practically meaningless.
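To make the framing trick concrete, here is a minimal sketch in Python. The 50 positives out of roughly 3,300 tested echo the study's raw numbers; the reported-case counts and the discount figures are hypothetical placeholders.

```python
# A sketch of the framing trick. The 50 positives out of ~3,300 tested
# echo the study; the case counts below are hypothetical placeholders.
positives, tested = 50, 3300
raw_rate = positives / tested
print(f"Raw positive rate: {raw_rate:.1%}")              # ~1.5%

# The same number, reframed as a multiple of reported cases:
reported_cases, population = 1000, 2_000_000             # hypothetical
case_rate = reported_cases / population
print(f"Multiplier framing: {raw_rate / case_rate:.0f}x reported cases")

# The discount analogy: a 50% jump in sales, still far below breakeven.
sales_before, sales_after, breakeven = 4, 6, 100
jump = (sales_after - sales_before) / sales_before
print(f"Sales up {jump:.0%}, but {breakeven - sales_after} short of breakeven")
```

Same data, three very different headlines: a tiny raw rate, an impressive multiplier, and a sales "surge" that never pays for itself.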
Using this same tactic, one can say that the level of prevalence found in the Stanford study is at most one-twentieth of what is required to achieve so-called herd immunity. By the way, no scientist is willing to go on air to confirm that having these antibodies confers meaningful "immunity," which is saying a lot.
The argument for lowering the mortality rate is also overly simplistic. Epidemiologists have long distinguished between the case fatality rate and the infection fatality rate, with the understanding that cases are an undercount of infections; this study is an attempt to estimate the latter. Further, a higher proportion of infections has two implications. First, the transmission rate is much higher, meaning that the reproduction number (the average number of people each infected person infects) must also be higher than previously believed, which ironically raises the bar for reaching herd immunity. Second, the gap between cases and infections reveals the failure of diagnostic testing, which suppresses both case counts and death counts. Adjusting the denominator (cases) but not the numerator (deaths) clearly produces a mortality rate that is too low.
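A quick sketch of both points, with all numbers hypothetical: the classic herd immunity threshold is 1 - 1/R0, so a larger reproduction number raises the bar; and inflating the denominator without touching the numerator mechanically deflates the fatality rate.

```python
# Classic herd immunity threshold: 1 - 1/R0. A higher reproduction
# number raises the threshold. (R0 values are hypothetical.)
for r0 in (2.0, 3.0, 5.0):
    print(f"R0 = {r0}: herd immunity threshold = {1 - 1/r0:.0%}")

# Adjusting the denominator (cases -> infections) without adjusting
# the numerator (undercounted deaths) deflates the rate. (Hypothetical.)
deaths, cases, infection_multiplier = 100, 1000, 50
cfr = deaths / cases                                  # case fatality rate
ifr_naive = deaths / (cases * infection_multiplier)   # deaths not adjusted
print(f"CFR = {cfr:.1%}, naive IFR = {ifr_naive:.2%}")
```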
Furthermore, comparing the infection fatality rate of influenza to that of SARS-CoV-2 is apples to oranges. The spread of the flu is dampened by the widespread use of the flu vaccine in the U.S., while we have no vaccine for the coronavirus yet. This means that the base population to which the fatality rate applies is much larger for the coronavirus, so the same rate results in many more deaths.
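A back-of-the-envelope illustration, with every number a hypothetical placeholder: the same infection fatality rate applied to a much larger susceptible base produces far more deaths.

```python
# Same infection fatality rate, very different susceptible bases.
# All numbers are hypothetical placeholders.
ifr = 0.001                    # 0.1% fatality rate among the infected
population = 300_000_000
flu_attack_rate = 0.10         # dampened by widespread vaccination
covid_attack_rate = 0.60       # no vaccine, far more people susceptible

print(f"Flu deaths:         {ifr * flu_attack_rate * population:,.0f}")
print(f"Coronavirus deaths: {ifr * covid_attack_rate * population:,.0f}")
```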
***
The Buzzfeed investigation provides a lot of color to this antibody-testing movement. Here are the most important moments in the long report.
"I declined authorship on any manuscript... based on our testing" and "I ... end my involvement... without any ties, acknowledged or unacknowledged, to current, revised, or future contemplated preprints, publications or other public presentations of results" How two Stanford researchers disassociated themselves from the Stanford antibody study after performing validation studies of the test's accuracy.
Test kits come with claims of accuracy by their manufacturers. Accuracy is typically provided as sensitivity and specificity, measuring the test's ability to identify true positives and true negatives, respectively. For an antibody test, given the low level of prevalence, the specificity has to be close to 100 percent - meaning nearly all negatives must be correctly identified - for the test to have any utility. Otherwise, the high proportion of antibody-negative people in the population can generate more (false) positive results than there are antibody-positive people.
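A minimal sketch of this arithmetic, assuming a prevalence of about 1.5 percent and a hypothetical 80 percent sensitivity: at 98.5 percent specificity, false positives outnumber true positives; at 99.5 percent, they no longer do.

```python
# Why specificity dominates at low prevalence. Prevalence, sensitivity,
# and sample size are assumptions for illustration.
prevalence, sensitivity, n = 0.015, 0.80, 3300

for specificity in (0.985, 0.995):
    true_pos = prevalence * sensitivity * n
    false_pos = (1 - prevalence) * (1 - specificity) * n
    print(f"specificity = {specificity:.1%}: "
          f"{true_pos:.0f} true positives vs {false_pos:.0f} false positives")
```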
Vendors advertise test accuracy based on self-reported validation samples. Validation is hard - and for this reason, vendors can easily cheat to sell lemons. You can't validate against the people who took your test, because the point of taking your test is to determine who has antibodies; if you knew the answer, you wouldn't need to test them. So to validate an antibody test, you need blood samples that are unequivocally positive or negative. You have to purchase blood specimens from people who have them, or pay labs to conduct tests on your behalf on their own specimens.
Or you could take your blood samples, apply a different test, and see whether the results match those from your own test.
The test kit used in the Stanford study had not yet been approved by the FDA, so the need to validate the accuracy claims was even greater. The research team made at least two attempts to validate the test's accuracy at Stanford-affiliated labs. Both groups later declined co-authorship of the study because they had reservations about the test results.
When the first preprint appeared, the specificity of the test was supported by a 30-specimen "independent" validation sample, plus the manufacturer's data. The independent validation turned out to be the results from one of the labs that severed ties with the research project.
The murky disclosure surrounding the validation data
Many statisticians, including myself, raised concerns that the research outcome is highly sensitive to the assumed specificity of the test. For validation, a sample size of 30 is way too small. I computed that a minimum of 300 - with all 300 testing negative - would be required to establish a specificity of 99 percent. In the revised preprint, the researchers expediently found 3,000 more negative samples to justify an assumption of 99.5% specificity.
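Here is the arithmetic behind that 300, as a sketch: with zero false positives among n known-negative samples, the exact one-sided 95 percent lower confidence bound on specificity satisfies spec^n = 0.05.

```python
# With 0 false positives out of n known-negative samples, the exact
# one-sided 95% lower confidence bound on specificity solves
# spec**n = 0.05, i.e. spec = 0.05**(1/n).
for n in (30, 300, 3000):
    lower_bound = 0.05 ** (1 / n)
    print(f"n = {n:4d}: 95% lower bound on specificity = {lower_bound:.2%}")
```

With n = 30, the bound is only about 90.5 percent; n = 300 gets you to roughly 99 percent, and n = 3,300 total negatives is what's needed to support a bound near 99.5 percent.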
The authors said they "conducted additional testing to assess the kit performance, and continued collecting information from assessments of test performance to incorporate into the analysis." They "gathered all available information on test performance characteristics, with a focus on test specificity". In an addendum, after mentioning the sample of size 30 referenced in the first report, they disclosed receiving additional information from "multiple sources, including the manufacturer's original data, test performance assessments for regulatory documents, and independent evaluations."
While reading all this text, I was scratching my head trying to figure out whether they conducted additional validation testing at Stanford labs, contracted other labs to perform validation testing, or simply requested data from third parties who happened to have it.
While they promised information on the "provenance" of the data samples, one-third of the 3,000 negative samples were simply described as "adults admitted to hospitals pre-COVID19." Which lab did the testing? When and where were these samples collected, and for what purpose?
The ambient threat of confirmation bias
This search for samples creates a goal-seeking situation for the researchers. They know what specificity is required to reach a scientifically acceptable conclusion. They have targets for how many samples are needed and what the pooled accuracy rate has to be. We don't know how they conducted this search, or what measures were taken to prevent confirmation bias. If they had received data that took the accuracy rate below the required 99.5%, they could have continued searching for the next sample (or convinced each other that that sample was an outlier). I'm not saying they did this; these dangers affect anyone placed in this situation.
Well, well, well. There is a hint of impaired judgment. The researchers chose to publish their first preprint the day before receiving validation results from the second independent lab. By that time, the lab had already expressed concern about false positives. The additional results came from re-testing the positives using ELISA, a "gold standard" test for antibodies. They showed that only "a little over half" of those positives were confirmed. The remaining, just under half, are false positives.
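If only about half of the kit's positives survive ELISA confirmation, one can roughly back out what specificity that implies; the prevalence and sensitivity below are my assumptions for illustration.

```python
# If PPV ~ 0.5, true and false positives occur in roughly equal numbers.
# From PPV = sens*p / (sens*p + fpr*(1-p)), solve for the false positive
# rate fpr and hence the implied specificity. Prevalence p and
# sensitivity sens are assumptions for illustration.
p, sens, ppv = 0.015, 0.80, 0.5
fpr = sens * p * (1 - ppv) / (ppv * (1 - p))
print(f"Implied specificity: {1 - fpr:.2%}")   # ~98.8%, below the assumed 99.5%
```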
This validation sample, which pointed to a false-positive problem, was not included in the revised preprint, while the other validation sample, showing 100% specificity, continued to feature. Both samples came from Stanford-affiliated labs that declined authorship based on concerns about test accuracy. It's possible that the one lab did not authorize the use of its data. But this situation exactly illustrates the risk of "publication bias": whether the decision to exclude was made by the independent lab or by the research team, the act of exclusion biased the statistical analysis.
Ironically, the Stanford team includes John Ioannidis, who has led an outcry against biased scientific findings.
"If you are willing to do a 5,000 test in New York, just tell me the cost and I will raise the money immediately." Jetblue CEO to Stanford researcher who was recruited to validate the test kit.
Jetblue's founder and CEO, David Neeleman, holds strong views on re-opening the economy based on the theory that a large number of people have been infected and are thus immune. It turned out he was a funder of the Stanford study.
The study's co-authors told Buzzfeed variously that they had no knowledge of where the funding came from, that there could have been multiple funders with different points of view (while maintaining they didn't know who the funders were), or that their research simply could not be influenced by funding.
Neeleman now says that he was not shown results prior to publication, but in an op-ed 10 days before publication, he wrote that the Stanford team "believe that the actual number of cases is very likely off by an order of magnitude of 10, or maybe even many times more". A few days before publication, he appeared on a Fox News show pitching a re-opening strategy, alongside two of the researchers, who presumably still didn't know who funded the study.
One of the researchers suggested to Neeleman that he should "write a note [to the independent researcher] telling her you'll support her lab if she validates this kit." That's how the proposal for a large-scale New York test came into being. This researcher presumably didn't know Neeleman funded their study.
I noticed another discrepancy when it comes to funding sources: in the preprints, the researchers disclosed that they "purchased" test kits from Premier Biotech, but in one of the emails cited by Buzzfeed, the second researcher tasked with validating the test kit wrote, "we are concerned about the specificity of the Premier Biotech devices that were donated for your study." The researcher could have been confused, but if he was right, the "conflict of interest" section of the preprint should be corrected.
"If they had just done the New York study first, there wouldn’t be so much scrutiny." Jetblue CEO
Neeleman elaborated: "Unfortunately PR [public relations] impact and the ability to raise large amounts of money quickly will not be the same if you announce 1% of Santa Clara County tested positive for the antibodies versus 30% of New Yorkers which would be huge news." This email was dated before the publication of the first preprint, and well before New York State started antibody testing. Somehow, Neeleman knew (a) that the Santa Clara study would show about 1 percent positive, even though the research team claimed that no one, including him, was shown results beforehand; and (b) that the New York proportion would be 15 times higher than reported (in the right ballpark).
Neeleman has been proven correct: the New York result made quite a splash. As readers know, I have been complaining for weeks that New York Governor Cuomo - an avowed believer in science - has been promoting, day after day after day, results from an antibody testing program in his press briefings, even though, as one quickly discovers, his PowerPoint slides are the sole source of data about this study.
Given what we now know about the Stanford study, I call once more for transparency. The State Department of Health, which is said to have conducted the antibody testing program, needs to come forward with experts, preprints and data. The citizens are entitled to know: who funded the study? What are the sensitivity and specificity claimed by the test designer? How did they validate their test kit? What analysis was done? What adjustments were made? They need to answer the same questions the Stanford team did.
Reporters at the Albany Times Union dared to ask questions, and everywhere they went, Department of Health officials claimed no knowledge of anything about this state-run program. I hope the New York Times is on this case.
What NY State did is much worse than what the Stanford team did. At least the Stanford researchers put information out there and responded to comments. New York has been feeding countless headlines, proclaiming 20 percent prevalence in NYC, while offering zero support for this number.