From Andrew's blog, I learned about Tom Lumley's post that provides insight into the accuracy of the two types of Covid-19 tests. Tom is a biostatistician.
As you probably know by now, there is a diagnostic test, the most prominent of which is the (RT-)PCR test, which involves using a swab to collect sputum from the nose or throat. This material is taken to a lab, which looks for the presence of SARS-CoV-2, the name of the virus that causes the disease Covid-19. This test identifies who is currently infected.
Then, there is an antibody test, which looks for antibodies found in the blood of infected people, produced as the body fights the virus. This test does not detect the presence of the virus; it looks for a sign that the virus has passed through.
These two tests are not interchangeable. Some politicians on both sides of the Atlantic try to cover up the slow rollout of diagnostic testing by talking up antibody testing. In fact, early reports from antibody testing programs, such as the Stanford study, found that less than 5 percent of people have antibodies, which means the virus is not as prevalent as claimed. (Here's why many statisticians are dismissing the Stanford study.)
It might seem like a simple issue: is the virus/antibody there or not? In both cases, if someone were not infected, their sputum or blood shouldn't have that specific virus or antibody in it. Why can't tests be perfect?
***
Let's first consider the antibody test. Antibodies are generated naturally by our bodies to fight an antigen. According to Tom, the antibody to SARS-CoV-2 is not unique; it varies slightly from person to person. This means that any test needs to look for a "range" of antibodies. To complicate matters, antibodies to similar viruses are also similar. Thus, the test will not find the exact target antibody but something close to it. The test then has to decide whether the difference is due to a different individual or a different (but similar) antigen.
One should hope that the person-to-person variability is smaller than the antigen-to-antigen variability. A test would then set up a "range" of outcomes that qualify as a positive finding. If a measurement falls near the edge of this range, there is potential for error.
If the acceptance range is too "tight", the test might mistakenly decide it found antibodies for a different antigen when the difference is really just a different individual. This produces false negative results. If the acceptance range is too "lax", the test might mistakenly conclude that an antibody to a similar infection, coming from a different individual, is an antibody to SARS-CoV-2. This produces false positive results.
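To make these two errors concrete, here is a minimal simulation sketch. Everything in it is invented for illustration: the single "match score" summary, the score distributions, and the cutoffs do not come from any real assay.

```python
import random

# Illustrative sketch only: the score distributions and cutoffs below are made up,
# not taken from any real antibody assay.
random.seed(1)

# Pretend each blood sample yields a single "match score" against the target antibody.
# People who truly have SARS-CoV-2 antibodies (scores vary from person to person):
scores_with_antibodies = [random.gauss(70, 10) for _ in range(10_000)]
# People who only have antibodies to similar viruses:
scores_without = [random.gauss(50, 10) for _ in range(10_000)]

# A "lax" cutoff lets in more similar-but-wrong antibodies (false positives);
# a "tight" cutoff rejects more genuine ones (false negatives).
for cutoff in (55, 60, 65):
    false_neg = sum(s < cutoff for s in scores_with_antibodies) / len(scores_with_antibodies)
    false_pos = sum(s >= cutoff for s in scores_without) / len(scores_without)
    print(f"cutoff {cutoff}: false-negative rate {false_neg:.1%}, false-positive rate {false_pos:.1%}")
```

Moving the cutoff up pushes the false-positive rate down and the false-negative rate up, and vice versa; there is no setting that eliminates both.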
There is a trade-off between these two errors, controlled by how the test developer sets the acceptance "range". For the antibody test, doctors don't like false positives. These are people who don't have antibodies but are told they do. As I discussed the other day, because the vast majority of people don't have antibodies, even a tiny false-positive rate will result in a lot of people being told they have antibodies when they don't.
The ability to correctly identify people without antibodies is called specificity. Some researchers are assuming antibody tests could be 100% specific. I think that is overly optimistic.
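To put a rough number on that concern, here is a back-of-the-envelope sketch. The prevalence, sensitivity, and specificity figures are all assumptions for illustration, not estimates from any study.

```python
# Back-of-the-envelope sketch: every number below is an assumption for illustration.
population  = 100_000
prevalence  = 0.03   # assume 3% of the population actually has antibodies
sensitivity = 0.90   # assume the test catches 90% of true antibody carriers
specificity = 0.98   # assume 2% of people without antibodies still test positive

with_antibodies    = population * prevalence
without_antibodies = population - with_antibodies

true_positives  = with_antibodies * sensitivity
false_positives = without_antibodies * (1 - specificity)

share_false = false_positives / (true_positives + false_positives)
print(f"{false_positives:.0f} of {true_positives + false_positives:.0f} positive results "
      f"({share_false:.0%}) belong to people without antibodies")
```

Under these assumed numbers, even 98% specificity means roughly four in ten positive results go to people without antibodies, which is why the specificity assumption matters so much.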
***
The PCR diagnostic test illustrates another aspect of testing accuracy. Sometimes the error arises externally. In Chapter 4 of Numbers Rule Your World (link), I demonstrate this using steroid tests. A false-negative result arises when the lab mistakenly concludes that the amount of chemicals in the doper's urine is not high enough to warrant a red flag. That's a "chemical" false negative.
But a false-negative result is really just any doper passing an anti-doping test. Someone might submit a fake urine sample, or hide from inspectors. In the first case, the fake sample will not test positive; in the second, there will be no test at all. Both should be considered false negatives even though the testing lab did not make a mistake.
Tom says "it is almost impossible to get a positive [PCR test] result without any virus". But low false-positive rates usually come with higher false-negative rates. One reason for false negatives is testing at the wrong time, or failing to swab properly. Bad timing is particularly important because "if you get tested too early, there might not be enough virus, and if you get tested too late the infection might have relocated to your chest."
Timing is also important for antibody testing because "a positive reaction [from one's body to the virus] takes time — at least a few days, maybe a week — and it stays around at least a short time after you recover."
Timing is an external factor. The consequent false negative error has nothing to do with the testing lab. Even if the test itself is perfect, there will still be people walking around with the false security of a negative test result.
So testing errors occur because the test itself cannot be perfect (e.g. antibodies against similar infections, coming from different individuals, may look like antibodies against SARS-CoV-2), and because, even when the test is close to perfect, factors outside the lab, such as bad timing, cause errors.
***
One final comment on the politics. At various times, some officials argue that the rollout of diagnostic testing has been slow because tests are not accurate enough. Don't believe a word of that.
Let's say 5 percent of the population are infected. Borrowing some numbers from Tom, I assume that the diagnostic test has a 35% false-negative rate and a 2% false-positive rate. For every 1,000 (random) people given the test, 50 are infected and 950 are not. Of the 50 infected, 65% or about 33 will get a positive finding, while 17 will falsely believe they don't have the virus. Of the 950 not infected, 931 are correctly told they don't have the virus, while 19 are told they have it when they don't. In total, 931 + 33 = 964 out of 1,000 will get the right answer.
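Here is the same arithmetic laid out as a short script; the rates are the illustrative assumptions from the example above, not the measured accuracy of any particular PCR test.

```python
# Illustrative assumptions from the example above, not the measured accuracy
# of any particular PCR test.
tested         = 1000
prevalence     = 0.05
false_neg_rate = 0.35   # share of infected people the test misses
false_pos_rate = 0.02   # share of uninfected people flagged anyway

infected     = tested * prevalence            # 50 infected
not_infected = tested - infected              # 950 not infected

true_pos  = infected * (1 - false_neg_rate)   # 32.5, call it 33 correctly flagged
false_neg = infected * false_neg_rate         # 17.5, call it 17 falsely reassured
false_pos = not_infected * false_pos_rate     # 19 falsely flagged
true_neg  = not_infected - false_pos          # 931 correctly cleared

print(f"correct results: {true_pos + true_neg:.0f} out of {tested}")     # ~964
print(f"true positives per false positive: {true_pos / false_pos:.1f}")  # ~1.7
```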
So what the politicians are really saying when they reject this test is that having 17 out of 1,000 people falsely believing they don't have the virus, and thus potentially spreading it, is reason enough to deny 964 out of 1,000 the correct answer. For every false positive, the test identifies almost two true positives, who can then be isolated. It doesn't stem the spread completely but it does slow it down. Given that these politicians are all in for "flattening the curve", they should be clamoring for more testing.
Would the time factor make a difference?
On any given day, x% could test positive and y% negative.
Which people, and what percentage, should you test every day?
If you repeat every day, what happens re. errors?
Posted by: Time | 04/24/2020 at 06:12 PM
Time: If there is a reason why the timing of someone's test affects the test result, then that factor has to be accounted for. But that doesn't seem to apply here. Why are you thinking in that direction? (If you're talking about reporting time versus testing time, that's a different issue.)
Posted by: Kaiser | 04/25/2020 at 02:40 AM
Let's leave the time aspect for now. Let's take your example at the end and think of it applied 5 times to a manufacturing operation: 5 QC tests to eliminate errors before final output. If we start with 1,000 units, what would be the number of units at the end falling into each of your categories?
Posted by: Time | 04/25/2020 at 08:32 AM
Time: Are you suggesting that another objection to the "test is inaccurate" excuse is that you can do repeated testing?
Posted by: Kaiser | 04/26/2020 at 01:17 AM
Not really. But the likely required administration of testing, and/or the possible pattern of testing in certain environments, will likely lead to repetition. If that is true (and it opens the debate), it would mean that accuracy would also need to be measured as part of a process.
(Note: some manufacturers state their tests are to be used in this manner.)
Not challenging your statement re politicians and standalone accuracy.
The example given might apply more to someone testing a set group over a period of time.
Posted by: Time | 04/26/2020 at 04:39 PM
This will fill some testing gaps: https://doi.org/10.1101/2020.04.21.20068858
Posted by: Time | 05/01/2020 at 03:42 PM
Time: thanks for that paper. I only read the abstract and will read the whole paper once I find some time. This study is very valuable, and its value lies in generalizing to the population. It seems like they should bring on a statistician to help generalize the data. From the abstract, it does not appear that they attempted to measure selection bias.
Posted by: Kaiser | 05/03/2020 at 11:45 AM