In my early March preview of statistical issues relating to coronavirus, I mentioned "people are counted as infected only if they are tested", and I also asked "how are deaths confirmed?"
For a while, the media ignored how meaningless the case counts were. Eventually, they got it, and that was great. But some experts convinced them that the savior is the death count. And now, in the last week, the reporters finally learned that death counts are subject to similar problems of definition.
The Economist just launched a tracker to look at excess all-cause mortality, using data from EuroMOMO. If during this pandemic, the total number of deaths from any cause is higher than normal, then it provides evidence (though not irrefutable) that Covid-19 is not a new name for influenza. Of course, we are finding excess deaths everywhere.
The next puzzle is that the number of deaths attributed to Covid-19 on death certificates is frequently lower than the excess deaths from all causes. This raises the possibility that Covid-19 deaths are under-reported. By itself, this is not necessarily true, because it is also possible that deaths from other causes are being under-counted.
However, it has also emerged that in many countries and regions, the published counts of fatalities so far have only included deaths that occurred at hospitals and for which the deceased had returned a positive test for the virus. A good number of deaths occurred in care homes, hospices, and even private residences, and these have been ignored. In fact, many hospitals refused to admit patients who did not report severe symptoms, and told them to isolate at home.
The excess all-cause mortality data take a bit more time to collect, and perhaps not by accident, as these statistics came to light, many countries announced that they were revising how they count Covid-19 deaths, essentially admitting they had been under-reporting.
***
The lesson related to data integrity is being taken up very slowly. Even as the Economist published the valuable tracker of all-cause mortality, they wrote:
... people have become grimly familiar with the death tolls... these numbers give a better indicator of a country's trajectory than do counts of confirmed cases, which largely measure how many people have been tested. Nonetheless, official covid-19 death tolls still under-count...
This is better than most but still repeats several fallacies:
a) Given a fatality rate for Covid-19 (even an unknown one), the death count is a proportion of the case count. If data integrity is not at issue, the death count is a worse indicator of a trajectory than the case count because it just gives the same information after a time delay. If death tolls are better than confirmed cases, it's because we believe the data on deaths are less bad than the data on cases.
b) The emerging evidence shows that the unadjusted death counts are wildly inaccurate, so I'm not sure you can say the data on deaths are "the lesser evil". In Lombardy, the Economist estimated the "true toll was about 120% higher" than the confirmed deaths. In Spain, the true number might be 60% higher (off a much larger base).
c) The count of confirmed cases is not a simple function of how many people got tested. More importantly, it depends on who got tested. I know I have been beating this point to death. In statistics, we care not only about the size of the sample but also the composition of the sample! We could test a lot of people with no symptoms and still have few confirmed cases.
d) The under-counting of deaths is still partially due to lack of testing. It is true that Covid-19 testing is not required in those all-cause mortality statistics but without testing, the count of deaths is clearly less reliable, just like the count of cases.
e) Deaths can also be over-counted, especially without testing to confirm coronavirus. How well can a doctor tell if the patient who died of pneumonia had Covid-19 or some other illness?
f) If it's broke, you can't fix it. Countries change the definition of deaths mid-way but won't (and can't) change the past. This renders the death count series useless, unless you're willing to adjust the data. This was a key point I made about new metrics of obesity in Chapter 2 of Numbersense (link).
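To make point (c) concrete, here is a minimal sketch of how the same number of tests can produce very different counts of confirmed cases depending on who is tested. The positivity rates (30% among people with symptoms, 2% among people without) and the test volumes are made-up numbers for illustration only, not estimates from any real data.

```python
def expected_confirmed(tests_symptomatic, tests_asymptomatic,
                       rate_symptomatic=0.30, rate_asymptomatic=0.02):
    """Expected confirmed cases for a given mix of people tested.

    Both positivity rates are hypothetical values chosen for illustration.
    """
    return (tests_symptomatic * rate_symptomatic
            + tests_asymptomatic * rate_asymptomatic)


# Two testing programs with the same sample size (1,000 tests each)
# but a different composition of who gets tested.
targeted = expected_confirmed(tests_symptomatic=900, tests_asymptomatic=100)
broad = expected_confirmed(tests_symptomatic=100, tests_asymptomatic=900)

print(f"Mostly symptomatic sample: about {round(targeted)} confirmed cases")
print(f"Mostly asymptomatic sample: about {round(broad)} confirmed cases")
```

Under these assumptions, the count of confirmed cases differs several-fold even though the number of tests is identical, which is why the count alone says little without knowing the composition of the sample.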
***
So, if you've been reading this blog, you know interpretation of statistics is no simple business.
While compiling these excess mortality numbers is necessary and important, here is something you should know, and I'm not going to sugarcoat it.
Imagine wearing glasses or sunglasses in the summer. You walk into an air-conditioned shop. You must wait for the fog to clear.
These excess mortality numbers are shrouded in a fog right now. It is actually somewhat meaningless to discuss them. This is because of a substitution effect. Some of the excess deaths we're seeing are accelerated deaths. (Of course, all excess deaths are accelerated deaths.) What proportion of the excess deaths are relevant will be up for debate once we wait long enough to see clearly.
A store usually runs an annual sale on New Year's Day only. This year, the store starts the discounts on the day before. After New Year's Eve, the store reports a 50% jump in sales versus last year ("excess"). Was the promotion particularly effective this year? You don't know until after New Year's Day. When you compare the total sales of both days against the two days last year, you'll find that the jump is less than 50%. Some customers just showed up one day earlier.
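The arithmetic of the store example can be sketched with a few lines of code. All sales figures below are hypothetical, picked only so that the New Year's Eve comparison shows the 50% jump from the story; the point is how the jump shrinks once the second day is included.

```python
def pct_change(this_year, last_year):
    """Relative change versus last year, as a fraction (0.5 means +50%)."""
    return (this_year - last_year) / last_year


# Hypothetical daily sales; not real data.
last_year = {"eve": 100, "new_years_day": 400}
this_year = {"eve": 150, "new_years_day": 380}  # some buyers came a day early

# Comparing New Year's Eve alone, before the second day is observed.
eve_jump = pct_change(this_year["eve"], last_year["eve"])

# Comparing the two-day totals, once both days are in.
two_day_jump = pct_change(sum(this_year.values()), sum(last_year.values()))

print(f"New Year's Eve only: {eve_jump:.0%} jump")
print(f"Both days combined:  {two_day_jump:.0%} jump")
```

With these made-up numbers, part of the early jump is just sales pulled forward by one day. The same logic applies to excess deaths: how much of the excess is deaths brought forward can only be judged after enough time has passed to see the later periods.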