The New York Times published a very nice presentation on "excess deaths" in the United States from March through July. (link)
The following chart shows how public health policies utterly failed in many Northeastern states in April and May.
In Connecticut, for example, in the middle of April, the number of deaths exceeded expected deaths by 2.6 times. In Massachusetts, around the same time, actual deaths exceeded expectation by 2.3 times.
A lot of work went into generating these charts - starting with meticulous data collection from each state, and as much as possible, reconciling how metrics are defined differently across states. Then, there is a bit of statistical modeling.
Modeling is required because "excess deaths" is - like most statistical quantities - not directly observable. Excess deaths is the difference between actual deaths (from all causes) in the presence of the Covid-19 pandemic and expected deaths (also from all causes) in its absence. The former is directly measurable but the latter is not.
To establish how many people could have died during a specific week in Connecticut if the novel coronavirus did not exist, statisticians look at deaths during the same week in that state in past years. Over a number of years, one should find some statistical consistency in the death rate. This illustrates the power of averaging.
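To make the averaging concrete, here is a minimal sketch of the baseline calculation. It assumes the simplest possible model, a plain average of the same week across past years. The NYT's actual model is not described in detail here, and all the numbers and function names below are made up for illustration.

```python
from statistics import mean

def expected_deaths(same_week_history):
    """Baseline: average deaths for the same week in past years (assumed model)."""
    return mean(same_week_history)

def excess_deaths(actual, same_week_history):
    """Excess deaths = actual deaths (all causes) minus the expected baseline."""
    return actual - expected_deaths(same_week_history)

# Hypothetical all-cause deaths for one state in the same calendar week
# of five past years, and in 2020. These are not real data.
past_same_week = [600, 620, 590, 610, 580]
actual_2020 = 900

baseline = expected_deaths(past_same_week)
excess = excess_deaths(actual_2020, past_same_week)
ratio = actual_2020 / baseline

print(f"baseline={baseline}, excess={excess}, actual/baseline={ratio:.2f}")
```

A positive excess puts the line above the baseline, and a negative one puts it below.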
If 2020 were a normal year, the chart should look something like this:
The above line is for Hawaii, which, because of its remote location, has not been affected much thus far by Covid-19. What you see is the actual weekly deaths hovering around the baseline (corresponding to the historical average), sometimes rising above, and sometimes dipping below.
The line for Connecticut, though, looks nothing like that for Hawaii. The most important feature is that the entire line sits above the baseline, meaning that there have been excess deaths for every week since March.
***
Ready for even more modeling?
For most states, the time line reaches the last week of July, which may surprise you. The first part of excess deaths - the observable part - counts all causes, obviating the need to verify coronavirus test results. The measure relies on death certificates. But it may take weeks to months to receive data on death certificates so how does NYT have estimates of excess deaths up to the week prior?
Does NYT have a crystal ball? In a sense, yes. It uses a statistical model to "top up" the incomplete deaths data in the most recent weeks. It turns out that actual deaths are not always completely observable. There is a reporting delay, which means that the data for more recent weeks are incomplete. The data age like fine wine - the older the data, the more complete.
The NYT used a strategy similar to the model for projecting expected deaths - assuming historical patterns hold. The CDC keeps track of the amount of reporting delay. It might know that typically, 20 percent of the data showed up with one week's delay, 50 percent with two weeks' delay, and so on.
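Here is a rough sketch of how such a top-up could work. It assumes the percentages are cumulative (the share of a week's deaths that has been reported after a given number of weeks) and that the adjustment is a simple division by that share. Both are my assumptions, not a description of the NYT's or the CDC's method, and the table and counts are hypothetical.

```python
def top_up(reported, weeks_elapsed, completeness):
    """Scale a partial count up to an estimate of the eventual total.

    completeness maps weeks elapsed to the historical share of deaths
    reported by then. Weeks older than the table are treated as complete.
    """
    share = completeness.get(weeks_elapsed, 1.0)
    if share <= 0:
        raise ValueError("completeness share must be positive")
    return reported / share

# Hypothetical cumulative completeness. The first two values echo the
# example in the text; the rest are invented for illustration.
completeness = {1: 0.20, 2: 0.50, 3: 0.80, 4: 0.95}

# Hypothetical counts reported so far for two recent weeks.
estimate_1wk = top_up(120, 1, completeness)  # only 20 percent assumed in
estimate_2wk = top_up(300, 2, completeness)  # 50 percent assumed in

print(round(estimate_1wk), round(estimate_2wk))
```

Notice how much the most recent week leans on the model: with only a fifth of the data in hand, any error in the assumed share gets multiplied by five.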
There is a crucial difference between these two statistical models, though. The first model projects expected deaths in the absence of the novel coronavirus. The second model tops up the actual deaths reported at the time of analysis, essentially projecting the number of deaths that have already occurred but have not yet been reported. These actual deaths are in the presence of the coronavirus. This is the crucial difference. This second model has a further assumption - that reporting delays have not been affected by the pandemic.
The NYT team disclosed this assumption clearly in the text. This is what they mean by the following sentences:
Even with this adjustment, it's possible there could be an underestimate of the complete death toll if increased mortality is causing states to lag more than they have in the past or if states have changed their reporting systems.
It's quite likely that the volume of deaths, the budget pressure, and political interference have changed the pattern of the reporting delay.
The reporter then argued that the adjusted estimates are still better than the unadjusted ones. One can't refute that argument. Of course, this discussion assumes that the chart must show a time line leading to the present. The designer can choose to show only those weeks with virtually complete data.
Or, we can bring in a third model. This one looks at the early weeks of the pandemic, for which we believe almost all of the death-certificate data have been recorded. We can then estimate the error of the second model that serves to top up the actual deaths.
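One simple version of that check, sketched under my own assumptions: for the early pandemic weeks, compare what the top-up model estimated at the time with the now nearly complete counts, and look at the relative error. The function name and all numbers are hypothetical.

```python
def relative_errors(estimated, final):
    """Relative error of each earlier top-up estimate against the final count."""
    return [(e - f) / f for e, f in zip(estimated, final)]

# Hypothetical early-pandemic weeks: the topped-up estimates made at the time,
# and the counts after the death-certificate data had (almost) fully arrived.
estimated_then = [950, 1100, 1300]
final_now = [1000, 1200, 1250]

errors = relative_errors(estimated_then, final_now)
mean_error = sum(errors) / len(errors)

print([f"{e:+.1%}" for e in errors], f"mean {mean_error:+.1%}")
```

A mean error that is consistently negative would suggest that the top-up model underestimates, which is what we would expect if reporting delays have lengthened during the pandemic.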
***
NYT's excess-death analysis has found over 200,000 more deaths (from all causes) in the U.S. between March and July than expected based on historical patterns. Given the U.S. counted about 140,000 deaths due to Covid-19 during this period, there have been about 60,000 unexplained deaths so far.
I find the NYT headline, "The true coronavirus toll in the US has already surpassed 200,000," alarmist. The analysis did not support this conclusion - unless they make the further assumption that the only factor causing excess deaths during those months is Covid-19. That is plausible but not fact-based. A more correct headline should be "Deaths due to Covid-19 may have been undercounted by up to 40 percent." That's also shocking and sobering.
Or, perhaps, something else caused those 60,000 deaths.
Locking down so people wouldn't go to the ER for chest pain, for instance.
Posted by: Nunya | 08/23/2020 at 01:40 PM
N: The excess deaths analysis is "agnostic" in the sense that the gap can be explained by any cause(s). The analysis shows the scale of this gap - from at least 50 percent above the normal level to over 700 percent higher (NYC). For this large a gap, we need new causes. If not the coronavirus, maybe there is a still unknown virus out there.
Here are the challenges one faces to try to use known causes to explain the excess deaths:
1) Any one pre-existing cause, like cancer, will account for say 10 percent of deaths at any time. It's hard to explain how multiple known causes of death suddenly spike sharply.
2) I've yet to see any confirmed death due to inability to go to the ER - presumably they can start by investigating deaths at home.
3) Imagine one is able to confirm that lots of deaths are caused by the lockdown. Then, one should expect to see a period of substantial negative excess deaths. If we use existing causes to explain these excess deaths, then every death is by definition an accelerated death so it must lead to a negative excess death at some later time. Do we really think all those excess deaths will be reversed in the statistics later in the year or next year?
Posted by: Kaiser | 08/23/2020 at 11:00 PM