(Source: xkcd)
Chapter 2 of Numbersense (link) is titled "Can a new statistic make us less fat?" In the context of recent remarks by the acting New York State Governor, perhaps I should have called it "Can a new statistic save some people from dying?"
Trick question:
An executive tracks performance using a key metric displayed on a dashboard. The executive requests a review of the definition of the metric, then decides the definition must be revised, and after the revision, the new metric looks better than before. Which of the following is true?
a) Using the prior definition, the key metric was outperforming expectation
b) Using the prior definition, the key metric was underperforming expectation
***
The New York Times reported that "Ms. Hochul said that some hospital executives have told her between 20 and 50 percent of their Covid patients are not suffering from severe symptoms, but that they are testing positive in the hospital incidentally, after being admitted for other reasons such as car accidents...As a result, beginning Tuesday, the state will begin to ask hospitals to break down how many patients are being admitted for acute Covid-19 symptoms, in an effort to further decipher this wave’s severity."
This is statistical alchemy, frequently requested by people who don't like what the numbers are showing. It is a trick because they have not committed to restating the entire data series back to the start of data collection. Incidental admissions are a "problem" only if the counting rules had recently changed to include admissions for non-Covid-19 reasons, thereby inflating recent data relative to older data, but that's not what they're saying.
Further, notice that they are commingling two separate re-definitions: one is Covid-19 vs not Covid-19; the other is mild vs "acute" Covid-19. I have never heard that "cases" only refer to "acute" Covid-19 cases, until now. The new definition relies on a subjective definition of "acute" and so they can report whatever number they want going forward.
To have any credibility, they need to (a) publish all the definitions that have been deployed in the last 2+ years, including the latest one, and (b) apply the latest definition to restate all historical data. Otherwise, no one can tell whether the trend is up or down since we are not comparing apples to apples!
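Here is a minimal sketch of why restating history matters. Every count below is made up for illustration, and the names (admitted_for_covid, incidental_positive, switch_week) are my own assumptions, not anything the state publishes. It compares the week-over-week growth of a series reported under one definition throughout with a series that switches definitions partway.

```python
# Hypothetical weekly hospital counts, invented purely for illustration.
admitted_for_covid = [100, 140, 200, 280]   # admitted because of Covid-19
incidental_positive = [30, 45, 70, 100]     # admitted for other reasons, tested positive

# Old definition counts everyone who tests positive; new one drops incidentals.
old_definition = [a + b for a, b in zip(admitted_for_covid, incidental_positive)]
new_definition = list(admitted_for_covid)

# Assumption: the definition changes at this index and history is NOT restated.
switch_week = 3
spliced = old_definition[:switch_week] + new_definition[switch_week:]

def pct_change(series):
    """Week-over-week percent change, rounded to one decimal place."""
    return [round(100 * (b - a) / a, 1) for a, b in zip(series, series[1:])]

print("old definition throughout:", pct_change(old_definition))
print("new definition throughout:", pct_change(new_definition))
print("definition switched midway:", pct_change(spliced))
```

With these made-up numbers, either definition applied consistently shows growth of roughly 40 percent in the last week, while the spliced series shows growth of under 4 percent. The apparent slowdown comes entirely from the break in the definition.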
To have maximum credibility, they should do (c) a comprehensive review of the definition, and correct for problems that undercount as well as overcount. For example, are they missing lots of cases found by people doing at-home testing? How are they dealing with people with multiple test results? (This is no small matter: if an infected person keeps taking tests until s/he tests negative, then each positive test is likely to be balanced by one or more negative tests in the near future. This dynamic affects how we can interpret the time series of positive tests divided by number of tests.)
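The retesting dynamic can be sketched with a toy calculation. The numbers are hypothetical, and the sketch assumes a deliberately simple rule that is mine, not something from the data: everyone who tests positive takes exactly one follow-up test the next week, and that test comes back negative.

```python
# Hypothetical numbers, invented for illustration only.
first_time_testers = 1000                    # people taking a first test each week
new_positives = [100, 200, 300, 200, 100]    # how many of them test positive

def weekly_positivity(first_time_testers, new_positives):
    """Return (person_level, test_level) positivity per week, in percent.

    Assumption: every person who tests positive takes exactly one
    follow-up test the following week, and that test is negative.
    """
    person_level, test_level = [], []
    previous_positives = 0
    for positives in new_positives:
        # Last week's positives add negative follow-up tests to this week's total.
        total_tests = first_time_testers + previous_positives
        person_level.append(round(100 * positives / first_time_testers, 1))
        test_level.append(round(100 * positives / total_tests, 1))
        previous_positives = positives
    return person_level, test_level

person_level, test_level = weekly_positivity(first_time_testers, new_positives)
print("share of first-time testers positive:", person_level)
print("positive tests / all tests:          ", test_level)
```

Under this rule, the test-level ratio is pulled down by an amount that depends on the previous week's positives, so its time series does not track the person-level rate one for one.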
***
In Numbersense (link), I trace how every few years, someone writes an article saying we need a new definition of obesity. The typical reason is that BMI classifies fit athletes as obese. I have never seen someone provide an analysis of how BMI misclassifies the average person. It's always the outliers. Notice in the above quote, the Governor did not give an aggregate proportion of what she considers misclassified. She said "some" hospitals gave her a concerning statistic.
As a data analyst, you have to think about the likelihood that the sample of hospitals she heard from is representative of all hospitals. Imagine you're working at a hospital for which most of the patients testing positive for Covid-19 are being treated for Covid-19. How likely are you to pick up the phone and call the Governor to tell her that you do not have a problem?
Thus, the sample she is working with is biased towards those hospitals experiencing this "problem". The real proportion of cases being "misclassified" is surely lower than what she announced.
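A small simulation illustrates the direction of this bias. Everything in it is an assumption made for the sketch: the number of hospitals, the Beta(2, 8) distribution of each hospital's share of incidental positives, and the rule that a hospital's chance of calling the Governor equals that share. None of it is real data.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumption: each hypothetical hospital's share of incidental positives
# is drawn from a Beta(2, 8) distribution (average of 20 percent).
incidental_share = [random.betavariate(2, 8) for _ in range(10_000)]

# Assumption: the higher the share, the more likely the hospital is to call.
# Here the probability of calling simply equals the share itself.
called_governor = [s for s in incidental_share if random.random() < s]

overall = mean(incidental_share)
heard_by_governor = mean(called_governor)

print(f"average share, all hospitals:       {overall:.1%}")
print(f"average share, hospitals that call: {heard_by_governor:.1%}")
```

Because hospitals with a larger share are more likely to show up in the sample, the average among those who call exceeds the average across all hospitals. How large the gap is depends entirely on the assumed calling rule.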
It's always good to follow your postings because they can seem to be linked together.
Look at the case of Djokovic, where every day there is a change in the metric that changes what we know, as you say in your two posts.
First he gets some exemption for a positive PCR.
Next he is stopped from entering the country.
Next some newspapers say his test was not positive.
So each time the conclusion changes.
The PCR test is the gold standard for detecting infection.
The PCR result cannot be trusted.
The PCR result can be manipulated...so it cannot be trusted...(and your data can easily be found), so it was possibly manipulated, so it cannot be trusted.
So does this story link to your last two posts?
The story is still continuing.....
Posted by: A Palaz | 01/13/2022 at 06:55 PM
AP: Yeah, we want to know how he got into Spain...
It is true that there is always imperfection in any metric. If someone is really interested in improving metrics, they should re-evaluate the metric holistically, and they must restate historical data. Unfortunately, the latter provides an opening for fraudsters. It may not be possible to restate history. If they don't, that creates a break in the data series so that the historical data must be discarded. That's like stopping the match, changing all the rules and then restarting from 0-0.
Posted by: Kaiser | 01/13/2022 at 11:38 PM