If you're not counted, you don't count.
The first law of counting is to count everything, and I do mean everything - not just the things you think you care about.
***
Imagine - as I did in Chapter 1 of Numbersense (link) - that a college advertises that 95% of its graduates found jobs within 6 months of graduation. It turns out the school collects data from a post-graduation survey - it's not disclosed anywhere in the marketing materials that it's 95% of survey respondents, not 95% of graduates. What's also not reported is the survey's response rate, which was 20%.
Nevertheless, the college administrators stand behind their statistics. They say the respondents and non-respondents did not show statistically significant differences in their gender distribution, average age, and type of degree; therefore, they assume that non-respondents also have a 95% employment rate within 6 months.
Do we think graduates who failed to land jobs have the same chance of filling out an employment survey as those who are gainfully employed after graduation? You can draw your own conclusion.
If you only have biased data on 20% of the graduates of this school, do you think you have a full understanding of the employment prospects of its graduates?
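To see how little the 95% figure pins down, here is a minimal sketch of the arithmetic. It uses only the two numbers from the example (a 20% response rate and 95% employment among respondents) and asks what overall employment rate is possible if we assume nothing about the non-respondents. The function name is made up for illustration.

```python
def employment_rate_bounds(response_rate, employed_among_respondents):
    """Return the (lowest, highest) overall employment rate consistent with the survey.

    Makes no assumption about non-respondents: at one extreme none of them
    has a job, at the other extreme all of them do.
    """
    observed = response_rate * employed_among_respondents  # employed graduates we actually know about
    missing = 1.0 - response_rate                          # share of graduates with no data at all
    lowest = observed             # every non-respondent is jobless
    highest = observed + missing  # every non-respondent is employed
    return lowest, highest


low, high = employment_rate_bounds(response_rate=0.20, employed_among_respondents=0.95)
print(f"Overall employment rate lies somewhere between {low:.0%} and {high:.0%}")
```

With these inputs the range runs from 19% to 99%. The advertised 95% is just one point in that range, and it is the right one only if non-respondents look exactly like respondents.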
***
At the start of the Covid-19 pandemic (think March or April 2020), the U.S. suffered from a shortage of diagnostic tests. The official policy at the time was to test only people with severe symptoms.
We knew next to nothing about the novel coronavirus at the time. The data on cases became the basis of our knowledge - not only of the severity of the pandemic but also its trends, its demographics, its symptoms and injuries, and so on.
But triage testing leads to biased data. When the pandemic also grows quietly through asymptomatic transmission, such data lead to a partially blind, misleading assessment of the situation. I warned about this in a Wired column. Since then, colleges have shown that comprehensive testing is crucial to controlling the spread of Covid-19. Random testing of small samples of the population is a viable alternative we can still implement today to monitor new variants.
***
A curious thing happened during the pandemic to our unemployment statistics. Eight million Americans not only lost their jobs overnight but suddenly lost all interest in working. That's according to official statistics, which showed that the number of people in the "civilian labor force" dropped by 8 million.
People "not in the labor force" have always been a strange species. The government claims that these people either do not want work at all, or have not been looking for a job for an extended period of time. Their unemployed status is ignored in the unemployment rate. Oddly, as more people drop out of the labor force, the unemployment rate improves, all else being equal. This group is simply not counted; they are neither employed nor unemployed.
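The odd behavior of the rate is easy to check with a toy calculation. The sketch below assumes the standard definition (unemployed divided by labor force, where the labor force is employed plus unemployed). The counts are invented round numbers, in millions, not actual BLS figures.

```python
def unemployment_rate(employed, unemployed):
    """Unemployed as a share of the labor force (employed + unemployed)."""
    labor_force = employed + unemployed
    return unemployed / labor_force


# Hypothetical counts in millions, chosen only to make the arithmetic easy.
employed = 90
jobless = 10

# Before: all 10 million jobless people are counted as unemployed.
before = unemployment_rate(employed, jobless)

# After: 4 million of them are reclassified as "not in the labor force".
# Nobody found a job; they simply stopped being counted.
reclassified = 4
after = unemployment_rate(employed, jobless - reclassified)

print(f"Before: {before:.2%}, after: {after:.2%}")
```

In this made-up scenario the rate falls from 10% to 6.25% even though the same 10 million people are still without work.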
If one has the responsibility of reducing the unemployment rate, one will focus on unemployed people who are counted in the data, not the ones who are not counted. When you are not counted, you don't count.
In Chapter 6 of Numbersense (link), I explain in detail how the government computes the unemployment rate. Not counting a group of unemployed people hides some of the bad news. As I pointed out in this previous blog post, the sharp exodus out of the labor force at the start of the pandemic exposes a serious shortcoming of the concept. None of the people who just lost their jobs due to Covid-19 are people who do not want work at all; their recently-ended employment proves otherwise. Nor had enough time passed to qualify them as having given up on a job search.
***
The Bureau of Labor Statistics (BLS) does count everyone, including the people who have been removed from the labor force tally. These data just show up in an appendix, where they receive much less attention and care.
We can't say as much about how the CDC counts Covid-19 cases in 2021. As this New York Times story disclosed, the CDC "stopped investigating breakthrough infections among fully vaccinated people unless they become so sick that they are hospitalized or die."
Just as the BLS defines unemployed people as not really unemployed if they "don't want to work", the CDC defines infected people as not really infected if they are fully vaccinated and are not hospitalized for severe Covid-19 symptoms. In this way, our data have come full circle, back to the same sad state they were in during March or April 2020. They tell us only about severe cases. To the extent that new variants spread through asymptomatic infections, we are as blind to that possibility as we were a year ago.
Furthermore, the change in counting rules upsets any trend analysis. Consider a real-world study of vaccine effectiveness. If the researchers report that the case rate of fully vaccinated people is much lower than the case rate of unvaccinated people, is it because of the vaccine or because we count mild or asymptomatic cases for unvaccinated people but not for vaccinated people?
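A deliberately extreme thought experiment shows how much the counting rule alone can move the comparison. The sketch below assumes two groups of equal size with the same number of infections, so that any gap in the case rates comes purely from which cases get counted. Every number is invented for illustration, and none of it is an estimate of real vaccine effectiveness. It also simplifies by assuming every infection among the unvaccinated is counted.

```python
def case_rate(counted_cases, population):
    """Counted cases per person."""
    return counted_cases / population


def apparent_effectiveness(rate_vaccinated, rate_unvaccinated):
    """One minus the ratio of case rates, the usual effectiveness formula."""
    return 1.0 - rate_vaccinated / rate_unvaccinated


# Hypothetical inputs: identical in both groups by construction.
population = 100_000  # people per group
infections = 2_000    # infections per group, mild and severe combined
severe = 200          # of those, infections ending in hospitalization or death

# Unvaccinated group: all infections are counted as cases.
unvaccinated_rate = case_rate(infections, population)

# Vaccinated group under the new rule: only severe cases are counted.
vaccinated_rate_counted = case_rate(severe, population)

# Vaccinated group if the same rule applied to both groups.
vaccinated_rate_same_rule = case_rate(infections, population)

artifact = apparent_effectiveness(vaccinated_rate_counted, unvaccinated_rate)
same_rule = apparent_effectiveness(vaccinated_rate_same_rule, unvaccinated_rate)

print(f"Different counting rules: {artifact:.0%}, same counting rule: {same_rule:.0%}")
```

In this constructed case the mismatch in counting rules produces an apparent 90% reduction in case rate where, by design, there is none. A real analysis would need the same case definition in both groups before the comparison means anything.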
It's not just fun and games with counting. Don't forget: infected, hospitalized, or dead people who got sick before reaching the 14-day mark for full vaccination are also treated as not really infected, hospitalized or dead. When you're not counted, you don't count. The CDC turns a blind eye to them because they happened to catch the disease prior to reaching so-called full vaccination status. Why do their lives not matter? That's something I have been wondering about throughout the pandemic, and I still don't have a good answer.