There was no bigger news in 2020 than the pandemic. This once-in-a-lifetime (hopefully) event has overshadowed much of my blog writing this past year. I'm glad you've stuck around, and if you've been reading from the start, you'd have been ahead of the curve on all the key issues. Data have been central to navigating this new world, however messy, sparse, and incomplete they are. I'm confident you've picked up more about statistics on these pages than in an entire statistics course.
In this post, I've selected some of the posts that have aged particularly well. If you missed these before, be sure to read them now!
***
In mid March, as New York was poised to enter a lockdown, I predicted that this lockdown would have a different ending than those in China or South Korea.
The second type of lockdown - soon to be exemplified by the U.S. - is based on severely limited data. It is based on the assumption that everyone has been infected. The necessity of this kitchen-sink assumption is driven by the lack of knowledge. It is a position of weakness. The duration of the lockdown will be longer, the spread of infections will be wider, and the long-term costs will be higher. It is not the case that everyone has the coronavirus but we have to assume it because we are not testing nearly enough people.
Nine months later, the U.S. has not tamed the virus while the economy has suffered.
(1) https://junkcharts.typepad.com/numbersruleyourworld/2020/03/two-types-of-lockdowns-why-us-must-rethink-its-testing-policy.html
In March and April, when news outlets were one-upping each other with variations of log-scaled charts showing the "exponential" rise in Covid-19 cases, I pointed out that the growth curves were sub-exponential, which also means the log scale is inappropriate. Today, log-scaled charts have disappeared because they have been proven useless.
(2) https://junkcharts.typepad.com/numbersruleyourworld/2020/03/note-about-fitting-and-visualizing-exponential-models.html
(3) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/new-video-validating-data-science-models-a-case-study-with-covid-19-data.html
(4) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/another-note-on-fitting-exponential-models.html
(5) https://junkcharts.typepad.com/junk_charts/2020/06/presented-without-comment.html
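To see why a log scale presupposes exponential growth, here is a minimal sketch using made-up series (not real case counts, and not the method used in the posts linked above). Under exponential growth, the day-over-day difference in log counts is constant, so the curve is a straight line on a log scale. Under sub-exponential growth (a hypothetical cubic curve stands in for it here), that difference keeps shrinking, and the log-scaled curve bends over.

```python
import math

def log_growth_rates(counts):
    """Day-over-day differences of log(count); constant only under exponential growth."""
    return [math.log(b) - math.log(a) for a, b in zip(counts, counts[1:])]

days = range(1, 31)
# Hypothetical series, for illustration only.
exponential = [100 * math.exp(0.2 * t) for t in days]   # constant growth rate of 0.2 per day
sub_exponential = [100 * t ** 3 for t in days]           # polynomial (cubic) growth

exp_rates = log_growth_rates(exponential)
sub_rates = log_growth_rates(sub_exponential)

print(f"exponential: first {exp_rates[0]:.3f}, last {exp_rates[-1]:.3f}")
print(f"cubic:       first {sub_rates[0]:.3f}, last {sub_rates[-1]:.3f}")
```

A shrinking log growth rate is one simple diagnostic of sub-exponential growth; it is not a substitute for fitting and validating a model.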
In May, I was already drawing the connection between advocating antibody testing and pushing herd immunity:
In places that have fallen down on rolling out diagnostic testing (the U.S. and the U.K., for example), the officials have started talking up “antibody testing”... Antibody testing is a rear view mirror. If someone is found to have antibodies, it’s because the person has recovered from the disease. It follows that the person was infected, and so the person could have died. Antibody testing without diagnostic testing means death over life.
In the same post, I made the following alarming calculation:
With a 0.1% fatality rate, conservatively set at that of influenza, the U.S. would suffer over 200,000 deaths to get there. The path to herd immunity is paved with body bags.
The current U.S. death toll is over 300,000 and counting.
(6) https://junkcharts.typepad.com/numbersruleyourworld/2020/05/the-shift-to-antibody-testing-is-a-choice-of-death-over-life.html
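The arithmetic behind that figure can be sketched as follows. The 0.1% fatality rate comes from the quote above; the U.S. population of roughly 330 million and the herd-immunity thresholds are assumptions added here for illustration, since neither is stated in the excerpt.

```python
def deaths_to_reach_threshold(population, threshold, fatality_rate):
    """Deaths implied if a `threshold` share of the population is infected naturally."""
    return population * threshold * fatality_rate

US_POPULATION = 330_000_000   # assumption: rough 2020 figure, not from the post
FATALITY_RATE = 0.001         # the 0.1% rate quoted above

for threshold in (0.60, 0.70, 0.75):   # assumed herd-immunity thresholds
    deaths = deaths_to_reach_threshold(US_POPULATION, threshold, FATALITY_RATE)
    print(f"threshold {threshold:.0%}: about {deaths:,.0f} deaths")
```

Under these assumptions, any threshold above roughly 61 percent implies more than 200,000 deaths.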
Nevertheless, several countries (U.S., U.K., and Sweden, being the most prominent) continued to deny advancing herd-immunity policies, while most observers recognized them as such. In June, I pointed out that one cannot both believe the coronavirus is the common flu and support herd immunity through natural infections because influenza is seasonal, signifying that natural infection does not confer immunity.
(7) https://junkcharts.typepad.com/numbersruleyourworld/2020/06/coronavirus-has-not-gone-away-neither-have-these-covid-fallacies.html
In October, I objected to herd immunity as a public health objective as it rewards the wrong behavior. I warned against bystander behavior because "you have tied your health to the actions of people who hold opposite views as you." Herd immunity is mathematical immunity, not biological immunity.
(8) https://junkcharts.typepad.com/numbersruleyourworld/2020/10/i-dont-like-the-term-herd-immunity.html
At the end of October, I pronounced the death of the Swedish experiment. I have seen enough. If your soccer (football) team goes down 0-10 in the first 10 minutes of a match, what choice words do you have for the coach who is preaching patience, waiting for the comeback win? By mid-December, Sweden's King admitted that "we have failed."
(9) https://junkcharts.typepad.com/numbersruleyourworld/2020/10/the-swedish-mirage-the-verdict-is-already-written.html
At the time, I assumed 75 percent of the population would need to be vaccinated to attain herd immunity. Many experts placed the number much lower, about 40-60 percent (link). By the end of the year, Dr. Fauci was estimating a range of 70-85 percent.
In one of my very first posts about the coronavirus in early March, I predicted the thorny data quality issues that would bedevil analysts to this day:
We have very limited amounts of data (the known and suspected cases), incomplete data (under-reporting, possible hiding or manipulation), data of dubious quality (self-reporting, hastily assembled tests, unpreparedness), assumptions that later turn out to be false, the potential black-swan fallacy (failure to imagine what hasn't happened before), and non-stationary data (old data might not be wrong but become outdated as the virus evolves).
(10) https://junkcharts.typepad.com/numbersruleyourworld/2020/03/eight-unanswered-questions-about-the-coronavirus.html
In March, I already warned about "the elephant in the room: people are counted as infected only if they are tested", and asked how deaths are confirmed. It would take weeks of regurgitating official counts before the mainstream media acknowledged these shortcomings.
(11) https://junkcharts.typepad.com/numbersruleyourworld/2020/03/eight-unanswered-questions-about-the-coronavirus.html
In April, I explained the benefit of "excess deaths" while also noting why death statistics are subject to their own set of issues. It is here that I said all excess deaths are accelerated deaths. (I will have another new post about this point in early 2021.)
(12) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/excess-deaths-why-we-want-the-cake-and-why-we-cant-eat-it-yet.html
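In its simplest form, an excess-death count is observed deaths minus an expected baseline. The sketch below uses hypothetical weekly counts and takes the baseline to be the average of the same week in prior years. That baseline is an assumption chosen for brevity; real analyses also adjust for trend, seasonality, and reporting delays, which is part of why these statistics have their own issues.

```python
from statistics import mean

def excess_deaths(observed, prior_years):
    """Observed deaths minus a baseline: the mean of the same weeks in prior years."""
    baseline = [mean(week) for week in zip(*prior_years)]
    return [obs - base for obs, base in zip(observed, baseline)]

# Hypothetical weekly death counts, for illustration only.
prior_years = [
    [1000, 1020, 990, 1010],   # year A
    [980, 1000, 1010, 990],    # year B
    [1020, 980, 1000, 1000],   # year C
]
observed_2020 = [1010, 1150, 1400, 1300]

print(excess_deaths(observed_2020, prior_years))
```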
In May and then in December, while studying the data on excess deaths in Florida, I created the following graphic showing the scale.
(13) https://junkcharts.typepad.com/junk_charts/2020/05/how-covid-19-deaths-sneaked-into-floridas-statistics.html
(14) https://junkcharts.typepad.com/numbersruleyourworld/2020/12/its-time-to-visit-florida-again-graphically.html
In May, when Dr. Birx the politician complained that CDC data was terrible, I responded thus:
The members of the Task Force can immediately help make the data better if they cared about the quality of the data. When they aren't pushing diagnostic testing, while trashing the data, their real perceived enemy is not bad data but data.
(15) https://junkcharts.typepad.com/numbersruleyourworld/2020/05/their-real-enemy-is-not-bad-data-but-data-period.html
By June, the media had gone through the five stages of grief:
In the short history of the Covid pandemic, people started with case statistics. Then, they claimed that death statistics are less manipulated than case statistics, when they learned about testing. Then, they claimed that testing statistics are less manipulated, until they realized governments determined who got tested. Some governments counted tests shipped, not test results. Then, they claimed hospitalization statistics are less manipulated, until they learned that hospitals sent sick patients to nursing homes.
(16) https://junkcharts.typepad.com/numbersruleyourworld/2020/06/coronavirus-has-not-gone-away-neither-have-these-covid-fallacies.html
The antibody testing lobby has a particular disregard for data quality. New York's Governor Cuomo, who, I heard, won an Emmy for his press conferences, claimed for weeks that 20 percent of New Yorkers had already been infected by April, citing "randomized" antibody testing at supermarkets (during lockdown). To this day, the only "datum" provided by the state's health department, which conducted these tests, is this FAQ. (Click with no expectation!)
Around the same time, an Oxford research team made headlines, claiming 40 percent of the U.K. population were already infected based on antibody testing. This finding was laughable at the time, and embarrassing given the latest surge in the U.K. In April, I wrote a series of six posts about the Oxford study that explains how scientists build models based on sparse data to make such predictions.
(17) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/a-primer-to-understanding-statistical-models-such-as-the-oxford-coronavirus-study-1.html
(18) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/a-primer-to-understanding-statistical-models-such-as-the-oxford-coronavirus-study-2.html
(19) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/a-primer-to-understanding-statistical-models-such-as-the-oxford-coronavirus-study-3.html
(20) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/a-primer-to-understanding-statistical-models-such-as-the-oxford-coronavirus-study-4.html
(21) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/a-primer-to-understanding-statistical-models-such-as-the-oxford-coronavirus-study-5.html
(22) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/a-primer-to-understanding-statistical-models-such-as-the-oxford-coronavirus-study-6.html
The lead investigator of the Oxford study turns out to be a co-author of the Great Barrington Declaration, which pushes the amoral and dubious theory of herd immunity through natural infections.
Another author of that Declaration is a Stanford professor who is infamous for his own role in the debunked study of antibody testing in Santa Clara, California.
(23) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/the-percent-with-antibodies-could-be-3-percent-or-it-could-be-zero-thats-what-the-stanford-study-rea.html
The first draft of the Stanford study neglected the enormous false-positive problem associated with testing rare events, a topic which I covered in depth in Chapter 4 of Numbers Rule Your World (link).
(24) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/to-justify-zero-false-positive-assumption-stanford-study-needs-a-reference-sample-10-times-bigger.html
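The false-positive problem for rare events follows from Bayes' rule. The sketch below uses hypothetical sensitivity and specificity values (not the figures from the Stanford study) to show that a test with a small false-positive rate still returns positives when nobody is infected, and that at low prevalence most positives can be false.

```python
def apparent_positive_rate(prevalence, sensitivity, specificity):
    """Share of all tests that come back positive, whether true or false."""
    return prevalence * sensitivity + (1 - prevalence) * (1 - specificity)

def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive results that are true positives (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# Hypothetical test characteristics, for illustration only.
SENSITIVITY = 0.80
SPECIFICITY = 0.985   # i.e. a 1.5% false-positive rate

for prevalence in (0.0, 0.01, 0.03):
    rate = apparent_positive_rate(prevalence, SENSITIVITY, SPECIFICITY)
    ppv = positive_predictive_value(prevalence, SENSITIVITY, SPECIFICITY)
    print(f"true prevalence {prevalence:.0%}: {rate:.2%} test positive, "
          f"{ppv:.0%} of positives are real")
```

With these assumed inputs, a raw positive rate of a few percent is compatible with a true prevalence anywhere from zero to a few percent, which is why the assumed false-positive rate matters so much.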
The corrected manuscript revealed statistical issues with estimating test accuracy when developing new tests, and failed to quiet the skeptics.
(25) https://junkcharts.typepad.com/numbersruleyourworld/2020/05/the-fda-looks-past-these-statistical-issues-but-maybe-you-shoudnt.html
Most importantly, the antibody testing lobby lost credibility in the face of real-world evidence. They predicted a high level of immunity from natural infections, which would prevent a second wave. Everywhere they made that prediction - Sweden, U.K., California - reality bit back, hard.
Meanwhile, mainstream epidemiology has triumphed. Certain colleges re-opened in the fall, implementing strict test, trace and isolate protocols that were swatted aside by the U.S. government. I wrote about the successes at Cornell and Georgia Tech.
(26) https://junkcharts.typepad.com/numbersruleyourworld/2020/09/good-news-from-cornell.html
(27) https://junkcharts.typepad.com/numbersruleyourworld/2020/08/nice-explainer-from-georgia-tech-about-their-re-opening-model.html
Epidemiologists also predicted that a rise in deaths would follow a rise in cases. This lag is frequently ignored by journalists who are professionally trained to think only in the present, and not one or two weeks ahead, and by politicians who say black is white. In a series of posts, I showed how to see the time delay between cases and deaths, in Lombardia (Italy), Texas, Illinois and Iowa.
(28) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/how-to-act-like-a-data-scientist-9-quantifying-our-intuition.html
(29) https://junkcharts.typepad.com/numbersruleyourworld/2020/06/how-to-act-like-a-data-scientist-12-say-no-to-the-daily-grind.html
(30) https://junkcharts.typepad.com/numbersruleyourworld/2020/07/its-high-time-to-call-out-those-who-got-it-right.html
(31) https://junkcharts.typepad.com/numbersruleyourworld/2020/12/the-importance-of-matching-data.html
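One generic way to look for such a delay is to shift the case series by different numbers of days and see which shift lines up best with the death series. The sketch below is not the approach taken in the posts above; it uses synthetic data in which deaths are, by construction, 2% of the cases reported 14 days earlier (both numbers are assumptions for illustration).

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def best_lag(cases, deaths, max_lag):
    """Lag (in days) at which shifted cases correlate most strongly with deaths."""
    scores = {}
    for lag in range(max_lag + 1):
        # Pair cases on day t with deaths on day t + lag.
        scores[lag] = pearson(cases[: len(cases) - lag], deaths[lag:])
    return max(scores, key=scores.get)

# Synthetic data: one wave of cases, and deaths equal to 2% of cases 14 days earlier.
cases = [1000 + 900 * max(0, 1 - abs(t - 30) / 20) for t in range(90)]
deaths = [0.02 * cases[max(t - 14, 0)] for t in range(90)]

print(best_lag(cases, deaths, max_lag=28))
```

Real series are noisier, and the delay between a reported case and a reported death varies by place and by reporting practice, so a single best lag is only a rough summary.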
As the year ended, we received good news from several vaccine trials. In July, I complained that Moderna provided almost no useful information in its submission to ClinicalTrials.gov, the supposed center of open science.
(32) https://junkcharts.typepad.com/numbersruleyourworld/2020/07/were-told-almost-nothing-about-the-moderna-vaccine-trial.html
By September, Moderna was pressured to release detailed protocols. Other vaccine developers followed. Once these protocols were released, I computed that the earliest approval date for vaccines would be mid-January 2021. At the time, the media reiterated unrealistic timelines suggesting results by Election Day (early November).
(33) https://junkcharts.typepad.com/numbersruleyourworld/2020/09/notes-on-the-moderna-vaccine-trial-protocol-1.html
(34) https://junkcharts.typepad.com/numbersruleyourworld/2020/10/political-headline-decrying-politics.html
If the pharmas had adhered to those timelines - the Pfizer CEO was pretending they could - the trial participants would have been observed for only one month, which would at best prove vaccine efficacy for one month. Such a possibility was squashed when the FDA "strengthened" guidelines around the observation window.
(35) https://junkcharts.typepad.com/numbersruleyourworld/2020/09/behind-the-proposed-strengthening-of-fda-vaccine-approval-rules.html
This guideline was not as strong as it could be. I proposed that all participants - instead of the median participant - be observed for at least two months. A week later, a group of prominent statisticians sent a letter to the FDA, making this same point.
When reviewing the trial results, I noted that neither Moderna nor Pfizer met that FDA guideline for the median observation window. This is one of numerous tradeoffs made to rush the approval process.
(36) https://junkcharts.typepad.com/numbersruleyourworld/2020/12/the-worst-kept-secret-will-be-revealed-today.html
(37) https://junkcharts.typepad.com/numbersruleyourworld/2020/12/the-second-worst-kept-secret-of-the-pandemic-year-and-some-decimal-mischief.html
The metric of vaccine efficacy is frequently misinterpreted. I wrote about VP Mike Pence's tweet, claiming that 90% VE means the vaccine prevents infections in 90% of volunteers. Not even close. Most of the trial participants have not yet been exposed to the virus so they would not get infected whether they got the vaccine shot or the placebo.
(38) https://junkcharts.typepad.com/numbersruleyourworld/2020/10/cutting-the-infection-rate-by-half-is-not-the-same-as-protecting-half-the-people-from-infection.html
(39) https://junkcharts.typepad.com/numbersruleyourworld/2020/11/watching-statistical-gravity-in-action-elections-and-vaccines.html
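Vaccine efficacy is a relative measure: one minus the ratio of the infection rate in the vaccine arm to the infection rate in the placebo arm. The sketch below uses a hypothetical trial (the counts are invented, not taken from any actual trial) to show that a 90% VE can coexist with the vast majority of placebo participants never being infected during the observation window.

```python
def vaccine_efficacy(cases_vaccine, n_vaccine, cases_placebo, n_placebo):
    """VE = 1 - (attack rate in vaccine arm) / (attack rate in placebo arm)."""
    rate_vaccine = cases_vaccine / n_vaccine
    rate_placebo = cases_placebo / n_placebo
    return 1 - rate_vaccine / rate_placebo

# Hypothetical trial, for illustration only (not actual trial counts).
N_PER_ARM = 15_000
CASES_PLACEBO = 100
CASES_VACCINE = 10

ve = vaccine_efficacy(CASES_VACCINE, N_PER_ARM, CASES_PLACEBO, N_PER_ARM)
uninfected_placebo = 1 - CASES_PLACEBO / N_PER_ARM

print(f"vaccine efficacy: {ve:.0%}")
print(f"placebo participants with no recorded infection: {uninfected_placebo:.1%}")
```

In this made-up trial, the vaccine cuts the infection rate by 90 percent, yet more than 99 percent of the placebo arm also went uninfected, so VE says nothing like "protected 90% of volunteers".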
These vaccine trial results demonstrate a key tenet of designing experiments: decisions have consequences.
Neither Pfizer nor Moderna has proven vaccine efficacy for two months. Their results are based on fewer than half the participants reaching the two-month mark. If the FDA rule had required all participants to reach two months of observation, then yes.
(40) https://junkcharts.typepad.com/numbersruleyourworld/2020/12/unexpected-sightings-and-unknowns-from-the-pfizer-result.html
We cannot conclude that one dose gives partial protection for two months because no one tested single doses. Anyone suggesting otherwise makes the assumption that the second dose has no value, creating a tautology, saying the same thing twice.
(41) https://junkcharts.typepad.com/numbersruleyourworld/2020/12/one-dose-pfizer-is-not-happening-and-heres-why.html
Any analysis of subgroups (race, age, severe cases, etc.) should be ignored. The trials were designed to minimize total sample size so as to achieve warp speed; the casualty of this design is subgroup analysis.
(42) https://junkcharts.typepad.com/numbersruleyourworld/2020/12/the-worst-kept-secret-will-be-revealed-today.html
Throughout this pandemic, the U.S. government, apparently supported by CDC experts, has ignored the threat of asymptomatic spread. The policy of triage testing discourages people without symptoms from getting tested. The design of the vaccine trials - with case definitions based on self-reported symptoms - produces no knowledge about asymptomatic transmission. In March, I wrote a column in Wired to call for broad-based testing.
(43) https://www.wired.com/story/the-problem-with-trumps-triage-testing/
(44) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/when-will-we-get-more-and-broader-testing.html
The clinical trials provide higher-quality data compared to research studies that use observational data. I outlined the many problems of observational studies that were thrown together in haste, and published as preprints. These include a Yale study of sewage, a Harvard study of parking lot images and search engine traffic, and a King's College (UK) study of mobile tracking app data.
(45) https://junkcharts.typepad.com/numbersruleyourworld/2020/05/yale-meds-please-meet-yale-stats.html
https://junkcharts.typepad.com/numbersruleyourworld/2020/06/a-glimpse-into-surveillance-data-via-the-harvard-study-of-parking-lots.html
(46) https://www.wired.com/story/beware-the-lofty-promises-of-covid-19-tracker-apps/
(47) https://junkcharts.typepad.com/numbersruleyourworld/2020/05/it-begins-with-the-data.html
(48) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/pre-processing-data-to-avoid-putting-your-foot-in-that-puddle.html
All these examples show that a model that explains the past well does not necessarily predict the future.
(49) https://junkcharts.typepad.com/numbersruleyourworld/2020/05/prediction-versus-explanation-simple-to-state-yet-hard-to-grasp.html
Regarding the Covid-19 symptom tracking app, I explained how the analytical methodology for estimating population prevalence can justify any level of prevalence. That was back in April. We have barely heard a peep about Covid apps since then.
(50) https://junkcharts.typepad.com/numbersruleyourworld/2020/04/the-first-major-study-using-covid-tracking-app-data-stumbles-out-of-the-gate.html
Between blogging, I was a guest on the CSAI podcast with John Reid and Rusen Aktas:
(51) https://anchor.fm/csai-podcast/episodes/CSAI-2--Covid-19-and-Data-Science-ege3si
More recently, on Ryan Ray's podcast, I talked about how I wrote my books:
(52) https://junkcharts.typepad.com/junk_charts/2020/11/podcast-highlights.html
On the sister blog (Junk Charts), I featured several nice data visualizations. SCMP published this beautiful picture of contact tracing efforts in Hong Kong:
(53) https://junkcharts.typepad.com/junk_charts/2020/09/unlocking-the-secrets-of-a-marvellous-data-visualization.html
The New York Times visualized excess deaths in the U.S.:
(54) https://junkcharts.typepad.com/junk_charts/2020/08/deaths-as-percent-neither-of-cases-nor-of-population-deaths-as-percent-of-normal.html
The New York Times traced the exodus of the rich from NYC:
(55) https://junkcharts.typepad.com/junk_charts/2020/06/designs-of-two-variables-map-dot-plot-line-chart-table.html
The Visual Capitalist depicted how we changed our consumption habits:
(56) https://junkcharts.typepad.com/junk_charts/2020/05/consumption-patterns-during-the-pandemic.html
***
I wrote 212 posts across the two blogs in 2020. Since I started, people have read posts more than 4 million times. Thank you for reading, come back often, dig into the archives, and recommend me to your friends and colleagues. See you in 2021!
Kaiser
[1-1-2021: Added an index to the links, and corrected a link error.]