The UK has joined California, Texas, etc. in the list of organizations that couldn't keep their testing data straight. It just admitted that over the last week or so, it under-counted Covid-19 cases by 30 percent.
(Graphic from BBC.)
The official reason is an Excel limitation, which is a brilliant stroke of PR. (I have previously praised the PR firm behind the UK's coronavirus response. See here.) This message is intended for those who do not use Excel, which, I guess, is the majority of the UK population. Among those who have been exposed to Excel, there is a set of vocal dissenters, people who hyperventilate at the mention of Microsoft's software, and they will help propagate the message.
Anyone who uses Excel a lot will not fall for this.
Unlike California and Texas, where the officials remain mum about the details of their supposed faux pas, the UK government explained to the Daily Mail what this Excel error is. We are to believe that there is a master spreadsheet recording all cases, that the spreadsheet ran out of space, that nobody noticed until a backlog of 16,000 cases had stacked up, and that they fixed the issue by splitting the data across multiple spreadsheets.
One word: implausible.
***
The limit on the number of rows in a single Excel worksheet is just over 1 million (1,048,576 in the current .xlsx format). The cumulative number of cases in the UK as of October 2 was fewer than 500,000. Assuming that they collect data for each individual case, so that there is one row of data per case, the spreadsheet should be more than half empty.
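To make the arithmetic concrete, here is a minimal sketch of the capacity check. It assumes one row per case and the .xlsx row limit; the 500,000 figure is the upper bound quoted above, and the function name is mine, for illustration only.

```python
# Row limit of a single worksheet in the .xlsx format (2**20 rows).
EXCEL_MAX_ROWS = 1_048_576

# Upper bound on cumulative UK cases as of October 2, per the text.
cumulative_cases = 500_000


def fill_fraction(rows_used: int, max_rows: int = EXCEL_MAX_ROWS) -> float:
    """Share of the sheet occupied, assuming exactly one row per case."""
    return rows_used / max_rows


used = fill_fraction(cumulative_cases)
print(f"Sheet used:  {used:.1%}")
print(f"Sheet empty: {1 - used:.1%}")
```

Even at the upper bound, less than half of the sheet would be in use.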
I'm not convinced that's the workflow. It would seem more likely that there is one line per reporting unit, not one line per individual case. If that is so, then the spreadsheet is even smaller.
Let's play along and assume somehow someone created a process filled with duplicates and errors or other horrors such that the spreadsheet manages to overrun 1 million rows.
We now have to believe that the Excel spreadsheet is maintained by a machine without a single human opening it. If a human is transferring the data onto this master spreadsheet, as I expect to be the scenario, how is it that this live person is not alert to the space outage? One simply cannot add any more rows to the bottom of the sheet!
If a machine is programmed to add rows to this spreadsheet, we now have to believe that neither Excel nor the program returns any error messages or warnings when the spreadsheet runs out of room!
Let's keep playing along. Against all odds, the spreadsheet is somehow completely full and nothing more can be added. Neither humans nor machines notice anything odd. The new data simply falls into a black hole.
Now, how do we explain that the case counts kept increasing each day for a week before they noticed the missing cases?
From September 25 to October 2, the period in question, the UK reported on average 4,368 cases per day before admitting the mistake. The revised data bumped the daily average to 6,348 cases, which means the public was given a number about 30 percent below the real tally.
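The 30 percent figure can be checked in a few lines. This is only a sketch of the arithmetic: the two daily averages come from the paragraph above, while counting the period as eight days (September 25 through October 2, inclusive) is my assumption.

```python
# Daily averages for the period, as quoted in the text.
reported_daily_avg = 4_368
revised_daily_avg = 6_348

# Assumption: September 25 through October 2 inclusive is 8 days.
days = 8

# Cases missing from each day's published number, on average.
missing_per_day = revised_daily_avg - reported_daily_avg

# Shortfall expressed as a share of the revised (real) tally.
undercount_share = missing_per_day / revised_daily_avg

# Total backlog implied by the two averages.
implied_backlog = missing_per_day * days

print(f"Missing per day:  {missing_per_day}")
print(f"Undercount share: {undercount_share:.1%}")
print(f"Implied backlog:  {implied_backlog}")
```

Under the eight-day assumption, the implied backlog lands close to the 16,000 cases cited by the Daily Mail, so the two sets of numbers are at least consistent with each other.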
According to the Daily Mail, "some 16,000 confirmed infections had to be added to the daily totals running back more than a week." If we buy the Excel story, what should have happened is that one day, the spreadsheet ran out of space, and from that point on, the case count plateaued with no more additions.
What should have happened is suddenly no more cases are recorded, and the error is immediately obvious.
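A toy simulation shows the pattern a full spreadsheet would produce. Everything here is hypothetical: the daily counts are made up, the function is my own illustration, and the assumption is that rows beyond the cap are silently dropped, which is the story we are asked to believe.

```python
def recorded_per_day(true_daily, capacity, rows_already_used=0):
    """Return the daily counts that actually fit on a sheet with a hard row cap.

    Rows that do not fit are silently dropped (the assumed failure mode).
    """
    recorded = []
    used = rows_already_used
    for n in true_daily:
        room = max(capacity - used, 0)  # rows still free on the sheet
        kept = min(n, room)             # only this many new rows fit
        recorded.append(kept)
        used += kept
    return recorded


# Made-up, rising daily case counts (not real data).
true_daily = [5000, 5500, 6000, 6500, 7000, 7500]

capacity = 1_048_576                    # .xlsx row limit
rows_already_used = capacity - 12_000   # hypothetical: 12,000 rows left

observed = recorded_per_day(true_daily, capacity, rows_already_used)
print(observed)
```

In this sketch the recorded counts look normal, then one day is cut short, then every day after that is zero. That is a cliff, not a week of steadily rising numbers that each fall roughly 30 percent short.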
P.S. Let's keep playing. Suppose they keep one gigantic spreadsheet not just of positive cases but of all test results, from which they extract the cases. As of October 2, the U.K. said they have "processed" 22.7 million PCR tests. Remember that the limit of a single spreadsheet is about 1 million rows, which would have been passed a long time ago.
***
Dr. Backlog sure chose a curious time to jump the pond. The undercounting occurs right as cases were spiking in the U.K., France, Spain, and all over Europe, and governments had to make the difficult decision of whether or not to re-impose restrictions to ward off a second wave.
Yes, I will be the first to admit the data world is messy, and there can be innocent explanations for data discrepancies. However, any experienced data analyst realizes one cannot lazily cling to the most convenient explanation; all possible explanations must be investigated, including politically incorrect ones.