If you're a data analyst reading this blog, take a look at the following data table, and tell me what annoys you most about it.
I came across this table while researching mortality statistics for the U.S. The CDC has a labyrinth of a website, which apparently tricks all the search engines, so it took me a long time to land on this page. It supposedly contains weekly mortality numbers for the State of Florida; a "threshold", which is the upper bound of a projection from a model; and excess deaths, defined as the difference between the observed deaths and the projected deaths.
(If you want a hint to find issue #1, start with how this table is supposedly sorted.)
For extra credit, this next section of the table reveals more problems:
***
For people managing data analysts, this is why even simple things may take forever! Most data come with traps and potholes that must be dealt with before anything can get done. This example is structured data (in tabular form). If the data come as "unstructured", the issues multiply.
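To make that concrete, here is a minimal sketch in Python of the kind of sanity checks I would run before trusting a table like this one. The rows and column names below are hypothetical, invented for illustration only (they are not the CDC's figures or field names), and the excess check assumes the definition given above: observed deaths minus projected deaths.

```python
from datetime import date

# Hypothetical rows, invented purely for illustration (not CDC figures).
rows = [
    {"week_ending": date(2020, 4, 11), "state": "Florida",
     "observed": 4300, "expected": 4100, "excess": 200},
    {"week_ending": date(2020, 4, 4), "state": "Florida",
     "observed": 4250, "expected": 4100, "excess": 134},
    {"week_ending": date(2020, 4, 18), "state": "Florida",
     "observed": 4150, "expected": 4100, "excess": 50},
]


def constant_columns(rows):
    """Return the names of columns whose value never changes across rows."""
    return sorted(k for k in rows[0] if len({r[k] for r in rows}) == 1)


def is_sorted_desc(rows, key):
    """Check whether the rows are in descending order of the given column."""
    vals = [r[key] for r in rows]
    return all(a >= b for a, b in zip(vals, vals[1:]))


def excess_mismatches(rows):
    """Return indexes of rows where reported excess != observed - expected."""
    return [i for i, r in enumerate(rows)
            if r["excess"] != r["observed"] - r["expected"]]


print("Constant columns:", constant_columns(rows))
print("Sorted by date, descending:", is_sorted_desc(rows, "week_ending"))
print("Rows with inconsistent excess:", excess_mismatches(rows))
```

None of these checks is sophisticated. The point is that each one has to be thought of, written, and run before the real analysis can start.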
I don't see why it's that hard to analyze these data.
The answer is clearly 134.
Posted by: Matt VE | 05/06/2020 at 03:37 PM
MVE: lol, I started figuring out the significance of 134 but thought there was a better use of my time. I did enough to know that it's a different number for each state.
Posted by: Kaiser | 05/06/2020 at 05:11 PM
If (Percent excess < 42 = 0; ANDIF (Total number of excess deaths; = 134))
Posted by: GIGO | 05/07/2020 at 01:51 PM
The date field is clearly a formatting mess. The number of excess deaths looks wrong. That might be a calculation issue. Just some thoughts off the top of my head. Funny that regardless of where you get data, it always has issues.
Posted by: Pat Valente | 05/07/2020 at 03:57 PM
PV: The date field is the first thing I saw. The arrow seemed to suggest it is sorted in descending order. It's sorted left to right, sort of. The next three columns are useless, constant throughout. Number of excess is also wrong, as you and others noticed. But what about percent excess? That also seems messed up, especially if you look at the second table, where apparently a positive value of percent excess does not translate to a change in number of excess.
Posted by: Kaiser | 05/07/2020 at 11:11 PM