This exercise plan for your lock-down work-out is inspired by Venn
Apr 23, 2020
A twitter follower did not appreciate this chart from Nature showing the collection of flu-like symptoms that people reported they have to an UK tracking app.
It's a super-complicated Venn diagram. I have written about this type of chart before (see here); it appears to be somewhat popular in the medicine/biology field.
A Venn diagram is not a data visualization because it doesn't plot the data.
Notice that the different compartments of the Venn diagram do not have data encoded in the areas.
The chart also fails the self-sufficiency test because if you remove the data from it, you end up with a data container - like a world map showing country boundaries and no data.
If you're new here: if a graphic requires the entire dataset to be printed on it for comprehension, then the visual elements of the graphic are not doing any work. The graphic cannot stand on its own.
When the Venn diagram gets complicated, teeming with many compartments, there will be quite a few empty compartments. If I have to make this chart, I'd be nervous about leaving out a number or two by accident. An empty cell can be truly empty or an oversight.
Another trap is that the total doesn't add up. The numbers on this graphic add to 1,764 whereas the study population in the preprint was 1,702. Interestingly, this diagram doesn't show up in the research paper. Given how they winnowed down the study population from all the app downloads, I'm sure there is an innocent explanation as to why those two numbers don't match.
***
The chart also strains the reader. Take the number 18, right in the middle. What combination of symptoms did these 18 people experience? You have to figure out the layers sitting beneath the number. You see dark blue, light blue, orange. If you blink, you might miss the gray at the bottom. Then you have to flip your eyes up to the legend to map these colors to diarrhoea, shortness of breath, anosmia, and fatigue. Oops, I missed the yellow, which is the cough. To be sure, you look at the remaining categories to see where they stand - I've named all of them except fever. The number 18 lies outside fever so this compartment represents everything except fever.
What's even sadder is there is not much gain from having done it once. Try to interpret the number 50 now. Maybe I'm just slow but it doesn't get better the second or third time around. This graphic not only requires work but painstaking work!
Perhaps a more likely question is how many people who had a loss of smell also had fever. Now it's pretty easy to locate the part of the dark gray oval that overlaps with the orange oval. But now, I have to add all those numbers, 69+17+23+50+17+46 = 222. That's not enough. Next, I must find the total of all the numbers inside the orange oval, which is 222 plus what is inside the orange and outside the dark gray. That turns out to be 829. So among those who had lost smell, the proportion who also had fever is 222/(222+829) = 21 percent.
How many people had three or more symptoms? I'll let you figure this one out!