Trying too hard

Today, I return to the life expectancy graphic that Antonio submitted. In a previous post, I looked at the bumps chart. The centerpiece of that graphic is the following complicated bar chart.

Aburto_covid_lifeexpectancy

Let's start with the dual axes. On the left, age; on the right, year of birth. I actually like this type of dual axes: the two present two versions of the same scale, so the dual axes exist without distortion. It just allows the reader to pick which scale they want to use.

It baffles me that each bar spans ages ending in 2.5 or 7.5 (for example, 72.5 to 77.5), so that the round numbers - multiples of 5 or 10 - sit in the middle of each bar rather than at its edges.

Reading the rest of the chart is like untangling some balled-up wires. The author has created a statistical model that attributes changes in male life expectancy to causes of death, in such a way that you can take the difference in life expectancy between two time points and do a kind of waterfall analysis: each cause of death either adds to or subtracts from the prior life expectancy, with the sum of these additions and subtractions leading to the end-of-period life expectancy.
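
To make the waterfall logic concrete, here is a minimal sketch in code, using made-up numbers rather than anything read off the chart:

```python
# A made-up example of the waterfall logic: each cause of death
# contributes a change (in years) to life expectancy, and the changes
# sum to the difference between the two time points.
le_start = 78.0  # life expectancy at the start of the period (hypothetical)

contributions = {
    "cancer": +0.3,     # proportionally fewer cancer deaths add to life expectancy
    "CVD": +0.2,
    "COVID-19": -1.1,   # a new cause of death subtracts from it
    "other": -0.1,
}

le_end = le_start + sum(contributions.values())
print(f"End-of-period life expectancy: {le_end:.1f} years")  # 77.3
```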

The model is complicated enough, and the chart doesn't make it any easier.

The bars are rooted at the zero value. The horizontal axis plots the addition or subtraction to life expectancy, so zero represents no change during the period. Zero does not mean the cause of death (e.g. cancer) does not contribute to life expectancy; it just means the contribution remains the same.

The changes to life expectancy are shown in units of months. I'd prefer to see units of years because life expectancy is almost always given in years. Using years turns 2.5 months into 0.2 years, which is a fraction, but it allows me to see the impact on the reported life expectancy without having to do a month-to-year conversion.

The chart highlights seven causes of death with seven different colors, plus gray for others.

What really does a number on readers is the shading, which adds another layer on top of the hues. Each color comes in two shadings, referencing two periods of time. The shaded bar segments concern changes between 2010 and "2019" while the unshaded segments concern changes between "2019" and 2020. The two periods are chosen to highlight the impact of COVID-19 (the red-orange color), which did not exist before "2019".

Let's zoom in on one of the rows of data - the 72.5 to 77.5 age group.

Screen Shot 2022-09-14 at 1.06.59 PM

COVID-19 (red-orange) has a negative impact on life expectancy and that's the easy one to see. That's because COVID-19's contribution as a cause of death is exactly zero prior to "2019". Thus, the change in life expectancy is a change from zero. This is not how we can interpret any of the other colors.

Next, we look at cancer (blue). Since this bar segment sits on the right side of zero, cancer has contributed positively to the change in life expectancy between 2010 and 2020. Practically, that means proportionally fewer people have died from cancer. Since the lengths of these bar segments encode changes in life expectancy, not absolute values, longer bars do not necessarily indicate more deaths.

Now the blue segment is actually divided into two parts, one shaded, one not. The not-shaded part covers the period "2019" to 2020, the first year of the COVID-19 pandemic. The shaded part covers the period 2010 to "2019". It is a much wider span, but it also contains 9 years of changes versus "1 year", so it's hard to tell whether the single-year change is significantly different from the average single-year change of the prior 9 years. (I'm using these quotes because I don't know whether they split the year 2019 in the middle, since COVID-19 didn't show up till the end of that year.)

Next, we look at the yellow-brown color corresponding to CVD. The key feature is that this block is split into two parts, one positive, one negative. Prior to "2019", CVD had been contributing positively to life expectancy changes; after "2019", it contributed negatively. This observation raises some questions: why would CVD behave differently with the arrival of the pandemic? Are there data problems?

***

A small multiples design - splitting the period into two charts - may help here. To make those two charts comparable, I'd suggest annualizing the data so that the 9-year numbers represent the average annual values instead of the cumulative values.
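
Here is the annualization I have in mind, sketched with a made-up contribution value:

```python
# Convert the cumulative 9-year contribution into an average annual
# value so it can sit next to the single-year figure. The numbers
# below are made up, in months of life expectancy.
change_2010_2019 = 4.5    # cumulative change over 9 years
change_2019_2020 = -2.0   # change over 1 year

avg_annual_2010_2019 = change_2010_2019 / 9   # 0.5 months per year
avg_annual_2019_2020 = change_2019_2020 / 1   # -2.0 months per year

print(avg_annual_2010_2019, avg_annual_2019_2020)
```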


Speedometer charts: love or hate

Pie chart hate is tired. In this post, I explain my speedometer hate. (Speedometers are also called gauges or dials.)

Next to pie charts, speedometers are perhaps the second most beloved chart species found on business dashboards. Here is a typical example:

Speedometers_example

For this post, I found one on Reuters about natural gas in Europe. (Thanks to long-time contributor Antonio R. for the tip.)

Eugas_speedometer

The reason for my dislike is the inefficiency of this chart form. In classic Tufte-speak, the speedometer chart has a very poor data-to-ink ratio. The entire chart above contains just one datum (73%). Most of the ink is spilled on non-data things.

This single number has a large entourage:

- the curved axis
- ticks on the axis
- labels on the scale
- the dial
- the color segments
- the reference level "EU target"

These are not mere decorations. Taking these elements away makes it harder to understand what's on the chart.

Here is the chart without the curved axis:

Redo_eugas_noaxis

Here is the chart without axis labels:

Redo_eugas_noaxislabels

Here is the chart without ticks:

Redo_eugas_notickmarks

Even with the tick marks gone, the chart still functions as long as the tick labels are present.

Here is the chart without the dial:

Redo_eugas_nodial

The datum is redundantly encoded in the color segments of the "axis".

Here is the chart without the dial or the color segments:

Redo_eugas_nodialnosegments

If you find yourself stealing a peek at the chart title below, you're not alone.

All versions except one increase our cognitive load. This means the entourage is largely necessary if one encodes the single number in a speedometer chart.

The problem with the entourage is that readers may resort to reading the text rather than the chart.

***

The following is a minimalist version of the Reuters chart:

Redo_eugas_onedial

I removed the axis labels and the color segments. The number 73% is shown using the dial angle.

The next chart adds back the secondary message about the EU target, as an axis label, and uses color segments to show the 73% number.

Redo_eugas_nodialjustsegments

As with pie charts, there are limited situations in which speedometer charts are acceptable. But most of the ones we see out there are just not right.

***

One acceptable situation is to illustrate percentages or proportions, which is what the EU gas chart does. Of course, in that situation, one can also use a pie chart without shame.

For illustrating proportions, I prefer to use a full semicircle, instead of a circular sector of arbitrary angle as Reuters did. The semicircle lends itself to easy marks at 25%, 50%, 75%, etc., eliminating the need to print those tick labels.
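
For those who want to tinker, here is a rough matplotlib sketch of that semicircle idea - my own approximation, not the Reuters chart:

```python
import numpy as np
import matplotlib.pyplot as plt

value = 0.73  # the single datum: EU gas storage level

fig, ax = plt.subplots(figsize=(5, 3))

# The semicircular "axis": 0% at the left (180 degrees), 100% at the right (0 degrees).
theta = np.linspace(np.pi, 0, 100)
ax.plot(np.cos(theta), np.sin(theta), color="gray", lw=2)

# Quarter marks fall at obvious angles, so no tick labels are needed.
for frac in (0, 0.25, 0.5, 0.75, 1.0):
    a = np.pi * (1 - frac)
    ax.plot([0.95 * np.cos(a), np.cos(a)], [0.95 * np.sin(a), np.sin(a)],
            color="gray", lw=2)

# The dial encodes the datum as an angle.
a = np.pi * (1 - value)
ax.plot([0, 0.85 * np.cos(a)], [0, 0.85 * np.sin(a)], color="black", lw=3)
ax.text(0, 0.35, "73%", ha="center", fontsize=14)

ax.set_aspect("equal")
ax.axis("off")
plt.show()
```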

***

One use case to avoid is numeric data.

Take the regional sales chart from above, pulled randomly from a Web search:

Speedometers_example

These charts are completely useless without the axis labels.

Besides, because the span of the axis isn't 0% to 100%, every tick mark must be labelled with the numeric value. That's a lot of extra ink used to display a single value!


The what of visualization, beyond the how

A long-time reader sent me the following chart from a Nature article, pointing out that it is rather worthless.

Nautre_scihub

The simple bar chart plots the number of downloads, organized by country, from the website called Sci-Hub, which I've just learned is where one can download scientific articles for free - working around the exorbitant paywalls of scientific journals.

The bar chart is a good example of a Type D chart (Trifecta Checkup). There is nothing wrong with the purpose or visual design of the chart. Nevertheless, the chart paints a misleading picture. The Nature article addresses several shortcomings of the data.

The first - and perhaps most significant - problem is that many Sci-Hub users are expected to access the site via VPN servers that hide their true countries of origin. If the proportion of VPN users is high, the entire dataset is called into question. The data would contain both false positives (in countries hosting VPN servers) and false negatives (in countries with high numbers of VPN users).

The second problem is seasonality. The dataset covered only one month. Many users are expected to be academics, and in the southern hemisphere, schools are on summer vacation in January and February. Thus, the data from those regions may convey the wrong picture.

Another problem, according to the Nature article, is that Sci-Hub has many competitors. "The figures include only downloads from original Sci-Hub websites, not any replica or ‘mirror’ site, which can have high traffic in places where the original domain is banned."

This mirror-site problem may be worse than it appears. Yes, downloads from Sci-Hub underestimate the entire market for "free" scientific articles. But these mirror sites also inflate Sci-Hub statistics. Presumably, these mirror sites obtain their inventory from Sci-Hub by setting up accounts, thus contributing lots of downloads.

***

Even if VPN and seasonality problems are resolved, the total number of downloads should be adjusted for population. The most appropriate adjustment factor is the population of scientists, but that statistic may be difficult to obtain. A useful proxy might be the number of STEM degrees by country - obtained from a UNESCO survey (link).

A metric of the type "number of Sci-Hub downloads per STEM degree" may sound odd, even useless, but I'd argue it's better than the unadjusted total number of Sci-Hub downloads. Just don't focus on the absolute values; focus on the relative comparisons between countries. Even better, we can convert the absolute values into an index to direct attention to the comparisons.
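
Here is a sketch of the adjustment and indexing, with made-up figures; the point is the arithmetic, not the numbers:

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["A", "B", "C"],                  # hypothetical countries
    "downloads": [1_200_000, 800_000, 150_000],  # Sci-Hub downloads (made up)
    "stem_degrees": [400_000, 100_000, 30_000],  # from e.g. the UNESCO survey
})

# Adjust for the size of the scientific population.
df["downloads_per_degree"] = df["downloads"] / df["stem_degrees"]

# Index to a benchmark country (A = 100) to shift attention from the
# odd-sounding absolute metric to relative comparisons.
base = df.loc[df["country"] == "A", "downloads_per_degree"].iloc[0]
df["index"] = 100 * df["downloads_per_degree"] / base
print(df)
```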


Speaking to the choir

A friend found the following chart about the "carbon cycle", and sent me an exasperated note, having given up on figuring it out. The chart came from a report, and was reprinted in Ars Technica (link).

Gcp_s09_2021_global_perturbation-800x371

The problem with the chart is that the designer is speaking to the choir. One must know a lot about the carbon cycle already to make sense of everything that's going on.

We see big and small arrows pointing up or down. Each arrow has a number attached to it, plus a range inside brackets. These numbers have no units, and it's not obvious what they are measuring.

The arrows come in a variety of colors. The colors are explained by labels, but the labels describe apparently unrelated concepts (e.g. fossil CO2 and land-use change).

Interspersed with the arrows is a singular dot. The dot also has a number attached to it. The number wears a plus sign, which signals that it's being treated differently from the quantities with up arrows.

The singular dot is an outcast, ostracized from the community of dots in the bottom part of the chart. These dots have labels but no numbers. They come in different sizes but no scale is provided.

The background is divided into three parts, showing the atmosphere, the land mass, and the ocean. The placement of the arrows and dots suggests each measured quantity concerns one of these three parts. Well... except the dot labeled "surface sediments", which sits on the boundary between the land mass and the ocean.

The three-way classification is only one layer of the chart. A different classification is embedded in the color scheme. The gray, light green, and aquamarine arrows in the sky find their counterparts in the dots of the land mass and the ocean.

What's more, the boundaries between land and sky, and between land and ocean, are also painted with those colors. The boundary segments are given different colors, so their lengths seem to encode data, but we aren't sure what.

At this point, I noticed thin arrows which appear to depict back and forth flows. There may be two types of such exchanges, one indicated by a cycle, the other by two straight arrows in opposite directions. The cycles have no numbers while each pair of straight thin arrows gets two numbers, always identical.

At the bottom of the chart is an annotation in red: "Budget imbalance = -1.0". Presumably some formula ties the numbers shown above to this -1.0 result. We still don't know the units, and it's unclear whether -1.0 is a bad number. A negative number shown in red typically indicates a bad number, but how bad is it?

Finally, on the top right corner, I found a legend. It's not obvious at first because the legend symbols (arrows and dots) are shown in gray, a color not used elsewhere on the chart, so it appears to represent yet another color category. The legend labels do little for me. What is an "anthropogenic flux"? What does the unit "GtCO2" stand for? Other jargon includes "carbon cycling" and "stocks". The entire diagram is titled "carbon cycle" while the "carbon cycling" thin arrows are only a small part of it.

The bottom line is I have no idea what this chart is saying to me, other than that the earth is a complex system, and that the designer has tried valiantly to impregnate the diagram with lots of information. If I were well read in environmental science, my experience would likely be different.


Illustrating coronavirus waves with moving images

The New York Times put out a master class in visualizing space and time data recently, in a visualization of five waves of Covid-19 that have torched the U.S. thus far (link).

Nyt_coronawaves_title

The project displays one dataset using three designs, which provides an opportunity to compare and contrast them.

***

The first design - above the headline - is an animated choropleth map. This is a straightforward presentation of space and time data. The level of cases in each county is indicated by color, using 12 levels (plus unknown). Time runs forward. The time legend plays double duty as a line chart showing the change in the weekly rate of reported cases over the course of the pandemic. A small piece of interactivity binds the legend to the map.

Nyt_coronawaves_moviefront

(To see a screen recording of the animation, click on the image above.)

***

The second design comprises six panels, snapshots that capture crucial "turning points" during the Covid-19 pandemic. The color of each county now encodes an average case rate (I hope they didn't just average the daily rates). 

Nyt_coronawaves_panelsix

The line-chart legend is gone - it's not hard to see Winter > Fall 2020 > Summer/Fall 2021 > ... so I don't think it's a big loss.

The small-multiples setup is particularly effective at facilitating comparisons: across time, and across space. It presents a story in pictures.

They may have left "2020" off "Winter" because December to February spans two calendar years, but "Winter 2020" may do more good than harm here.

***

The third design is a series of short films, which stands mid-way between the single animated map and the six snapshots. Each movie covers a separate window of time.

This design does a better job telling the story within each time window, at the cost of obstructing comparisons across windows.

Nyt_coronawaves_shortfilms

The informative legend is back. This time, it's showing the static time window for each map.

***

The three designs come from the same dataset. I think of them as one long movie, six snapshots, and five short films.

The one long movie is like a data dump. It shows every number in the dataset - the weekly case rate for each county, every week. All the data are streamed into a single map. It's a showpiece.

As an instrument to help readers understand the patterns in the dataset, the movie falls short. Too much is going on, making it hard to focus and pick out key trends. When your eyes are everywhere, they are nowhere.

The six snapshots represent the other extreme. The graph does not move, as the time axis is reduced to six discrete time points. But this display describes the change points, and tells a story. The long movie, by contrast, invites readers to find a story.

Without motion, the small-multiples format allows us to pick out specific counties or regions and compare the case rates across time. This task is close to impossible in the long movie, as it requires freezing the movie, and jumping back and forth.

The five short films may be the best of both worlds. It retains the motion. If the time windows are chosen wisely, each short film contains a few simple patterns that can easily be discerned. For example, the third film shows how the winter wave emerged from the midwest and then walloped the whole country, spreading southward and toward the coasts.

Nyt_winterwave

(If the above gif doesn't play, click it.)

***

If I had double or triple the time allocated to this project, I'd explore spatial clustering. I'd like to dampen the spatial noise (neighboring counties that have slightly different experiences). There is also temporal noise (fluctuations from week to week within the same county), which can be smoothed away. With these statistical techniques, the "wave" feature of the pandemic may become more visible.
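
For the temporal noise, even a simple centered moving average would help; here is a sketch with made-up weekly rates for one county:

```python
import pandas as pd

# Hypothetical weekly case rates for a single county.
rates = pd.Series([12, 30, 18, 45, 38, 60, 52, 75, 70, 66])

# A 3-week centered moving average damps week-to-week fluctuations.
smoothed = rates.rolling(window=3, center=True).mean()
print(smoothed)

# Spatial noise could be treated analogously, by averaging each
# county's rate with those of its neighboring counties.
```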


Visually displaying multipliers

As I was preparing a blog post about another real-world study of Covid-19 vaccines, I came across the following chart (the chart title is mine).

React1_original

As background, this is the trend in Covid-19 cases in the U.K. in the last couple of months, courtesy of OurWorldinData.org.

Junkcharts_owid_uk_case_trend_july_august_2021

The React-1 Study sends swab kits to randomly selected people in England in order to assess the prevalence of Covid-19. Every month, a new round of returned swabs is tested for Covid-19. This measurement method captures asymptomatic cases, although it probably misses severe and hospitalized cases. Despite these shortcomings, it is a far better way to measure cases than the hotchpotch assembly of variable-quality data submitted by different jurisdictions, which has become the dominant source of our data.

Rounds 12 and 13 captured an inflection point in the pandemic in England. The period marked the beginning of the end of the belief that widespread vaccination would end the pandemic.

The chart I excerpted up top breaks the data down by age group. The column heights represent the estimated prevalence of Covid-19 during each round - described more precisely in the paper as "swab positivity". Based on the study's design, one may generalize the prevalence to the population at large. About 1.5% of those aged 13-24 in England were estimated to have Covid-19 around the time of Round 13 (roughly early July).

The researchers came to the following conclusion:

We show that the third wave of infections in England was being driven primarily by the Delta variant in younger, unvaccinated people. This focus of infection offers considerable scope for interventions to reduce transmission among younger people, with knock-on benefits across the entire population... In our data, the highest prevalence of infection was among 12 to 24 year olds, raising the prospect that vaccinating more of this group by extending the UK programme to those aged 12 to 17 years could substantially reduce transmission potential in the autumn when levels of social mixing increase

***

Raise your hand if the graphics software you prefer dictates at least one default behavior you can't stand. I'm sure most hands are up in the air. No matter how much you love the software, there is always something the developer likes that you don't.

The first thing I did with today's chart is to get rid of all such default details.

Redo_react1_cleanup

For me, the bottom chart is cleaner and more inviting.

***

The researchers wanted readers to think of Round 13 numbers as multiples of Round 12 numbers. In the text, they use statements such as:

weighted prevalence in round 13 was nine-fold higher in 13-17 year olds at 1.56% (1.25%, 1.95%) compared with 0.16% (0.08%, 0.31%) in round 12

It's not easy to perceive a nine-fold jump from the paired column chart, even though this chart form is better than several others. I added some subtle divisions inside each orange column in order to facilitate this task:

Redo_react1_multiples

I have recommended this before. I'm co-opting pictograms in constructing the column chart.
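
If you want to try the idea, here is a rough matplotlib sketch using the 13-17 age group's prevalence numbers from the quote above - my reconstruction, not the actual chart:

```python
import matplotlib.pyplot as plt

round12, round13 = 0.16, 1.56  # prevalence (%) from the quoted text

fig, ax = plt.subplots()
ax.bar(0, round12, color="lightgray")
ax.bar(1, round13, color="orange")

# Divide the Round 13 column into slabs the height of the Round 12
# value, so the nine-fold jump can be counted directly off the chart.
k = 1
while k * round12 < round13:
    ax.plot([0.6, 1.4], [k * round12] * 2, color="white", lw=1)
    k += 1

ax.set_xticks([0, 1])
ax.set_xticklabels(["Round 12", "Round 13"])
ax.set_ylabel("Prevalence (%)")
plt.show()
```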

An alternative is to plot everything on an index scale although one would have to drop the prevalence numbers.

***

The chart requires an additional piece of context to interpret properly. I added each age group's share of the population below the chart - just to illustrate this point, not to recommend it as a best practice.

Redo_react1_multiples_popshare

The researchers concluded that their data supported vaccinating 13-17 year olds because that group experienced the highest multiplier from Round 12 to Round 13. Notice that the 13-17 year old age group represents only 6 percent of England's population, and is the least populous age group shown on the chart.

The neighboring 18-24 age group experienced a 4.5-fold jump in prevalence in Round 13, so that age group is doing much better than the 13-17 year olds, right? Not really.

While the same infection rate was found in both age groups during this period, the slightly older age group accounted for 50% more cases - and that's due to its larger share of the population.

A similar calculation shows that while the infection rate of people 24 and under is about 3 times that of those 25 and over, both groups suffered over 175,000 infections during the Round 13 time period (the difference between groups was under 4,000). So I don't agree that focusing on 13-17 year olds gives England the biggest bang for the buck: while they are the most likely to get infected, their cases account for only 14% of all infections. Almost half of the infections are in people 25 and over.
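
The arithmetic here is just rate times population share: with the same rate, case counts are proportional to the shares. In the sketch below, the 6% share and the 1.56% rate come from the post; the 9% share for the 18-24 group is my assumption, back-solved from the 50%-more-cases claim:

```python
rate = 1.56          # % prevalence, reportedly similar in both groups
share_13_17 = 6      # % of England's population (from the post)
share_18_24 = 9      # % of population (my assumption, back-solved)

# Case counts are proportional to rate x population share.
cases_13_17 = rate * share_13_17
cases_18_24 = rate * share_18_24

print(cases_18_24 / cases_13_17)  # 1.5: same rate, 50% more cases
```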


Check your presumptions while you're reading this chart about Israel's vaccination campaign

On July 30, Israel began administering third doses of mRNA vaccines to targeted groups of people. This decision was controversial since there was no science to support it - by science, I mean actual evidence. The policymakers did have educated guesses by experts based on the best available information, but since no one had previously been given three shots, there could be no data on which to root such a decision. Nevertheless, the pandemic does not always give us time to collect relevant data, and so speculative analysis has found its calling.

Dvir Aran, at Technion, has been diligently tracking the situation in Israel on his Twitter feed. Ten days after July 30, he posted the following chart, which immediately led many commentators to bounce out of their seats, crowning the third shot as a magic bullet. Notably, Dvir himself did not endorse such a claim. (See here to learn how other hasty conclusions by experts have fared.)

When you look at Dvir's chart, what do you see?

Dvir_aran_chart

Possibly one of the following two things, depending on which concern is uppermost in your head.

1) The red line sits far above the other two lines, showing that unvaccinated people are much more likely to get infected.

2) The blue line diverges from the green line almost immediately after the 3rd shots started getting into arms, showing that the 3rd shot is super effective.

If you take another moment to look, you might start asking questions, as many in Twitter world did. Dvir was startlingly efficient at answering these queries.

A) Does the green line represent people with 2 or 3 doses, or is it strictly 2 doses? Aron asked this question and got the answer (the former):

AronBrand_israelcases_twoorthreedoses

It's time to check our presumptions. When you read that chart, did you presume it's exactly 2 doses or did you presume it's 2 or 3 doses? Or did you immediately spot the ambiguity? As I said in this article, graphs attain efficiency at communication because the designer leverages unspoken rules - the chart conveys certain information without explicitly placing it on the chart. But this can backfire. In this case, I presumed the three lines to display three non-overlapping groups of people, and thus the green line indicates those with 2 doses but not 3. That presumption led me to misinterpret what's on the chart.

B) What is the denominator of the case rates? Is it literal - by that I mean, all unvaccinated people for the red line, and all people with 3 doses for the blue line? Or is the denominator the population of Israel, the same number for all three lines? Lukas asked this question, and got the answer (the former).

Lukas_denominator

C) Since third shots are recommended for 60 year olds and over who were vaccinated at least 5 months ago, and most unvaccinated Israelis are below 60, this question raises the possibility that the lines compare apples and oranges. Joe S. asked about this, and received an answer (all lines display only 60 year olds and over).

Joescholar_basepopulationquestion

Jason P. asked, and learned that the 5-month-out criterion is immaterial since 90% of the vaccinated have already reached that time point.

JasonPogue_5monthsout

D) We have even more presumptions. Like me, did you presume that the red line represents the "unvaccinated," meaning people who have not had any vaccine shots? If so, we may both be wrong. It has become the norm among vaccine researchers to lump "partially vaccinated" people with "unvaccinated", and call the combined group "unvaccinated". Here is an excerpt from a recent report from Public Health Ontario (link to PDF), which clearly states this unintuitive counting rule:

Ontario_case_definition

Notice that in this definition, someone who got infected within 14 days of the first shot is classified as an "unvaccinated" case, not a "partially vaccinated" case.

In the following tweet, Dvir gave a hint of what he plotted:

Dvir_group_definition

In a previous analysis, he averaged the rates of people with 0 doses and 1 dose, which is equivalent to combining them and calling them unvaccinated. It's unclear to me what he did to the 1-dose subgroup in our featured chart - did it just vanish from the chart? (How people and cases are classified into these groups is a major factor in all vaccine effectiveness calculations - a topic I covered here. Unfortunately, most published reports do a poor job explaining what the analysts did).

E) Did you presume that all three lines are equally important? That's far from true. Since Israel is the world champion in vaccination, the bulk of the 60+ population forms the green line. I asked Dvir and he responded that only 7.5%, or roughly 100K, are unvaccinated.

DvirAran_proportionofunvaccinated

That means 1.2 million people are part of the green line, 12 times the size of the unvaccinated group. There are roughly 50 cases per day among the unvaccinated, and 370 daily cases among those with 2 or 3 doses. In other words, vaccinated people account for almost 90% of all cases.

Yes, this is inevitable when over 90% of the age group has been vaccinated (and it was just as predictable back when people blasted everywhere that real-world VE was proved by the fact that almost all new cases were among the unvaccinated).

If your job is to minimize infections, you should be spending more time thinking about the 370 cases among the vaccinated than the 50 cases among the unvaccinated. If you halve each case rate, that's a difference of 185 cases versus 25. In Israel, the vaccination campaign has already succeeded; it's time to look forward, which is exactly why they are re-focusing on the already vaccinated.

***

If what you worry about most is the effectiveness of the original two-dose regimen, Dvir's chart raises a puzzle. Ignore the blue line, and remember that the green line already includes everybody represented by the blue line.

In the following chart, I removed the blue line, and added reference lines in dashed purple that correspond to 25%, 50% and 75% vaccine effectiveness. The data plotted on this chart are unadjusted case rates. A 75% effective vaccine cuts case rate by three quarters.

Junkcharts_dviraran_israel_threeshotschart
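
The reference lines come straight out of the crude effectiveness arithmetic: a vaccine with effectiveness VE should show a vaccinated case rate of (1 - VE) times the unvaccinated rate. A sketch, with a made-up unvaccinated rate:

```python
# Where the dashed reference lines sit, given the unvaccinated rate.
unvaccinated_rate = 400  # daily cases per million (made up)

for ve in (0.25, 0.50, 0.75):
    expected = (1 - ve) * unvaccinated_rate
    print(f"VE {ve:.0%}: vaccinated rate of {expected:.0f} per million")
```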

This chart shows the 2-dose mRNA vaccine was nowhere near 90% effective. (As regular readers know, I don't endorse this simplistic calculation and have outlined the problems here, but this style of calculation keeps getting published and passed around. Those who use it to claim real-world studies confirm prior clinical trial outcomes can either (a) insist on using it and retract their earlier conclusions, or (b) admit that such a calculation was, and is, a bad take.)

Also observe how the vaccinated (green) line is moving away from the unvaccinated (red) line. The vaccine apparently is becoming more effective, which runs counter to the trend used by the Israeli government to justify third doses. This improvement also precedes the start of the third-shot campaign. When the analytical method is bad, it generates all sorts of spurious findings.

***

As Dvir said, it is premature to comment on the third doses based on 10 days of data. For one thing, the vaccine developers insist that their vaccines must be given 14 days to work. In a typical calculation, all of the cases in the blue line fall outside the case-counting window. The effective number of cases that would be attributed to the 3-dose group right now is zero, and the vaccine effectiveness using the standard methodology is 100%, even better than shown in the chart.

There is an alternative interpretation of this graph. Statisticians call this the selection effect. On July 30, the blue line split out of the green: some people were selected to receive the 3rd dose - this includes an official selection (the government makes certain subgroups eligible) as well as a self-selection (within the eligible subgroup, certain people decide to get the 3rd shot earlier.) If those who are less exposed to the virus, or more risk averse, get the shots first, then all that is happening may be that we have split off a high VE subgroup from the green line. Even if the third shot were useless, the selection effect itself could explain the gap.

Statistics is about grays. It's not either-or; it's usually some of each. If this feels like Groundhog Day, you're getting the picture. When they rolled out two doses, we lived through an optimistic period in which most experts rejoiced about 90-100% real-world effectiveness; then, as more people got vaccinated, the effect washed away. The selection effect gradually disappears as vaccination becomes widespread. Are we starting a new cycle of hope and despair? We'll find out soon enough.


What metaphors give, they take away

Aleks pointed me to the following graphic making the rounds on Twitter:

Whyaxis_covid_men

It's being passed around as an example of great dataviz.

The entire attraction rests on a risqué metaphor. The designer is illustrating a claim that Covid-19 causes erectile dysfunction in men.

That's a well-formed question, so in the Trifecta Checkup, the chart passes on the Q corner.

What about the visual metaphor? I advise people to think twice before using metaphors because these devices can give as they can take. This example is no exception. Some readers may pay attention to the orientation but other readers may focus on the size.

I pulled out the tape measure. Here's what I found.

Junkcharts_covid_eds

The angle is accurate on the first chart, but its diameter has been exaggerated relative to the other's. On the bottom chart, which has the smaller circumference, the angle is slightly magnified.

***

Let's look at the Data to round out our analysis. They come from a study from Italy (link), utilizing survey responses. There were 25 male respondents in the survey who self-reported having had Covid-19. Seven of these submitted answers to a set of five questions that were "suggestive of erectile dysfunction". (This isn't as arbitrary as it sounds - apparently it is an internationally accepted way of conducting research.) Seven out of 25 is 28 percent. Because the sample size is small, the 95% confidence range is wide: 10% to 46%.

The researchers then used propensity scoring to find 3 matches for each infected person. Each match is a survey respondent who did not self-report having had Covid-19. See this post about a real-world vaccine study to learn more about propensity scoring. Among the 75 non-infected men, 7 were judged to have ED. The 95% range is 3% to 16%.
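
The quoted ranges are consistent with a simple normal-approximation interval for a proportion, though I can't confirm that's what the researchers used. A quick check:

```python
from math import sqrt

def approx_interval(successes, n, z=1.96):
    """Normal-approximation 95% interval for a proportion."""
    p = successes / n
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half

print(approx_interval(7, 25))  # ~(0.10, 0.46): the 10%-46% range
print(approx_interval(7, 75))  # ~(0.03, 0.16): the 3%-16% range
```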

The difference between the two subgroups is quite large. The paper also includes other research that investigates the mechanisms that can explain the observed correlation. Nevertheless, the two proportions depicted in the chart have wide error bars around them.

I have always had a question about analysis using this type of survey data (including my own work). How do they know that ED follows infection rather than precedes it? One of the inviolable rules of causation is that the effect follows the cause. If it's a series of surveys, the sequencing may be measurable but a single survey presents challenges. 

The headline of the dataviz is "Get your vaccines". This comes from a "story time" moment in the paper. On page 1, under Discussion and conclusion, they inserted the sentence "Universal vaccination against COVID-19 and the personal protective equipment could possibly have the added benefit of preventing sexual dysfunctions." Nothing in the research actually supports this claim. The only time the word "vaccine" appears in the entire paper is on that first page.

"Story time" is the moment in a scientific paper when the researchers - after lulling readers to sleep over some interesting data - roll out statements that are not supported by the data presented before.

***

The graph succeeds in catching people's attention. The visual metaphor works in one sense but not in another.

P.S. [8/6/2021] One final note for those who do care about the science: the internet survey, not surprisingly, has a youth bias. The median age of the 25 infected men was 39, maxing out at 45, while the median for the 75 not infected was 42, maxing out at 49.


Vaccine researchers discard the start-at-zero rule

I struggled to decide on which blog to put this post. The reality is it bridges the graphical and analytical sides of me. But I ultimately placed it on the dataviz blog because that's where today's story starts.

Data visualization has few set-in-stone rules. If pressed for one, I'd likely cite the "start-at-zero" rule, which has featured regularly on Junk Charts (here, here, and here, for example). This rule applies only to bar charts, where the heights (and thus areas) of the bars encode the data.

Here is a stacked column chart that earns boos from us:

Kfung_stackedcolumn_notstartingatzero_0

I made it, so I'm downvoting myself. What's wrong with this chart? The vertical axis starts at 42 instead of zero. I've cropped exactly 42 units off each column. Therefore, the column heights (and areas) are no longer proportional to the data. Forty-two is 84% of column A but only 19% of column B. By starting the axis above zero, I've made column B dwarf column A. For comparison, I added a second chart whose axis starts at zero.

Kfung_stackedcolumn_notstartatzero

On the right side, Column B is 22 times the height of column A. On the left side, it is 4 times as high. Both are really the same chart, except one has its legs chopped off.
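
You can back-solve the two column totals from the percentages given above:

```python
# 42 is 84% of column A and 19% of column B.
a = 42 / 0.84  # 50
b = 42 / 0.19  # ~221

print(b / a)                # ~4.4: the true ratio (left chart)
print((b - 42) / (a - 42))  # ~22: the ratio after chopping 42 off each (right chart)
```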

***

Now, let me reveal the data behind the above chart. It is a re-imagination of the famous cumulative case curve from the Pfizer vaccine trial.

Pfizerfda_figure2_cumincidencecurves

I transferred the data to a stacked column chart. Each column block shows the incremental cases observed in a given week of the trial. All the blocks stacked together rise to the total number of cases observed by the time the interim analysis was presented to the FDA.

Observe that in the cumulative case curve, the count starts at zero on Day 0 (first dose). This means the chart corresponds to the good stacked column chart, with the axis starting at zero.

Kfung_pfizercumcases_stackedcolumn

The Pfizer chart above is, however, disconnected from the oft-chanted 95% vaccine efficacy number. You can't find this number on there. Yes, everyone has been lying to you. In a previous post, I did the math, and if you trace the vaccine efficacy throughout the trial, you end up at about 80% toward the right, not 95%.

Pfizer_cumcases_ve_vsc_published

How can they conclude VE is 95% but show a chart that never reaches that level? The chart was created for a "secondary" analysis included in the report for completeness. The FDA and researchers have long ago decided, before the trials started enrolling people, that they don't care about the cumulative case curve starting on Day 0. The "primary" analysis counts cases starting 7 days after the second shot, which means Day 29.

The first week that concerns the FDA is Days 29-35 (for Pfizer's vaccine). The vaccine arm saw 41 cases in the first 28 days of the trial. In effect, the experts chop the knees off the column chart. When they talk about 95% VE, they are looking at the column chart with the axis starting at 42.

Kfung_pfizercumcases_stackedcolumn_chopped

Yes, that deserves a boo.

***

It's actually even worse than that, if you can believe it.

The most commonly cited excuse for the knee-chop is that any vaccine is expected to be useless in the first X days (X being determined after the trial ends when they analyze the data). A recently published "real world" analysis of the situation in Israel contains a lengthy defense of this tactic, in which they state:

Strictly speaking, the vaccine effectiveness based on this risk ratio overestimates the overall vaccine effectiveness in our study because it does not include the early follow-up period during which the vaccine has no detectable effect (and thus during which the ratio is 1). [Appendix, Supplement 4]

Assuming VE = 0 prior to day X is equivalent to stipulating that the number of cases found in the vaccine arm is the same (within margin of error) as the number of cases in the placebo arm during the first X days.

That assumption is refuted by the Pfizer trial (and every other trial that has results so far.)

The Pfizer/Biontech vaccine was not useless during the first week, but it was not 95% efficacious either - more like 16%. In the second week, it improved to 33%, and so on. (See the VE curve I plotted above for the Pfizer trial.)

What happened was that all the weeks before the VE plateaued were dropped.

***

So I was simplifying the picture by chopping same-size blocks from both columns in the stacked column chart. Contrary to the no-effect assumption, the blocks at the bottom of each column are of different sizes. Much more was chopped from the placebo arm than from the vaccine arm.

You'd think that would unjustifiably favor the placebo. Not true! As almost all the cases on the vaccine arm were removed, the remaining cases on the placebo arm are now many multiples of those on the vaccine arm.

The following shows what VE would have been reported had they started counting cases from different points. The first chart counts all cases from the first shot. The second chart removes the first two weeks of cases, corresponding to the analysis that other pharmas have done, namely, evaluating efficacy from 14 days after the first dose. The third chart removes even more cases, and represents what happens if the analysis is conducted from the second dose. The fourth chart is the official Pfizer analysis, which begins counting 7 days after the second shot. Finally, the fifth chart shows the analysis beginning 14 days after the second shot, the window selected by Moderna and Astrazeneca.

Kfung_howvaccinetrialsanalyzethedata
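
To see mechanically how the counting window drives the reported VE, here is a sketch using the crude case-ratio calculation on made-up weekly counts (not Pfizer's data), shaped so that the two arms look similar early on:

```python
# Weekly case counts per arm (made up): similar in the early weeks,
# strongly favoring the vaccine later on.
vaccine_cases = [21, 10, 5, 2, 1, 1, 1]
placebo_cases = [25, 15, 18, 20, 22, 25, 28]

def ve_from_week(start):
    """Crude VE counting cases only from a given analysis window."""
    v = sum(vaccine_cases[start:])
    p = sum(placebo_cases[start:])
    return 1 - v / p

for start in range(4):
    print(f"counting from week {start + 1}: VE = {ve_from_week(start):.0%}")
# The later the window starts, the higher the reported VE.
```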

The premise that any vaccine is completely useless for a period after administration is refuted by the actual data. By starting analysis windows at some arbitrary time, the researchers make it unnecessarily difficult to compare trials. Selecting the time of analysis based on the results of a single trial is the kind of post-hoc analysis that statisticians have long warned leads to over-estimation. It's equivalent to making the vertical axis of a column chart start above zero in order to exaggerate the relative heights of the columns.

P.S. [3/1/2021] See comment below. I'm not suggesting vaccines are useless. They are still a miracle of science. I believe the desire to report a 90% VE number is counterproductive. I don't understand why a 70% or 80% effective vaccine is shameful. I really don't.


Illustrating differential growth rates

Reader Mirko was concerned about a video published in Germany that shows why the new coronavirus variant is dangerous. He helpfully provided a summary of the transcript:

The South African and the British mutations of the SARS-COV-2 virus are spreading faster than the original virus. On average, one infected person infects more people than before. Researchers believe the new variant is 50 to 70% more transmissible.

Here are two key moments in the video:

Germanvid_newvariant1

This seems to be saying the original virus (left side) replicates 3 times inside the infected person while the new variant (right side) replicates 19 times. So we have a roughly 6-fold jump in viral replication.

Germanvid_newvariant2

Later in the video, it appears that every replicate of the old virus finds a new victim while the 19 replicates of the new variant land on 13 new people, meaning 6 replicates didn't find a host.

As Mirko pointed out, the visual appears to have run away from the data. (In our Trifecta Checkup, we have a problem with the arrow between the D and the V corners. What the visual is saying is not aligned with what the data are saying.)

***

It turns out that the scientists have been very confusing when talking about the infectiousness of this new variant. The most quoted line is that the British variant is "50 to 70 percent more transmissible". At first, I thought this was a comment on the famous "R number". Since the R number around December was roughly 1 in the U.K., the new variant might bring the R number up to 1.7.

However, that is not the case. From this article, it appears that being 50 to 70 percent more transmissible means R goes up from 1 to 1.4. R is interpreted as the average number of people infected by one infected person.

Mirko wonders if there is a better way to illustrate this. I'm sure there are many better ways. Here's one I whipped up:

Junkcharts_redo_germanvideo_newvariant

The left side is for the 40% higher R number. Both sides start at the center with 10 infected people. At each time step, if R=1 (right side), each of the 10 people infects one other person, so the total infections increase by 10 per time step. It's immediately obvious that a 40% higher R is very serious indeed. Starting with 10 infected people, in 10 steps, the total number of infections is almost 1,000, almost 10 times higher than when R is 1.

The lines of the graphs simulate the transmission chains. These are "average" transmission chains since R is an average number.
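
Here is the arithmetic behind the two panels, as a quick simulation:

```python
def total_infections(r, start=10, steps=10):
    """Cumulative infections, starting with `start` infected people."""
    total, current = start, start
    for _ in range(steps):
        current = current * r  # new infections at this time step
        total += current
    return total

print(total_infections(1.0))  # 110
print(total_infections(1.4))  # ~987: almost 1,000, about 9x higher
```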

P.S. [1/29/2021: Added the missing link to the article in which it is reported that 50-70 percent more transmissible implies R increasing by 40%.]