Vaccine researchers discard the start-at-zero rule

I struggled to decide on which blog to put this post. The reality is it bridges the graphical and analytical sides of me. But I ultimately placed it on the dataviz blog because that's where today's story starts.

Data visualization has few set-in-stone rules. If pressed for one, I'd likely cite the "start-at-zero" rule, which has featured regularly on Junk Charts (here, here, and here, for example). This rule only applies to a bar chart, where the heights (and thus, areas) of the bars should encode the data.

Here is a stacked column chart that earns boos from us:

Kfung_stackedcolumn_notstartingatzero_0

I made it so I'm downvoting myself. What's wrong with this chart? The vertical axis starts at 42 instead of zero. I've cropped out exactly 42 units from each column. Therefore, the column areas are no longer proportional to the data. Forty-two is 84% of column A while it is 19% of column B. By shifting the x-axis up, I've made column B dwarf column A. For comparison, I added a second chart in which the x-axis sits at zero.

Kfung_stackedcolumn_notstartatzero

On the right side, Column B is 22 times the height of column A. On the left side, it is 4 times as high. Both are really the same chart, except one has its legs chopped off.
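
The arithmetic behind those two ratios can be checked with a few lines of Python. This is only a sketch: the column totals are not stated above, so they are backed out from the rounded percentages (42 is 84% of A and 19% of B) and should be read as approximations. The function name is my own.

```python
def height_ratio(a, b, baseline=0.0):
    """Ratio of the visible height of column b to column a when the axis starts at `baseline`."""
    visible_a = a - baseline
    visible_b = b - baseline
    if visible_a <= 0:
        raise ValueError("baseline must be below the shorter column")
    return visible_b / visible_a

# Column totals backed out from the rounded percentages in the text (assumption).
col_a = 42 / 0.84   # about 50
col_b = 42 / 0.19   # about 221

full = height_ratio(col_a, col_b)          # axis starts at zero
chopped = height_ratio(col_a, col_b, 42)   # axis starts at 42

print(f"axis at 0:  B is {full:.1f}x A")
print(f"axis at 42: B is {chopped:.1f}x A")
```

The data are identical in both calls; only the baseline changes.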

***

Now, let me reveal the data behind the above chart. It is a re-imagination of the famous cumulative case curve from the Pfizer vaccine trial.

Pfizerfda_figure2_cumincidencecurves

I transferred the data to a stacked column chart. Each column block shows the incremental cases observed in a given week of the trial. All the blocks stacked together rise to the total number of cases observed by the time the interim analysis was presented to the FDA.

Observe that in the cumulative cases chart, the count starts at zero on Day 0 (first dose). This means the chart corresponds to the good stacked column chart, with the x-axis starting from zero on Day 0.

Kfung_pfizercumcases_stackedcolumn

The Pfizer chart above is, however, disconnected from the oft-chanted 95% vaccine efficacy number. You can't find this number on there. Yes, everyone has been lying to you. In a previous post, I did the math, and if you trace the vaccine efficacy throughout the trial, you end up at about 80% toward the right, not 95%.

Pfizer_cumcases_ve_vsc_published

How can they conclude VE is 95% but show a chart that never reaches that level? The chart was created for a "secondary" analysis included in the report for completeness. The FDA and the researchers decided long ago, before the trials started enrolling people, that they don't care about the cumulative case curve starting on Day 0. The "primary" analysis counts cases starting 7 days after the second shot, which means Day 29.

The first week that concerns the FDA is Days 29-35 (for Pfizer's vaccine). The vaccine arm saw 41 cases in the first 28 days of the trial. In effect, the experts chop the knees off the column chart. When they talk about 95% VE, they are looking at the column chart with the axis starting at 42.

Kfung_pfizercumcases_stackedcolumn_chopped

Yes, that deserves a boo.

***

It's actually even worse than that, if you can believe it.

The most commonly cited excuse for the knee-chop is that any vaccine is expected to be useless in the first X days (X being determined after the trial ends when they analyze the data). A recently published "real world" analysis of the situation in Israel contains a lengthy defense of this tactic, in which they state:

Strictly speaking, the vaccine effectiveness based on this risk ratio overestimates the overall vaccine effectiveness in our study because it does not include the early follow-up period during which the vaccine has no detectable effect (and thus during which the ratio is 1). [Appendix, Supplement 4]

Assuming VE = 0 prior to day X is equivalent to stipulating that the number of cases found in the vaccine arm is the same (within margin of error) as the number of cases in the placebo arm during the first X days.
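
That equivalence follows from the usual definition of efficacy as one minus the risk ratio. Here is a minimal sketch, assuming the two arms have equal size and equal follow-up so that case counts can stand in for risks. The counts are made up for illustration and are not trial data.

```python
def vaccine_efficacy(vaccine_cases, placebo_cases):
    """VE = 1 - risk ratio, assuming equal-sized arms with equal follow-up."""
    if placebo_cases <= 0:
        raise ValueError("need at least one placebo case")
    return 1.0 - vaccine_cases / placebo_cases

# Hypothetical counts, for illustration only.
print(vaccine_efficacy(40, 40))   # equal counts in both arms give VE = 0
print(vaccine_efficacy(5, 100))   # far fewer vaccine-arm cases give VE near 0.95
```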

That assumption is refuted by the Pfizer trial (and every other trial that has results so far.)

The Pfizer/Biontech vaccine was not useless during the first week. It's not 95% efficacious, more like 16%. In the second week, it improves to 33%, and so on. (See the VE curve I plotted above for the Pfizer trial.)

What happened was that all the weeks before the VE had plateaued were dropped.

***

So I was simplifying the picture by chopping same-size blocks from both columns in the stacked column chart. Contrary to the no-effect assumption, the blocks at the bottom of each column are of different sizes. Much more was chopped from the placebo arm than from the vaccine arm.

You'd think that would unjustifiably favor the placebo. Not true! As almost all the cases on the vaccine arm were removed, the remaining cases on the placebo arm are now many multiples of those on the vaccine arm.
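
To see why chopping more from the placebo arm still inflates the reported number, here is a sketch that recomputes VE after dropping the first k weeks from both arms. The weekly counts are invented for illustration (they are not the Pfizer data), and equal-sized arms are assumed so that counts can stand in for risks.

```python
def ve_from_week(vaccine_weekly, placebo_weekly, start_week):
    """VE = 1 - (vaccine cases / placebo cases), counting only weeks >= start_week.

    Assumes equal-sized arms with equal follow-up.
    """
    v = sum(vaccine_weekly[start_week:])
    p = sum(placebo_weekly[start_week:])
    if p == 0:
        raise ValueError("no placebo cases left in the window")
    return 1.0 - v / p

# Made-up incremental weekly cases: the vaccine arm's cases thin out over time.
vaccine = [20, 15, 6, 3, 2, 2, 1, 1]
placebo = [24, 22, 25, 23, 24, 22, 25, 23]

for k in range(5):
    print(f"drop first {k} weeks: VE = {ve_from_week(vaccine, placebo, k):.0%}")
```

With these invented counts, every later start date yields a higher VE, even though more placebo cases than vaccine cases are discarded along the way.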

The following shows the VE that would have been reported if they had started counting cases from day X. The first chart counts all cases from the first shot. The second chart removes the first two weeks of cases, corresponding to the analysis that other pharmas have done, namely, evaluating efficacy from 14 days after the first dose. The third chart removes even more cases, and represents what happens if the analysis is conducted from the second dose. The fourth chart is the official Pfizer analysis, which began seven days after the second shot. Finally, the fifth chart shows the analysis beginning 14 days after the second shot, the window selected by Moderna and AstraZeneca.

Kfung_howvaccinetrialsanalyzethedata

The premise that any vaccine is completely useless for a period after administration is refuted by the actual data. By starting analysis windows at some arbitrary time, the researchers make it unnecessarily difficult to compare trials. Selecting the time of analysis based on the results of a single trial is the kind of post-hoc analysis that statisticians have long warned leads to over-estimation. It's equivalent to making the vertical axis of a column chart start above zero in order to exaggerate the relative heights of the columns.

P.S. [3/1/2021] See comment below. I'm not suggesting vaccines are useless. They are still a miracle of science. I believe the desire to report a 90% VE number is counterproductive. I don't understand why a 70% or 80% effective vaccine is shameful. I really don't.


A note to science journal editors: require better visuals

In reviewing a new small-scale study of the Moderna vaccine, I found this chart:

Modernahalfdoses_fig3a

This style of chart is quite common in scientific papers. And it is horrible. It irks me to think that some authors are forced to adopt such styles.

The study's main goal is to compare two half doses to two full doses of the Moderna vaccine. (To understand the science, read the post on my book blog.) The participants were stratified by age group. The vaccine is expected to work better for younger people than for older people. The point of the study isn't to measure the difference by age group, and so the age-group dimension is secondary.

Upon recognizing that, I reduce the number of colors from 4 to 2:

Junkcharts_redo_modernahalfdoses_1

Halving the number of colors presents no additional difficulty. The reader spends less time cross-referencing.

The existence of the Pbo (placebo) and Conv (convalescent plasma) columns on the sides is both unsightly and suboptimal. The "Conv" serves as a reference level for the amount of antibodies the vaccine stimulates in people. A better way to display reference levels is using reference lines.

Junkcharts_redo_modernahalfdoses_2color

The biggest problem with the chart is the log scale on the vertical axis. This isn't even a log-10 but a log-2. (Each tick is a doubling of value.)

Take the first set of columns as an example. The second column is clearly less than twice the height of the first column, and yet 25 is 3.5 times as large as 7. The third column is also visually less than double the size of the second column, and yet 189 is 7.5 times as large as 25. The areas (heights) of the columns do not convey the right information about relative sizes of the underlying data.
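
On a log-2 axis, vertical distance tracks the number of doublings rather than the ratio itself. Here is a small sketch using the three values quoted above (7, 25, 189). The function name is my own.

```python
import math

def doublings(smaller, larger):
    """Distance between two values on a log-2 axis, in ticks (one tick = one doubling)."""
    return math.log2(larger / smaller)

values = [7, 25, 189]  # the first set of columns, as quoted in the text
for lo, hi in zip(values, values[1:]):
    print(f"{lo} -> {hi}: ratio {hi / lo:.1f}x, {doublings(lo, hi):.1f} ticks apart")
```

A jump of more than sevenfold adds fewer than three ticks of height, which is why the columns look so similar.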

Here's an amusing observation. The brown area shaded below is half of the entire area of the chart - if we reverted it to a linear scale. And yet there is not a single data point above 250 in the data so the brown area is entirely empty.

Junkcharts_redo_modernahalfdoses_logscale

An effect of a log scale is to compress the larger values of a dataset. That's what you're seeing here.

I now revisualize using dotplots:

Junkcharts_redo_modernahalfdoses_dotplotlinear

The version on the left retains the log scale while the right one (pun intended) reverts to the linear scale.

The biggest effect by far is the spike of antibodies between day 29 and 43 - which is after the second shot is administered. (For Moderna, the second shot is targeted for day 28.) In fact, it is during that window that the level of antibodies went from below the "conv" level (i.e. from natural infection) to far above.

The log-scale version buries this finding because it squeezes the large numbers on the chart. In addition, it artificially pulls the small numbers toward the "Conv" level. On the right chart, the second dot for 18-54, full doses is only at half the level of "Conv" but it looks tantalizingly close to the "Conv" level on the left chart.
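
Both the squeeze and the pull show up if you compute where a value lands along the axis, as a fraction of the axis length, under each scale. This is a sketch only: the axis limits (1 to 512) and the reference level of 100 are placeholders I chose for illustration, not numbers read off the chart.

```python
import math

def linear_position(value, axis_min, axis_max):
    """Fraction of the axis length at which `value` sits on a linear scale."""
    return (value - axis_min) / (axis_max - axis_min)

def log_position(value, axis_min, axis_max):
    """Fraction of the axis length at which `value` sits on a log scale (axis_min > 0)."""
    return math.log(value / axis_min) / math.log(axis_max / axis_min)

# Placeholder axis limits and reference level (assumptions, not from the study).
AXIS_MIN, AXIS_MAX = 1, 512
reference = 100
half_reference = reference / 2

for label, v in [("reference", reference), ("half of reference", half_reference)]:
    print(f"{label}: linear {linear_position(v, AXIS_MIN, AXIS_MAX):.2f}, "
          f"log {log_position(v, AXIS_MIN, AXIS_MAX):.2f}")
```

On the log axis the two points are always exactly one doubling apart (one ninth of this nine-tick axis), whereas on the linear axis the lower point sits roughly half as high as the upper one.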

The authors of the study also claim that there is negligible dropoff by 30 days after the second dose, i.e. between the third and fourth dots in each set. That may be so on the log-scale chart but on the linear chart, we see a moderate reduction. I don't believe the size of this study allows us to make a stronger conclusion but the claim of no dropoff is dubious.

The left chart also obscures the age-group differences. It appears as if all four sets show roughly the same pattern. With the linear scale, we notice that the vaccine clearly works better for the younger subgroup. As I discussed on the book blog, no one actually knows what level of antibodies constitutes "protection," and so I can't say whether that age-group difference has practical significance.

***

I recommend using log scales sparingly and carefully. They are a source of much mischief and misadventure.


Same data + same chart form = same story. Maybe.

We love charts that tell stories.

Some people believe that if they situate the data in the right chart form, the stories reveal themselves.

Some people believe that for a given dataset, there exists a best chart form that brings out the story.

An implication of these beliefs is that the story is immutable, given the dataset and the chart form.

If you use the Trifecta Checkup, you already know I don't subscribe to those ideas. That's why the Trifecta has three legs; the third is the question, which is related to the message or the story.

***

I came across the following chart by Statista, illustrating the growth in Covid-19 cases from the start of the pandemic to this month. The underlying data are collected by WHO and cover the entire globe. The data are grouped by regions.

Statista_avgnewcases

The story of this chart appears to be that the world moves in lock step, with each region behaving more or less the same.

If you visit the WHO site, they show a similar chart:

WHO_horizontal_casesbyregion

On this chart, the regions at the bottom of the graph (esp. Southeast Asia in purple) clearly do not follow the same time patterns as the Americas (orange) or Europe (green).

What we're witnessing is: same data, same chart form, different stories.

This is a feature, not a bug, of the stacked area chart. The story is driven largely by the order in which the pieces are stacked. In the Statista chart, the largest pieces are placed at the bottom while for WHO, the order is exactly reversed.

(There are minor differences which do not affect my argument. The WHO chart omits the "Other" category which accounts for very little. Also, the Statista chart shows the smoothed data using 7-day averaging.)

In this example, the order chosen by WHO preserves the story while the order chosen by Statista wipes it out.

***

What might be the underlying question of someone who makes this graph? Perhaps it is to identify the relative prevalence of Covid-19 in different regions at different stages of the pandemic.

Emphasis on the word "relative". Instead of plotting the absolute number of cases, I consider plotting the relative number of cases, that is to say, the proportion of cases in each region at given times.
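
The conversion from counts to shares is simple enough to sketch. The regional counts below are made up for illustration (they are not WHO figures), and the function name is my own.

```python
def to_shares(counts_by_region):
    """Convert per-period case counts into each region's share of that period's total."""
    regions = list(counts_by_region)
    n_periods = len(counts_by_region[regions[0]])
    totals = [sum(counts_by_region[r][t] for r in regions) for t in range(n_periods)]
    return {
        r: [counts_by_region[r][t] / totals[t] if totals[t] else 0.0
            for t in range(n_periods)]
        for r in regions
    }

# Made-up counts for three regions over four periods.
cases = {
    "Americas": [10, 200, 400, 900],
    "Europe": [50, 300, 100, 800],
    "South-East Asia": [5, 20, 300, 100],
}

shares = to_shares(cases)
for region, series in shares.items():
    print(region, [f"{s:.0%}" for s in series])
```

Each region's share in a given period is the same no matter in which order the regions are listed.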

This leads to a stacked area percentage chart.

Junkcharts_redo_statistawho_covidregional

In this side-by-side view, you see that this form is not affected by flipping the order of the regions. Both charts say the same thing: that there were two waves in Europe and the Americas that dwarfed all other regions.


Making graphics last over time

Yesterday, I analyzed the data visualization by the White House showing the progress of U.S. Covid-19 vaccinations. Here is the chart.

Whgov_proportiongettingvaccinated

John tweeted this at me, saying "please get a better data viz".

I'm happy to work with them or the CDC on better dataviz. Here's an example of what I do.

Junkcharts_redo_whgov_usvaccineprogress

Obviously, I'm using made-up data here and this is a sketch. I want to design a chart that can be updated continuously, as data accumulate. That's one of the shortcomings of that bubble format they used.

In earlier months, the chart can be clipped to just the lower left corner.

Junkcharts_redo_whgov_usvaccineprogress_2