Tongue in cheek but a master stroke

Andrew Gelman jumped on the Benford bandwagon to do a tongue-in-cheek analysis of numbers in Hollywood movies (link). The key graphic is this:

Gelman_hollywood_benford_2-1024x683

Benford's Law is frequently invoked to prove (or disprove) fraud by examining the distribution of the first digits of a set of numbers. Andrew extracted movies that contain numbers in their names - mostly, but not always, sequels in movie franchises. The histogram above (gray columns) shows the number of movies with each first digit. The red line is the expected count if Benford's Law holds. As is typical of such analyses, the histogram aligns closely with the red line, and therefore, he did not find any fraud.

I'll blog about my reservations about Benford-style analysis on the book blog later. One quick point for now: as with any statistical analysis, we should say there is no statistical evidence of fraud (more precisely, of the kind of fraud that can be discovered using Benford's Law), which is different from saying there is no fraud.

***

Andrew also showed a small-multiples chart that breaks up the above chart by movie groups. I excerpted the top left section of the chart below:

Gelman_smallmultiples_benford

The genius in this graphic is easily missed.

Notice that the red lines (the expected values if Benford's Law holds) appear identical on every single plot. Then notice that the lines don't represent the same values.

It's great to have the red lines look the same everywhere because they represent the immutable Benford reference. Because the number of movies is so small, he's plotting counts instead of proportions. If you let the software decide on the best y-axis range for each plot, the red lines will look different on different charts!

You can find the trick in the R code from Gelman's blog.

First, the maximum value of each plot is set to the total number of observations in that group. Then, the expected Benford proportions are converted into expected Benford counts. The Benford counts are then shown against an axis that tops out at the total count, so, relatively speaking, what we are seeing are the Benford proportions. That is why every red line looks the same despite holding different values.
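In code, the trick looks something like this - a minimal base-R sketch with made-up data, not Gelman's actual script:

```r
# Each panel's y-axis tops out at that group's total count, so the Benford
# reference line sits at the same relative heights in every panel.
benford <- log10(1 + 1 / (1:9))   # expected first-digit proportions

plot_benford_panel <- function(first_digits, title = "") {
  counts <- tabulate(first_digits, nbins = 9)
  n <- sum(counts)
  mids <- barplot(counts, names.arg = 1:9, ylim = c(0, n),   # axis max = total count
                  col = "grey", main = title)
  lines(mids, benford * n, col = "red", lwd = 2)   # proportions scaled to counts
}

plot_benford_panel(c(1, 1, 2, 2, 3, 1, 4, 2, 5, 1, 3, 7), "Example group")
```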

This is a master stroke.

Working hard at clarity

As I am preparing another blog post about the pandemic, I came across the following data graphic, recently produced by the CDC for a vaccine advisory board meeting:

CDC_positivevaccineintent

This is not an example of effective visual communications.

***

For one thing, readers are directed to scour the footnotes to figure out what's going on. If we ignore those for the moment, we see clusters of bubbles that have remained pretty stable from December 2020 to August 2021. The data concern some measure of Americans' intent to take the COVID-19 vaccine. That much we know.

There may have been a bit of an upward trend between January and May, although if you were shown only the clusters for December, February and April, you'd think the trend was pretty flat.

***

But those colors? What could they represent? You'd surely have to fish this one out of the footnotes - specifically, this opaque sentence: "Surveys with multiple time points are shown with the same color bubble for each time point." I had to read it several times. I think it simply means "color represents the pollster."

Then it adds: "Surveys with only one time point are shown in gray," which simply means "all pollsters with only one entry in the dataset are grouped together and shown in gray."

Another problem with this chart is over-plotting. Look at the July cluster. It's impossible to tell how many polls were conducted in July because the circles pile on top of one another. 

***

The appearance of the flat trend is a result of two unfortunate decisions made by the designer. If I retained the chart form, I'd have produced something that looks like this:

Junkcharts_redo_cdcvaccineintent_sameform

The first design choice is to expand the vertical axis to range from 0% to 100%. This effectively squeezes all the bubbles into a small range.

Junkcharts_redo_cdcvaccineintent_startatzero

The second design choice is to enlarge the bubbles, causing copious amounts of overlap.

Junkcharts_redo_cdcvaccineintent_startatzero_bigdots

In particular, this decision blows up the Pew poll (the big pink bubble), which had 10 times the sample size of most of the other polls. The Pew estimate actually came in at 70%, but the top of the pink bubble extends to over 80%. Because of this, the outlier poll of December 2020 - which, surprisingly, posted the highest number of any poll in the entire time window - no longer looks special.

***

Now, let's see what else we can do to enhance this chart. 

I don't like how bubble size is used to encode sample size. It creates a weird sensation for anyone who's familiar with sampling errors and confidence regions. The Pew poll, with 10 times the sample size, is the most reliable poll of them all. Reliability means the error bars around the Pew estimate are the smallest of them all. I tend to think of the area around a point estimate as showing the sampling error, so the Pew poll would be a dot, showing the high precision of that estimate.

But that won't work because larger bubbles catch more of the reader's attention. So, in the following version, all dots have the same size. I encode reliability in the opacity of the color: the darker dots are the more reliable polls, those with larger sample sizes.

Junkcharts_redo_cdcvaccineintent_opacity
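For anyone who wants to try this encoding, here is a minimal ggplot2 sketch; the poll data in it are made up for illustration, not the CDC's dataset:

```r
library(ggplot2)

# Hypothetical polls: date, share with positive intent, sample size
polls <- data.frame(
  date   = as.Date(c("2020-12-15", "2021-02-10", "2021-04-20",
                     "2021-06-15", "2021-08-05")),
  intent = c(0.71, 0.63, 0.66, 0.68, 0.70),
  n      = c(12000, 1100, 1000, 1500, 1200)
)

ggplot(polls, aes(x = date, y = intent, alpha = n)) +
  geom_point(size = 4, colour = "steelblue") +                # same size for all dots
  scale_alpha_continuous(range = c(0.25, 1), name = "sample size") +
  scale_y_continuous(labels = scales::percent, limits = c(0.5, 0.9)) +
  labs(x = NULL, y = "Positive vaccine intent")
```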

Two of the pollsters have more frequent polling than others. In this next version, I highlighted those two, which reveals the trend better.

Junkcharts_redo_cdcvaccineintent_opacitywithlines

Check your presumptions while you're reading this chart about Israel's vaccination campaign

On July 30, Israel began administering third doses of mRNA vaccines to targeted groups of people. This decision was controversial since there was no science to support it - by science, I mean actual evidence. The policymakers did have educated guesses by experts based on the best available information. Since no one had previously been given three shots, there could be no data on which anyone could root such a decision. Nevertheless, the pandemic does not always give us time to collect relevant data, and so speculative analysis has found its calling.

Dvir Aran, at Technion, has been diligently tracking the situation in Israel on his Twitter. Ten days after July 30, he posted the following chart, which immediately led many commentators to bounce out of their seats, crowning the third shot as a magic bullet. Notably, Dvir himself did not endorse such a claim. (See here to learn how other hasty conclusions by experts have fared.)

When you look at Dvir's chart, what do you see?

Dvir_aran_chart

Possibly one of the following two things, depending on what concern you have in your head.

1) The red line sits far above the other two lines, showing that unvaccinated people are much more likely to get infected.

2) The blue line diverges from the green line almost immediately after the 3rd shots started getting into arms, showing that the 3rd shot is super effective.

If you take another moment to look, you might start asking questions, as many in Twitter world did. Dvir was startlingly efficient at answering these queries.

A) Does the green line represent people with 2 or 3 doses, or is it strictly 2 doses? Aron asked this question and got the answer (the former):

AronBrand_israelcases_twoorthreedoses

It's time to check our presumptions. When you read that chart, did you presume the green line means exactly 2 doses, or 2 or 3 doses? Or did you immediately spot the ambiguity? As I said in this article, graphs attain efficiency at communication because the designer leverages unspoken rules - the chart conveys certain information without explicitly placing it on the chart. But this can backfire. In this case, I presumed the three lines to display three non-overlapping groups of people, and thus that the green line indicates those with 2 doses but not 3. That presumption led me to misinterpret what's on the chart.

B) What is the denominator of the case rates? Is it literal - by that I mean, all unvaccinated people for the red line, and all people with 3 doses for the blue line? Or is the denominator the population of Israel, the same number for all three lines? Lukas asked this question, and got the answer (the former).

Lukas_denominator

C) Since third shots are recommended for people 60 and over who were vaccinated at least 5 months ago, and most unvaccinated Israelis are below 60, this answer opens the possibility that the lines compare apples and oranges. Joe S. asked about this, and received an answer (all lines display only people 60 and over).

Joescholar_basepopulationquestion

Jason P. asked, and learned that the 5-month-out criterion is immaterial since 90% of the vaccinated have already reached that time point.

JasonPogue_5monthsout

D) We have even more presumptions. Like me, did you presume that the red line represents the "unvaccinated," meaning people who have not had any vaccine shots? If so, we may both be wrong about this. It has become the norm among vaccine researchers to lump "partially vaccinated" people in with the "unvaccinated," and call the combined group "unvaccinated." Here is an excerpt from a recent report from Public Health Ontario (link to PDF), which clearly states this unintuitive counting rule:

Ontario_case_definition

Notice that under this definition, someone who got infected within 14 days of the first shot is classified as an "unvaccinated" case, not a "partially vaccinated" case.

In the following tweet, Dvir gave a hint of what he plotted:

Dvir_group_definition

In a previous analysis, he averaged the rates of people with 0 doses and 1 dose, which is equivalent to combining them and calling them unvaccinated. It's unclear to me what he did with the 1-dose subgroup in our featured chart - did it just vanish from the chart? (How people and cases are classified into these groups is a major factor in all vaccine effectiveness calculations - a topic I covered here. Unfortunately, most published reports do a poor job explaining what the analysts did.)

E) Did you presume that all three lines are equally important? That's far from true. Since Israel is the world champion in vaccination, the bulk of the 60+ population forms the green line. I asked Dvir, and he responded that only 7.5%, or roughly 100K, are unvaccinated.

DvirAran_proportionofunvaccinated

That means about 1.2 million people are part of the green line - 12 times as many. There are roughly 50 cases per day among the unvaccinated, and 370 daily cases among those with 2 or 3 doses. In other words, vaccinated people account for almost 90% of all cases.
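The arithmetic, using the figures just quoted:

```r
370 / (370 + 50)   # ~0.88: the vaccinated account for almost 90% of cases
1200 / 100         # the vaccinated group is 12 times the size of the unvaccinated
```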

Yes, this is inevitable when over 90% of the age group have been vaccinated. (But it was also predictable from day one, when people blasted everywhere that real-world VE was proven by the fact that almost all new cases were among the unvaccinated.)

If your job is to minimize infections, you should be spending most of your time thinking about the 370 cases among the vaccinated rather than the 50 cases among the unvaccinated. If you halve each case rate, that's a difference of 185 cases versus 25. In Israel, the vaccination campaign has already succeeded; it's time to look forward, which is exactly why they are re-focusing on the already vaccinated.

***

If what you worry about most is the effectiveness of the original two-dose regimen, Dvir's chart raises a puzzle. Ignore the blue line, and remember that the green line already includes everybody represented by the blue line.

In the following chart, I removed the blue line, and added reference lines in dashed purple that correspond to 25%, 50% and 75% vaccine effectiveness. The data plotted on this chart are unadjusted case rates. A 75% effective vaccine cuts case rate by three quarters.

Junkcharts_dviraran_israel_threeshotschart
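For reference, those dashed lines come from the simple risk-ratio arithmetic - the very calculation I don't endorse - sketched here with made-up rates:

```r
# Unadjusted VE as one minus the ratio of case rates
ve <- function(rate_vaccinated, rate_unvaccinated) {
  1 - rate_vaccinated / rate_unvaccinated
}
ve(125, 500)   # 0.75: a 75%-effective vaccine cuts the case rate by three quarters
```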

This chart shows the 2-dose mRNA vaccine was nowhere near 90% effective. (As regular readers know, I don't endorse this simplistic calculation, and I have outlined its problems here, but this style of calculation keeps getting published and passed around. Those who use it to claim that real-world studies confirm the prior clinical trial outcomes can either (a) insist on using it and retract their earlier conclusions, or (b) admit that such a calculation was, and is, a bad take.)

Also observe how the vaccinated (green) line is moving away from the unvaccinated (red) line. The vaccine is apparently becoming more effective, which runs counter to the waning-effectiveness trend used by the Israeli government to justify third doses. This improvement also precedes the start of the third-shot campaign. When the analytical method is bad, it generates all sorts of spurious findings.

***

As Dvir said, it is premature to comment on the third doses based on 10 days of data. For one thing, the vaccine developers insist that their vaccines must be given 14 days to work. In a typical calculation, all of the cases on the blue line would fall outside the case-counting window. The effective number of cases attributed to the 3-dose group right now is zero, and the vaccine effectiveness under the standard methodology is 100% - even better than what the chart shows.

There is an alternative interpretation of this graph. Statisticians call it the selection effect. On July 30, the blue line split off from the green: some people were selected to receive the 3rd dose. This includes an official selection (the government makes certain subgroups eligible) as well as self-selection (within the eligible subgroup, certain people decide to get the 3rd shot earlier). If those who are less exposed to the virus, or more risk-averse, get the shots first, then all that is happening may be that we have split a high-VE subgroup off from the green line. Even if the third shot were useless, the selection effect alone could explain the gap.

Statistics is about grays. It's not either-or; it's usually some of each. If this feels like Groundhog Day, you're getting the picture. When they rolled out two doses, we lived through an optimistic period in which most experts rejoiced over 90-100% real-world effectiveness, and then, as more people got vaccinated, the effect washed away. The selection effect gradually disappears as vaccination becomes widespread. Are we starting a new cycle of hope and despair? We'll find out soon enough.


Hanging things on your charts

The Financial Times published the following chart that shows the rollout of vaccines in the U.K.

Ft_astrazeneca_uk_rollout

(I can't find the online link to the article. The article is titled "AstraZeneca and Oxford face setbacks and success as battle enters next phase", May 29/30 2021.)

This chart form is known as a "streamgraph", and it is a stacked area chart in disguise. 

The same trick can be applied to a column chart. See the "hanging" column chart below:

Junkcharts_hangingcolumns

The two charts show exactly the same data. The left one roots the columns at the bottom. The right one aligns the middle of the columns. 
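If you're curious how the hanging version is constructed, here is a minimal base-R sketch with made-up data; the key step is shifting each stacked column down by half its total:

```r
values <- matrix(c(3, 2, 1,
                   5, 3, 2,
                   2, 4, 3,
                   4, 1, 2),
                 nrow = 3)                     # 3 segments x 4 columns
totals <- colSums(values)

par(mfrow = c(1, 2))
barplot(values, ylim = c(0, max(totals)), main = "Stacked")
barplot(values, offset = -totals / 2,          # the "hanging" shift
        ylim = c(-1, 1) * max(totals) / 2, main = "Hanging")
```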

I have rarely found these hanging charts useful. The realignment makes it harder to compare the sizes of the different column segments. On the normal stacked column chart, the yellow segments are the easiest to compare because they share the same base level. Even this advantage is taken away from the reader in the hanging version.

Note also that the hanging version does not admit a meaningful vertical axis.

The same comments apply to the streamgraph.

***

Nevertheless, I was surprised that the FT chart shown above actually works. The main message I took away was that the U.K. initially rolled out primarily AstraZeneca and, to a lesser extent, Pfizer shots, while later it introduced other vaccines, including Johnson & Johnson, Novavax, CureVac, Moderna, and "Other".

I can also see that the supply of AstraZeneca has not changed much through the entire time window. Pfizer has grown to roughly the same scale as AstraZeneca. Moderna remains a small fraction of total shots. 

I can even roughly see that the total number of vaccinations has grown about six times from start to finish. 

That's quite a lot for one chart, so job well done!

There is one problem with the FT chart: it should have labelled the end of May as "today." Half the chart is history, and the other half is the future.

***

For those following Covid-19 news, the FT chart is informative in a different way.

There is a misleading claim going around that blames the U.K.'s recent surge in cases on the AstraZeneca vaccine, on the grounds that the U.K. mostly uses AZ. This chart shows that, from the start, about a third of the shots administered in the U.K. have been Pfizer, and Pfizer's share has been growing over time.

Here is the U.K. compared to some countries that mostly use mRNA vaccines:

Ourworldindata_cases

The U.K. is almost back to its winter peak. That's because the U.K. is serious about counting cases. Look at the state of testing in these countries:

Ourworldindata_tests

What's clear about the U.S. case count is that it is kept low by cutting the number of tests by two-thirds; our data are once again severely biased towards severe cases.

We can do a back-of-the-envelope calculation. The drop in testing may directly lead to a proportional drop in reported cases, removing perhaps 500 (asymptomatic or mild) cases per million from the case count. The reported count has fallen below 250 per million, so the additional reduction of 200 or so cases is due to other reasons, such as vaccinations.


Start at zero improves this chart but only slightly

The following chart was forwarded to me recently:

Average_female_height

It's a good illustration of why the "start at zero" rule exists for column charts. The poor Indian lady looks extremely short in this women's club. Is the average Indian woman really half as tall as the average South African woman? (Surely not!)

Junkcharts_redo_womenheight_column

The problem is only superficially fixed by starting the vertical axis at zero. Doing so highlights the fact that the difference in average heights is but a fraction of the average heights themselves. The between-country differences are squashed in such a representation - which works against the primary goal of the data visualization itself.

Recall the Trifecta Checkup. At the top of the trifecta is the Question. The designer obviously wants to focus our attention on the difference of the averages. A column chart showing average heights fails the job!

This "proper" column chart sends the message that the difference in average heights is noise, unworthy of our attention. But this is a bad take on the underlying data. The range of average heights across countries isn't that wide, by virtue of large population sizes.

According to Wikipedia, the averages range from 4 ft 10.5 in to 5 ft 6 in (I'm ignoring several entries in the table based on non-representative small samples). How do we know that the 2-inch difference between the averages of South Africa and India is a sizable difference? The Wikipedia table has average heights for most of the world's countries - perhaps 200 values, sprinkled inside a range of about 8 inches top to bottom. If we divide the full range into 10 equal bins, each bin is roughly 0.8 inches wide, so two numbers that are 2 inches apart are about two and a half bins apart. If the data were evenly distributed, that's a huge shift.

(In reality, the data should be normally distributed, bell-shaped, with much more mass at the center than on the edges. That makes a difference of 2 inches even more significant if these are values near the center, and less significant if these are extreme values on the tails. Stats students should be able to articulate why we are sure the data are normally distributed without having to plot them.)

***

The original chart has further problems.

Another source of distortion comes from the scaling of the stick figures. The aspect ratio is preserved, which means the widths are scaled along with the heights. Since the heights already encode the data, the data are encoded twice - the second time in the widths - and the areas of these figures grow with the square of the heights. (Contrast this with the scaling discussed in my earlier post this week, which preserves the relative areas.)

At the end of that last post, I discussed why adding colors to a chart when the colors do not encode any data distracts the reader. This average height chart is an example.

From the Data corner of the Trifecta Checkup, I'm intrigued by the choice of countries. Why is Scotland highlighted instead of the U.K.? Why Latvia? According to Wikipedia, the Latvia estimate is based on a 1% sample of only 19 year olds.

Some of the data appear to be incorrect (or the designer used a different data source). Wikipedia lists the average height of Latvian women as 5 ft 6.5 in, while the chart shows 5 ft 5 in. Peru's average female height is listed as 4 ft 11.5 in, and the male average as 5 ft 4.5 in; the chart shows 5 ft 4 in.

***

Lest we think only amateurs make this type of chart, here is an example of a similar chart in a scientific research journal:

Fnhum-14-00338-g007

(link to original)

I have seen many versions of the above column charts with error bars, where the vertical axes do not start at zero. In every case, the heights (and areas) of the columns do not scale with the underlying data.

***

I tried a variant of the stem-and-leaf plot:

Junkcharts_redo_womenheight_stemleaf

The scale is chosen to reflect the full range of average heights given in Wikipedia. The chart would work better with more countries to fill out the distribution. It shows that India is on the short end of the scale but not quite the lowest. (As mentioned above, Peru should actually be placed close to the lower edge.)
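Incidentally, base R has a quick stem-and-leaf display built in; here is a minimal sketch with made-up heights in inches (not the Wikipedia values):

```r
avg_heights <- c(59.5, 60.2, 61.0, 61.8, 62.4, 62.9, 63.3,
                 63.7, 64.1, 64.8, 65.2, 66.0)
stem(avg_heights)   # prints a text stem-and-leaf plot to the console
```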

Reading this chart won't take as long as withdrawing troops from Afghanistan

Art sent me the following Economist chart, noting how hard it is to understand. I took a look, and agreed. It's an example of a visual representation that takes more time to comprehend than the underlying data.

Econ_theendisnear

The chart presents responses to 3 questions on a survey. For each question, the choices are Approve, Disapprove, and "Neither" (just picking a word since I haven't seen the actual survey question). The overall approval/disapproval rates are presented, and then broken into two subgroups (Democrats and Republicans).

The first hurdle is reading the scale. Because the section from 75% to 100% has been removed, we are left with the labels 0, 25, 50, 75, which do not read as percentages unless we've consumed the title and subtitle. The Economist style guide places the units of the data in the subtitle instead of on the axis itself.

Our attention is drawn to the thick lines, which represent the differences between approval and disapproval rates. These differences are signed: it matters whether the proportion approving is higher or lower than the proportion disapproving. This means the data are encoded in the order of the dots plus the length of the line segment between them.

The two bottom rows of the Afghanistan question demonstrate this mental challenge. Our brains have to process the following visual cues:

1) the two lines are about the same lengths

2) the Republican dots are shifted to the right by a little

3) the colors of the dots are flipped

What do they all mean?

Econ_theendofforever_subset

A chart runs into trouble when you need a paragraph to explain how to read it.

It's sometimes alright to make a complicated data visualization that illustrates complicated concepts. What justifies it is the payoff. I wrote about the concept of return on effort in data visualization here.

The payoff for this chart escaped me. Take the Democratic response to troop withdrawal. About three quarters of Democrats approve while 15% disapprove. The thick line says the share of Democrats who approve is 60 percentage points larger than the share who disapprove.

***

Here, I show the full axis and add a 50% reference line:

Junkcharts_redo_econ_theendofforever_1

Small edits, but they help the reader see "half of" and "three quarters of".

***

Next, I switch to the more conventional stacked bars.

Junkcharts_redo_econ_theendofforever_stackedbars

This format reveals some of the data hidden in the original chart - the proportions answering neither approve nor disapprove, and neither yes nor no.

On the stacked bar visual, the proportions are counted from both ends, while in the dot plot above, the proportions are measured from the left end only.

***

Read all my posts about Economist charts here.

The time has arrived for cumulative charts

Long-time reader Scott S. asked me about this Washington Post chart that shows the disappearance of pediatric flu deaths in the U.S. this season:

Washingtonpost_pediatricfludeaths

The dataset behind this chart is highly favorable to the designer because the signal in the data is so strong. This is a good chart. The key point is shown clearly right at the top, with an informative title. Gridlines are very restrained. I'd draw attention to the horizontal axis: the master stroke here is omitting the week labels, which would likely confuse all but the people familiar with this dataset.

Scott suggested using a line chart. I agree, especially if we plot cumulative counts rather than weekly deaths. Here's a quick sketch of such a chart:

Junkcharts_redo_wppedflu_panel

(On second thought, I'd remove the week numbers from the horizontal axis, and just go with the month labels. The Washington Post designer is right in realizing that those week numbers are meaningless to most readers.)

The vaccine trials have brought this cumulative-count chart form to the mainstream. For anyone who has seen the vaccine efficacy charts, the interpretation of the panel of line charts should come naturally.

Instead of four plots, I prefer one plot with four superimposed lines. Like this:

Junkcharts_redo_wppeddeaths_superpose2
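To sketch the cumulative view yourself, the transformation is a one-liner; the weekly counts below are made up, not the Post's data:

```r
weekly <- c(0, 1, 2, 3, 5, 8, 10, 9, 7, 4, 2, 1)   # hypothetical deaths per week
plot(cumsum(weekly), type = "s",                    # running total, step line
     xlab = "Week of season", ylab = "Cumulative pediatric flu deaths")
```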

Vaccine researchers discard the start-at-zero rule

I struggled to decide which blog to put this post on. The reality is that it bridges the graphical and analytical sides of me. But I ultimately placed it on the dataviz blog because that's where today's story starts.

Data visualization has few set-in-stone rules. If pressed for one, I'd likely cite the "start-at-zero" rule, which has featured regularly on Junk Charts (here, here, and here, for example). This rule only applies to a bar chart, where the heights (and thus, areas) of the bars should encode the data.

Here is a stacked column chart that earns boos from us:

Kfung_stackedcolumn_notstartingatzero_0

I made it, so I'm downvoting myself. What's wrong with this chart? The vertical axis starts at 42 instead of zero. I've cropped exactly 42 units off each column. Therefore, the column heights are no longer proportional to the data: 42 is 84% of column A but only 19% of column B. By shifting the vertical axis, I've made column B dwarf column A. For comparison, I added a second chart whose axis starts at zero.

Kfung_stackedcolumn_notstartatzero

On the right side, column B is 22 times the height of column A. On the left side, it is about 4 times as high. Both are really the same chart, except one has its legs chopped off.
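To reproduce the effect, here's a quick sketch; I reverse-engineered the values of A and B from the percentages quoted above (42 is 84% of A and 19% of B):

```r
a <- 50; b <- 221
par(mfrow = c(1, 2))
barplot(c(A = a, B = b), main = "Axis starts at 0")           # B is ~4.4x A
barplot(c(A = a, B = b) - 42, main = "42 units chopped off")  # B looks ~22x A
```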

***

Now, let me reveal the data behind the above chart. It is a re-imagination of the famous cumulative case curve from the Pfizer vaccine trial.

Pfizerfda_figure2_cumincidencecurves

I transferred the data to a stacked column chart. Each column block shows the incremental cases observed in a given week of the trial. All the blocks stacked together rise to the total number of cases observed by the time the interim analysis was presented to the FDA.

Observe that in the cumulative cases chart, the count starts at zero on Day 0 (first dose). This means the chart corresponds to the good stacked column chart, with the x-axis starting from zero on Day 0.

Kfung_pfizercumcases_stackedcolumn

The Pfizer chart above is, however, disconnected from the oft-chanted 95% vaccine efficacy number. You can't find that number anywhere on it. Yes, everyone has been lying to you. In a previous post, I did the math: if you trace the vaccine efficacy throughout the trial, you end up at about 80% toward the right, not 95%.

Pfizer_cumcases_ve_vsc_published

How can they conclude VE is 95% but show a chart that never reaches that level? The chart was created for a "secondary" analysis, included in the report for completeness. The FDA and the researchers decided long ago, before the trials started enrolling people, that they don't care about the cumulative case curve starting on Day 0. The "primary" analysis counts cases starting 7 days after the second shot, which means Day 29.

The first week that concerns the FDA is Days 29-35 (for Pfizer's vaccine). The vaccine arm saw 41 cases in the first 28 days of the trial. In effect, the experts chop the knees off the column chart. When they talk about 95% VE, they are looking at the column chart with the axis starting at 42.

Kfung_pfizercumcases_stackedcolumn_chopped

Yes, that deserves a boo.

***

It's actually even worse than that, if you could believe it.

The most commonly cited excuse for the knee-chop is that any vaccine is expected to be useless in the first X days (X being determined after the trial ends, when they analyze the data). A recently published "real world" analysis of the situation in Israel contains a lengthy defense of this tactic, in which the authors state:

Strictly speaking, the vaccine effectiveness based on this risk ratio overestimates the overall vaccine effectiveness in our study because it does not include the early follow-up period during which the vaccine has no detectable effect (and thus during which the ratio is 1). [Appendix, Supplement 4]

Assuming VE = 0 prior to day X is equivalent to stipulating that the number of cases found in the vaccine arm is the same (within margin of error) as the number of cases in the placebo arm during the first X days.

That assumption is refuted by the Pfizer trial (and every other trial that has results so far.)

The Pfizer/BioNTech vaccine was not useless during the first week: it was not 95% efficacious, but more like 16%. In the second week, it improved to 33%, and so on. (See the VE curve I plotted above for the Pfizer trial.)

What happened is that all the weeks before the VE plateaued were dropped.

***

So I was simplifying the picture by chopping same-size blocks from both columns in the stacked column chart. Contrary to the no-effect assumption, the blocks chopped from the bottoms of the two columns are of different sizes: much more was chopped from the placebo arm than from the vaccine arm.

You'd think that would unjustifiably favor the placebo. Not true! Since almost all of the cases on the vaccine arm were removed, the remaining cases on the placebo arm are now many multiples of those on the vaccine arm.

The following shows what VE would have been reported if they had started counting cases on day X. The first chart counts all cases from the first shot. The second chart removes the first two weeks of cases, corresponding to the analysis other pharmas have done, namely evaluating efficacy from 14 days after the first dose. The third chart removes even more cases, and represents what happens if the analysis is conducted from the second dose. The fourth chart is the official Pfizer analysis, which begins counting 7 days after the second shot. Finally, the fifth chart shows the analysis beginning 14 days after the second shot, the window selected by Moderna and AstraZeneca.

Kfung_howvaccinetrialsanalyzethedata
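To see the knee-chop effect in miniature, here is a sketch with hypothetical weekly case counts (equal arm sizes assumed; these are not Pfizer's numbers, but they follow the same pattern):

```r
vaccine <- c(21, 10, 5, 2, 1, 1, 1)      # cases per week, vaccine arm
placebo <- c(25, 15, 14, 13, 12, 12, 12) # cases per week, placebo arm
ve_from_week <- function(w) 1 - sum(vaccine[w:7]) / sum(placebo[w:7])
round(sapply(1:4, ve_from_week), 2)      # 0.60 0.74 0.84 0.90: VE "improves"
```

The later the counting window starts, the higher the reported VE, even though the underlying data never change.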

The premise that any vaccine is completely useless for a period after administration is refuted by the actual data. By starting analysis windows at some arbitrary time, the researchers make it unnecessarily difficult to compare trials. Selecting the time of analysis based on the results of a single trial is the kind of post-hoc analysis that statisticians have long warned leads to over-estimation. It's equivalent to making the vertical axis of a column chart start above zero in order to exaggerate the relative heights of the columns.

P.S. [3/1/2021] See comment below. I'm not suggesting vaccines are useless. They are still a miracle of science. I believe the desire to report a 90% VE number is counterproductive. I don't understand why a 70% or 80% effective vaccine is shameful. I really don't.


A note to science journal editors: require better visuals

In reviewing a new small-scale study of the Moderna vaccine, I found this chart:

Modernahalfdoses_fig3a

This style of chart is quite common in scientific papers, and it is horrible. It irks me to think that some authors are forced to adopt such styles.

The study's main goal is to compare two half doses to two full doses of the Moderna vaccine. (To understand the science, read the post on my book blog.) The participants were stratified by age group. The vaccine is expected to work better for younger people than for older people. The point of the study isn't to measure the difference by age group, and so the age-group dimension is secondary.

Upon recognizing that, I reduced the number of colors from 4 to 2:

Junkcharts_redo_modernahalfdoses_1

Halving the number of colors presents no additional difficulty. The reader spends less time cross-referencing.

The placement of the Pbo (placebo) and Conv (convalescent plasma) columns on the sides is both unsightly and suboptimal. The "Conv" level serves as a reference for the amount of antibodies the vaccine stimulates in people. A better way to display reference levels is to use reference lines.

Junkcharts_redo_modernahalfdoses_2color

The biggest problem with the chart is the log scale on the vertical axis. This isn't even log-10 but log-2 (each tick is a doubling of value).

Take the first set of columns as an example. The second column is visually less than twice the height of the first column, and yet 25 is about 3.5 times 7. The third column is also visually less than double the height of the second column, and yet 189 is about 7.5 times 25. The heights of the columns do not convey the right information about the relative sizes of the underlying data.
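The arithmetic of the log-2 scale explains the compression:

```r
# On a log-2 axis, heights encode log2(value), so ratios become differences
log2(25 / 7)     # ~1.8: under 2 ticks taller, although 25 is over 3.5 times 7
log2(189 / 25)   # ~2.9: under 3 ticks taller, although 189 is over 7.5 times 25
```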

Here's an amusing observation: the brown area shaded below is half of the entire area of the chart, if we reverted to a linear scale. And yet there is not a single data point above 250 in the dataset, so the brown area is entirely empty.

Junkcharts_redo_modernahalfdoses_logscale

An effect of a log scale is to compress the larger values of a dataset. That's what you're seeing here.

I now revisualize using dotplots:

Junkcharts_redo_modernahalfdoses_dotplotlinear

The version on the left retains the log scale while the right one (pun intended) reverts to the linear scale.

The biggest effect by far is the spike of antibodies between day 29 and day 43 - that is, after the second shot is administered. (For Moderna, the second shot is targeted for day 28.) In fact, it is during that window that the level of antibodies went from below the "Conv" level (i.e., the level from natural infection) to far above it.

The log-scale version buries this finding because it squeezes the large numbers on the chart. In addition, it artificially pulls the small numbers toward the "Conv" level. On the right chart, the second dot for the 18-54, full-dose group is only at half the level of "Conv," but it looks tantalizingly close to the "Conv" level on the left chart.

The authors of the study also claim that there is negligible dropoff by 30 days after the second dose, i.e., between the third and fourth dots in each set. That may appear so on the log-scale chart, but on the linear chart, we see a moderate reduction. I don't believe the size of this study allows us to draw a stronger conclusion, but the claim of no dropoff is dubious.

The left chart also obscures the age-group differences. It appears as if all four sets show roughly the same pattern. With the linear scale, we notice that the vaccine clearly works better for the younger subgroup. As I discussed on the book blog, no one actually knows what level of antibodies constitutes "protection," and so I can't say whether that age-group difference has practical significance.

***

I recommend using log scales sparingly and carefully. They are a source of much mischief and misadventure.

These are the top posts of 2020

It's always very interesting as a writer to look back at a year's worth of posts and find out which ones were most popular with my readers.

Here are the top posts on Junk Charts from 2020:

How to read this chart about coronavirus risk

This post about a New York Times scatter plot dates from February, a time when many Americans were debating whether Covid-19 was just the flu.

Proportions and rates: we are no dupes

This post about an Ars Technica chart on the effects of Covid-19 by age is an example of designing the visual to reflect the structure of the data.

When the pie chart is more complex than the data

This post shows a 3D pie chart which is worse than a 2D pie chart.

Twitter people upset with that Covid symptoms diagram

This post discusses some complicated graphics designed to illustrate complicated datasets on Covid-19 symptoms.

Cornell must remove the logs before it reopens in the fall

This post is another warning to think twice before you use log scales.

What is the price of objectivity?

This post turns an "objective" data visualization into a piece of visual story-telling.

The snake pit chart is the best election graphic ever

This post introduces my favorite U.S. presidential election graphic, designed by the FiveThirtyEight team.

***

Here is a list of posts that deserve more attention:

Locating the political center

An example of bringing readers as close to the insights as possible

Visualizing change over time

An example of designing data visualization to reflect the structure of multivariate data

Bloomberg made me digest these graphics slowly

An example of simple and thoughtful graphics

The hidden bad assumption behind most dual-axis time-series charts

Read this before you make a dual-axis chart

Pie chart conventions

Read this before you make a pie chart

***
Looking forward to bringing you more content in 2021!

Happy new year.