Check your presumptions while you're reading this chart about Israel's vaccination campaign

On July 30, Israel began administering third doses of mRNA vaccines to targeted groups of people. This decision was controversial since there is no science to support it. The policymakers do have educated guesses by experts based on best-available information. By science, I mean actual evidence. Since no one has previously been given three shots, there can be no data on which to base such a decision. Nevertheless, the pandemic does not always give us time to collect relevant data, and so speculative analysis has found its calling.

Dvir Aran, at Technion, has been diligently tracking the situation in Israel on his Twitter. Ten days after July 30, he posted the following chart, which immediately led many commentators to bounce out of their seats crowning the third shot as a magic bullet. Notably, Dvir himself did not endorse such a claim. (See here to learn how other hasty conclusions by experts have fared.)

When you look at Dvir's chart, what do you see?

Dvir_aran_chart

Possibly one of the following two things, depending on which concern is foremost in your mind.

1) The red line sits far above the other two lines, showing that unvaccinated people are much more likely to get infected.

2) The blue line diverges from the green line almost immediately after the 3rd shots started getting into arms, showing that the 3rd shot is super effective.

If you take another moment to look, you might start asking questions, as many in Twitter world did. Dvir was startlingly efficient at answering these queries.

A) Does the green line represent people with 2 or 3 doses, or is it strictly 2 doses? Aron asked this question and got the answer (the former):

AronBrand_israelcases_twoorthreedoses

It's time to check our presumptions. When you read that chart, did you presume it's exactly 2 doses or did you presume it's 2 or 3 doses? Or did you immediately spot the ambiguity? As I said in this article, graphs attain efficiency at communication because the designer leverages unspoken rules - the chart conveys certain information without explicitly placing it on the chart. But this can backfire. In this case, I presumed the three lines to display three non-overlapping groups of people, and thus the green line indicates those with 2 doses but not 3. That presumption led me to misinterpret what's on the chart.

B) What is the denominator of the case rates? Is it literal - by that I mean, all unvaccinated people for the red line, and all people with 3 doses for the blue line? Or is the denominator the population of Israel, the same number for all three lines? Lukas asked this question, and got the answer (the former).

Lukas_denominator

C) Since third shots are recommended for people 60 and over who were vaccinated at least 5 months ago, and most unvaccinated Israelis are below 60, this answer opens the possibility that the lines compare apples and oranges. Joe S. asked about this, and received an answer (all lines display only people 60 and over).

Joescholar_basepopulationquestion

Jason P. asked, and learned that the 5-month-out criterion is immaterial since 90% of the vaccinated have already reached that time point.

JasonPogue_5monthsout

D) We hold even more presumptions. Like me, did you presume that the red line represents the "unvaccinated," meaning people who have not had any vaccine shots? If so, we may both be wrong. It has become the norm among vaccine researchers to lump "partially vaccinated" people in with the "unvaccinated," and to call the combined group "unvaccinated." Here is an excerpt from a recent report from Public Health Ontario (link to PDF), which clearly states this unintuitive counting rule:

Ontario_case_definition

Notice that in this definition, someone who got infected within 14 days of the first shot is classified as an "unvaccinated" case, not a "partially vaccinated" case.

In the following tweet, Dvir gave a hint of what he plotted:

Dvir_group_definition

In a previous analysis, he averaged the rates of people with 0 doses and 1 dose, which is equivalent to combining them and calling them unvaccinated. It's unclear to me what he did to the 1-dose subgroup in our featured chart - did it just vanish from the chart? (How people and cases are classified into these groups is a major factor in all vaccine effectiveness calculations - a topic I covered here. Unfortunately, most published reports do a poor job explaining what the analysts did).

E) Did you presume that all three lines are equally important? That's far from true. Since Israel is the world champion in vaccination, the bulk of the 60+ population forms the green line. I asked Dvir and he responded that only 7.5%, or roughly 100K, are unvaccinated.

DvirAran_proportionofunvaccinated

That means about 1.2 million people form the green line - 12 times as many as the red line. There are roughly 50 cases per day among the unvaccinated, and 370 daily cases among those with 2 or 3 doses. In other words, vaccinated people account for almost 90% of all cases.

Yes, this is inevitable when over 90% of the age group have been vaccinated. (And it was predictable from day one, when people blasted everywhere that real-world VE was proven by the fact that almost all new cases were among the unvaccinated.)

If your job is to minimize infections, you should be spending more of your time thinking about the 370 cases among the vaccinated than the 50 cases among the unvaccinated. If you halve each case rate, that's a difference of 185 cases versus 25. In Israel, the vaccination campaign has already succeeded; it's time to look forward, which is exactly why they are re-focusing on the already vaccinated.
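To make the base-rate arithmetic concrete, here's a quick Python sketch using the rounded figures quoted above (the counts are approximations from the tweets, not official statistics):

    # Rounded figures from the discussion above - approximations, not official data.
    unvax_pop, vax_pop = 100_000, 1_200_000
    unvax_cases, vax_cases = 50, 370

    # Vaccinated people dominate the raw case counts...
    share_vax = vax_cases / (vax_cases + unvax_cases)
    print(f"share of cases among vaccinated: {share_vax:.0%}")   # ~88%

    # ...but the per-capita rates still favor the vaccinated.
    print(f"cases per 100K, unvaccinated: {unvax_cases / unvax_pop * 100_000:.0f}")  # 50
    print(f"cases per 100K, vaccinated:   {vax_cases / vax_pop * 100_000:.0f}")      # ~31

The unvaccinated rate is still higher per capita, which is why the red line sits on top even though the vaccinated contribute most of the cases.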

***

If what you worry about most is the effectiveness of the original two-dose regimen, Dvir's chart raises a puzzle. Ignore the blue line, and remember that the green line already includes everybody represented by the blue line.

In the following chart, I removed the blue line, and added reference lines in dashed purple that correspond to 25%, 50% and 75% vaccine effectiveness. The data plotted on this chart are unadjusted case rates. A 75% effective vaccine cuts case rate by three quarters.

Junkcharts_dviraran_israel_threeshotschart

This chart shows the 2-dose mRNA vaccine was nowhere near 90% effective. (As regular readers know, I don't endorse this simplistic calculation and have outlined the problems here, but this style of calculation keeps getting published and passed around. Those who use it to claim real-world studies confirm prior clinical trial outcomes can either (a) insist on using it and retract their earlier conclusions, or (b) admit that such a calculation was, and is, a bad take.)
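For concreteness, here's a minimal Python sketch of how such reference lines are constructed; the unvaccinated rates below are placeholders, not the Israeli data:

    # Placeholder daily case rates per 100K among the unvaccinated.
    unvax_rates = [40, 45, 50, 55, 60]

    # A vaccine with effectiveness VE cuts the case rate by that fraction:
    # a 75% effective vaccine leaves 25% of the unvaccinated rate.
    for ve in (0.25, 0.50, 0.75):
        ref_line = [round((1 - ve) * r, 1) for r in unvax_rates]
        print(f"VE {ve:.0%} reference line: {ref_line}")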

Also observe how the vaccinated (green) line is moving away from the unvaccinated (red) line. The vaccine apparently is becoming more effective over time, which runs counter to the waning-effectiveness narrative used by the Israeli government to justify third doses. This improvement also predates the start of the third-shot campaign. When the analytical method is bad, it generates all sorts of spurious findings.

***

As Dvir said, it is premature to comment on the third doses based on 10 days of data. For one thing, the vaccine developers insist that their vaccines must be given 14 days to work. In a typical calculation, all of the cases in the blue line so far would fall outside the case-counting window. The effective number of cases attributed to the 3-dose group right now is zero, and the vaccine effectiveness using the standard methodology is 100%, even better than shown in the chart.

There is an alternative interpretation of this graph. Statisticians call this the selection effect. On July 30, the blue line split off from the green: some people were selected to receive the 3rd dose. This includes an official selection (the government makes certain subgroups eligible) as well as self-selection (within the eligible subgroup, certain people decide to get the 3rd shot earlier). If those who are less exposed to the virus, or more risk-averse, get the shots first, then all that is happening may be that we have split off a high-VE subgroup from the green line. Even if the third shot were useless, the selection effect alone could explain the gap.
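Here's a toy simulation of that selection story. Every number in it is invented, and the third shot is assumed to do nothing at all:

    import random
    from statistics import mean

    random.seed(1)

    # Toy population of two-dose people with heterogeneous infection risk.
    risks = [random.uniform(0.001, 0.02) for _ in range(100_000)]

    # Self-selection: suppose the lower-risk half rushes to get boosted first.
    risks.sort()
    boosted, not_boosted = risks[:50_000], risks[50_000:]

    # The booster has zero effect in this toy world, yet a gap opens up.
    print(f"expected case rate, boosted:     {mean(boosted):.4f}")
    print(f"expected case rate, not boosted: {mean(not_boosted):.4f}")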

Statistics is about grays. It's not either-or. It's usually some of each. If you feel like it's Groundhog Day, you're getting the picture. When they rolled out two doses, we lived through an optimistic period in which most experts rejoiced about 90-100% real-world effectiveness, and then, as more people got vaccinated, the effect washed away. The selection effect gradually disappears as vaccination becomes widespread. Are we starting a new cycle of hope and despair? We'll find out soon enough.


Ranking data provide context but can also confuse

This dataviz from the Economist had me spending a lot of time clicking around - which means it is a success.

Econ_usaexcept_hispanic

The graphic presents four measures of wellbeing in society - life expectancy, infant mortality rate, murder rate and prison population. The primary goal is to compare nations across those metrics. The focus is on comparing how certain nations (or subgroups) rank against each other, as indicated by the relative vertical position.

The Economist staff has a particular story to tell about racial division in the US. The dotted bars represent the U.S. average. The colored bars are the averages for Hispanic, white and black Americans. The wider the gap between the colored bars, the more divergent the experiences of the different races.

The chart shows that the racial gap of life expectancy is the widest. For prison population, the U.S. and its racial subgroups occupy many of the lowest (i.e. least desirable) ranks, with the smallest gap in ranking.

***

The primary element of interactivity is hovering on a bar, which then highlights the four bars corresponding to the particular nation selected. Here is the picture for Thailand:

Econ_usaexcept_thailand

According to this view of the world, Thailand is a close cousin of the U.S. On each metric, the Thai value sits near the U.S. average and within the range spanned by the American racial groups. I'm surprised to learn that the prison population in Thailand is among the highest in the world.

Unfortunately, this chart form doesn't facilitate comparing Thailand to a country other than the U.S., as one can highlight only one country at a time.

***

While the main focus of the chart is on relative comparison through ranking, the reader can extract absolute difference by reading the lengths of the bars.

This is a close-up of the bottom of the prison population metric:

Econ_useexcept_prisonpop_bottom

The length of each bar displays the numeric data. The red line is an outlier in this dataset. Black Americans suffer an incarceration rate that is almost three times the national average. Even white Americans (blue line) are imprisoned at a rate higher than in most countries around the world.

As noted above, the prison population metric exhibits the smallest gap between racial subgroups. This chart is a great example of why ranking data frequently hide important information. The small gap in ranking masks the extraordinary absolute difference in incarceration rates between white and black America.

The difference between rank #1 and rank #2 is enormous.

Econ_useexcept_lifeexpect_top

The opposite situation appears for life expectancy. The life expectancy values are bunched up, especially at the top of the scale. The absolute difference between Hispanic and black America is 82 - 75 = 7 years, which looks small because the axis starts at zero. On a ranking scale, Hispanic America is roughly in the top 15% while black America is just above the median. The relative difference is huge.

For life expectancy, ranking conveys the view that even a 7-year difference is a big deal because the countries are tightly bunched together. For prison population, ranking conveys the view that a multiple-fold difference is "unimportant" because a 20-0 blowout and a 10-0 blowout are both heavy defeats.

***

Whenever you transform numeric data to ranks, remember that you are artificially treating the gap between each value and the next value as a constant, even when the underlying numeric gaps show wide variance.
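Here's a small Python sketch of that distortion, with made-up numbers:

    # Made-up values with very uneven gaps between neighbors (sorted descending).
    values = [82.0, 81.8, 81.5, 75.0, 60.0]

    value_gaps = [round(a - b, 1) for a, b in zip(values, values[1:])]
    ranks = list(range(1, len(values) + 1))   # 1, 2, 3, 4, 5
    rank_gaps = [b - a for a, b in zip(ranks, ranks[1:])]

    print(value_gaps)   # [0.2, 0.3, 6.5, 15.0] - wildly uneven
    print(rank_gaps)    # [1, 1, 1, 1] - forced to be constant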


Probabilities and proportions: which one is the chart showing?

The New York Times showed this chart (link):

Nyt_unvaccinated_undeterred

My first read: oh my gosh, 40-50% of the unvaccinated Americans are living their normal lives - dining at restaurants, assembling with more than 10 people, going to religious gatherings.

After reading the text around this chart, I realize I have misinterpreted it.

The chart should be read by columns. Each column is a "pie chart". For example, the first column shows that half the restaurant diners are not vaccinated, a third are fully vaccinated, and the remainder are partially vaccinated. The other columns have roughly the same proportions.

The author says "The rates of vaccination among people doing these activities largely reflect the rates in the population." This line is perhaps more confusing than intended. What she's saying is that in the general population, half of us are unvaccinated, a third are fully vaccinated, and the remainder are partially vaccinated.

Here's a picture:

Junkcharts_redo_nyt_unvaccinatedundeterred

What this chart is saying is that the people dining out are like a random sample of all Americans. So are the other groups depicted. What Americans choose to do is independent of their vaccination status.

Unvaccinated people are no less likely to be doing all these activities than the fully vaccinated. This raises the question: are half of the people not wearing masks outdoors unvaccinated?

***

Why did I read the chart wrongly in the first place? It has to do with expectations.

Most survey charts plot probabilities not proportions. I haphazardly grabbed the following Pew Research chart as an example:

Pew_kids_socialmedia

From this chart, we learn that 30% of kids aged 9-11 use TikTok compared to 11% of kids aged 5-8. The percentages down a column do not sum to 100%.
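The difference between the two readings boils down to which way the table is normalized. Here's a quick pandas sketch with invented survey counts:

    import pandas as pd

    # Invented counts: activities in columns, vaccination status in rows.
    counts = pd.DataFrame(
        {"dined out": [500, 170, 330], "large gathering": [480, 190, 330]},
        index=["unvaccinated", "partially", "fully"],
    )

    # The NYT reading: each column normalized to 100% (a "pie chart" per column).
    print(counts / counts.sum())

    # The Pew-style reading needs a different denominator: the size of each
    # status group (also invented), i.e. the share of each group doing the activity.
    group_sizes = pd.Series({"unvaccinated": 1000, "partially": 400, "fully": 700})
    print(counts.div(group_sizes, axis=0))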



Vaccine researchers discard the start-at-zero rule

I struggled to decide on which blog to put this post. The reality is it bridges the graphical and analytical sides of me. But I ultimately placed it on the dataviz blog because that's where today's story starts.

Data visualization has few set-in-stone rules. If pressed for one, I'd likely cite the "start-at-zero" rule, which has featured regularly on Junk Charts (here, here, and here, for example). This rule applies only to bar charts, where the heights (and thus, areas) of the bars encode the data.

Here is a stacked column chart that earns boos from us:

Kfung_stackedcolumn_notstartingatzero_0

I made it, so I'm downvoting myself. What's wrong with this chart? The vertical axis starts at 42 instead of zero. I've cropped out exactly 42 units from each column, so the column heights (and areas) are no longer proportional to the data. Forty-two is 84% of column A but only 19% of column B. By shifting the baseline, I've made column B dwarf column A. For comparison, I added a second chart that starts the axis at zero.

Kfung_stackedcolumn_notstartatzero

On the right side, column B is 22 times the height of column A. On the left side, it is about 4 times as high. Both are really the same chart, except one has its legs chopped off.
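If you want to reproduce the effect, here's a minimal matplotlib sketch. For simplicity it uses single bars rather than stacked columns, with values consistent with the percentages quoted above (42 is 84% of A and 19% of B):

    import matplotlib.pyplot as plt

    values = [50, 221]   # B is about 4.4 times A

    fig, (honest, chopped) = plt.subplots(1, 2, figsize=(8, 4))
    for ax in (honest, chopped):
        ax.bar(["A", "B"], values)

    honest.set_ylim(0, 240)    # bar heights proportional to the data
    chopped.set_ylim(42, 240)  # legs chopped: B now looks about 22 times A
    plt.show()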

***

Now, let me reveal the data behind the above chart. It is a re-imagination of the famous cumulative case curve from the Pfizer vaccine trial.

Pfizerfda_figure2_cumincidencecurves

I transferred the data to a stacked column chart. Each column block shows the incremental cases observed in a given week of the trial. All the blocks stacked together rise to the total number of cases observed by the time the interim analysis was presented to the FDA.

Observe that in the cumulative cases chart, the count starts at zero on Day 0 (first dose). This means the chart corresponds to the good stacked column chart, with the vertical axis starting at zero on Day 0.

Kfung_pfizercumcases_stackedcolumn

The Pfizer chart above is, however, disconnected from the oft-chanted 95% vaccine efficacy number. You can't find that number anywhere on it. Yes, everyone has been lying to you. In a previous post, I did the math: if you trace the vaccine efficacy throughout the trial, you end up at about 80% toward the right, not 95%.

Pfizer_cumcases_ve_vsc_published

How can they conclude VE is 95% but show a chart that never reaches that level? The chart was created for a "secondary" analysis, included in the report for completeness. The FDA and the researchers decided long ago, before the trials started enrolling people, that they don't care about the cumulative case curve starting on Day 0. The "primary" analysis counts cases starting 7 days after the second shot, which means Day 29.

The first week that concerns the FDA is Days 29-35 (for Pfizer's vaccine). The vaccine arm saw 42 cases in the first 28 days of the trial. In effect, the experts chop the knees off the column chart. When they talk about 95% VE, they are looking at the column chart with the axis starting at 42.

Kfung_pfizercumcases_stackedcolumn_chopped

Yes, that deserves a boo.

***

It's actually even worse than that, if you can believe it.

The most commonly cited excuse for the knee-chop is that any vaccine is expected to be useless in the first X days (X being determined after the trial ends when they analyze the data). A recently published "real world" analysis of the situation in Israel contains a lengthy defense of this tactic, in which they state:

Strictly speaking, the vaccine effectiveness based on this risk ratio overestimates the overall vaccine effectiveness in our study because it does not include the early follow-up period during which the vaccine has no detectable effect (and thus during which the ratio is 1). [Appendix, Supplement 4]

Assuming VE = 0 prior to day X is equivalent to stipulating that the number of cases found in the vaccine arm is the same (within margin of error) as the number of cases in the placebo arm during the first X days.

That assumption is refuted by the Pfizer trial (and every other trial that has results so far.)

The Pfizer/Biontech vaccine was not useless during the first week. It's not 95% efficacious, more like 16%. In the second week, it improves to 33%, and so on. (See the VE curve I plotted above for the Pfizer trial.)

What happened was that all the weeks during which the VE had not yet plateaued were dropped.

***

So I was simplifying the picture by chopping same-size blocks from both columns in the stacked column chart. Contrary to the no-effect assumption, the blocks at the bottom of each column are of different sizes. Much more was chopped from the placebo arm than from the vaccine arm.

You'd think that would unjustifiably favor the placebo. Not true! As almost all the cases on the vaccine arm were removed, the remaining cases on the placebo arm are now many multiples of those on the vaccine arm.

The following shows what VE would have been reported if they had started counting cases from day X. The first chart counts all cases from the first shot. The second chart removes the first two weeks of cases, corresponding to the analysis that other pharmas have done, namely, evaluating efficacy from 14 days after the first dose. The third chart removes even more cases, and represents what happens if the analysis is conducted from the second dose. The fourth chart is the official Pfizer analysis, which begins counting 7 days after the second shot. Finally, the fifth chart shows the analysis beginning 14 days after the second shot, the window selected by Moderna and AstraZeneca.

Kfung_howvaccinetrialsanalyzethedata
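Here is the arithmetic underneath those five charts, as a Python sketch. The weekly case counts are invented (the real ones are in the FDA briefing document), but the mechanics are the same: pick a start week, discard all earlier cases, and compute VE from what remains:

    # Invented weekly case counts (vaccine arm, placebo arm) for illustration only.
    vaccine = [21, 12, 5, 2, 1, 1, 1, 1]
    placebo = [25, 18, 20, 22, 21, 20, 19, 18]

    def ve_from_week(start_week):
        """VE computed after discarding all cases before start_week."""
        v, p = sum(vaccine[start_week:]), sum(placebo[start_week:])
        return 1 - v / p   # assumes equal person-time in both arms

    for week in range(5):
        print(f"counting from week {week}: VE = {ve_from_week(week):.0%}")
    # The later the counting starts, the higher the reported VE.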

The premise that any vaccine is completely useless for a period after administration is refuted by the actual data. By starting analysis windows at some arbitrary time, the researchers make it unnecessarily difficult to compare trials. Selecting the time of analysis based on the results of a single trial is the kind of post-hoc analysis that statisticians have long warned leads to over-estimation. It's equivalent to making the vertical axis of a column chart start above zero in order to exaggerate the relative heights of the columns.


P.S. [3/1/2021] See comment below. I'm not suggesting vaccines are useless. They are still a miracle of science. I believe the desire to report a 90% VE number is counterproductive. I don't understand why a 70% or 80% effective vaccine is shameful. I really don't.


A beautiful curve and its deadly misinterpretation

When the preliminary analyses of their Phase 3 trials came out, vaccine developers pleased their audience of scientists with the following data graphic:

Pfizerfda_cumcases

The above was lifted out of the FDA briefing document for the Pfizer / Biontech vaccine.

Some commentators have homed in on the blue line for the vaccinated arm of the Pfizer trial.

Junkcharts_pfizerfda_redo_vaccinecases

Since the vertical axis shows the cumulative number of cases, these commentators noted that the vaccine reached peak efficacy around 14 days after the first dose. The second dose was administered around Day 21, at which point the vaccine curve appeared almost flat. Thus, they argued, we should make a big bet on the first dose.

***

The chart is indeed very beautiful. It's rare to see such a huge gap between the test group and the control group. Notice that I just described the gap between test and control. That's what a statistician is looking at in that chart - not the blue line, but the gap between the red and blue lines.

Imagine: if the curve for the placebo group looked the same as that for the vaccinated group, then the chart would lose all its luster. Screams of victory would be replaced by tears of sadness.

Here I bring back both lines, and you should focus on the gaps between the lines:

Junkcharts_pfizerfda_redo_twocumcases

Does the action stop around day 14? The answer is a resounding No! In fact, the red line keeps rising, so over time the vaccine's efficacy improves (since VE is a ratio of the two groups' case counts).

The following shows the vaccine efficacy curve:

Junkcharts_pfizerfda_redo_ve

Right before the second dose, VE is just below 50%. VE keeps rising and reaches 70% by day 50, which is about a month after the second dose.
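The VE curve is computed directly from the two cumulative-case lines. Here's a minimal sketch; the cumulative counts are invented, shaped only to echo the published curves, and both arms are assumed to be equal-sized:

    # Invented cumulative case counts by day (vaccine arm, placebo arm).
    days =        [ 7, 14, 21,  28,  35,  42,  50]
    cum_vaccine = [20, 35, 40,  44,  46,  48,  50]
    cum_placebo = [24, 55, 78, 110, 135, 155, 170]

    for d, v, p in zip(days, cum_vaccine, cum_placebo):
        print(f"day {d:2d}: cumulative VE = {1 - v / p:.0%}")
    # VE keeps climbing long after the vaccine line has flattened,
    # because the placebo line keeps rising.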

If the FDA briefing document had shown the VE curve, instead of the cumulative-cases curve, few would argue that you don't need the second dose!

***

What went wrong here? How come the beautiful chart may turn out to be lethal? (See this post on my book blog for reasons why I think foregoing or delaying the second dose will exacerbate the pandemic.)

It's a bit of a bait and switch. The original chart plots cumulative case counts, separately for each treatment group. Cumulative case counts are inputs to computing vaccine efficacy. It is true that as the blue line for the vaccine flattens, VE likely rises. But the case count for the vaccine group is an imperfect proxy for VE. As I showed above, VE continues to gain strength long after the vaccine case count has levelled off.

The important lesson for data visualization designers is: plot the metric that matters to decision-makers; avoid imperfect proxies.


P.S. [1/19/2021: For those who want to get behind the math of all this, the following posts on my book blog will help.

One-dose Pfizer is not happening, and here's why

The case for one-dose vaccines is lacking key details

One-dose vaccine strategy elevates PR over science

]

[1/21/2021: The Guardian chimes in with "Single Covid vaccine dose in Israel 'less effective than we thought'" (link). "In remarks reported by Army Radio, Nachman Ash said a single dose appeared “less effective than we had thought”, and also lower than Pfizer had suggested." To their credit, Pfizer has never publicly recommended a one-dose treatment.]

[1/21/2021: For people in marketing or business, I wrote up a new post that expresses the one-dose vs two-dose problem in terms of optimizing an email drip campaign. It boils down to: do you accept the argument that you should get rid of your later touches because the first email did all the work? Or do you want to run an experiment with just one email before you decide? You can read this on the book blog here.]


Is this an example of good or bad dataviz?

This chart is giving me feelings:

Trump_mcconnell_chart

I first saw it on TV and then a reader submitted it.

Let's apply a Trifecta Checkup to the chart.

Starting at the Q corner, I can say the question it's addressing is clear and relevant. It's the relationship between Trump and McConnell's re-election. The designer's intended message comes through strongly - the chart offers evidence that McConnell owes his re-election to Trump.

Visually, the graphic has elements of great story-telling. It presents a simple (others might say, simplistic) view of the data - just the poll results of McConnell vs McGrath at various times, and the election result. It then flags key events, drawing the reader's attention to those. These events are selected based on key points on the timeline.

The chart includes wise design choices, such as no gridlines, infusing the legend into the chart title, no decimals (except for the last pair of numbers, the reason for which escapes me), and leading with the key message.

I can nitpick a few things. Get rid of the vertical axis. Also, expand the scale so that the difference between 51%-40% and 58%-38% becomes more apparent. Space the time points in proportion to the dates. The box at the bottom is a confusing afterthought that reduces rather than assists the messaging.

But the designer got the key things right. The above suggestions do not alter the reader's experience that much. It's a nice piece of visual story-telling, and from what I can see, has made a strong impact with the audience it is intended to influence.

_trifectacheckup_junkcharts

This chart is proof of why the Trifecta Checkup has three corners, plus linkages between them. If we just evaluate what the visual is conveying, this chart is clearly above average.

***

In the D corner, we ask: what are the Data saying?

This is where the chart runs into several problems. Let's focus on the last two sets of numbers: 51%-40% and 58%-38%. Just add up each pair of numbers - do you notice something?

The last poll sums to 91%. This means that up to 10% of the likely voters responded "not sure" or named some other candidate. If these "shy" voters show up at the polls as predicted by the pollsters, and if they vote just like the non-shy voters, then the election result would have been 56%-44%, not 51%-40%. So the 58%-38% result is within the margin of error of these polls. (If the "shy" voters break for McConnell in a 75%-25% split, then he gets 58% of the total votes.)
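Here's that arithmetic as a quick sketch:

    mcconnell, mcgrath = 0.51, 0.40
    undecided = 1 - mcconnell - mcgrath   # ~9% shy / other voters

    # If the undecideds split the same way as the decided voters:
    print(f"proportional split: {mcconnell / (mcconnell + mcgrath):.0%}")   # 56%

    # Share of the undecideds McConnell needs to land on his actual 58%:
    print(f"share needed for 58%: {(0.58 - mcconnell) / undecided:.0%}")    # ~78%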

So, the data behind the line chart aren't suggesting that the election outcome is anomalous. This presents a problem with the Q-D and D-V green arrows as these pairs are not in sync.

***

In the D corner, we should consider the totality of the data available to the designer, not just what the designer chooses to utilize. The pivot of the chart is the flag annotating the "Trump robocall."

Here are some questions I'd ask the designer:

What else happened on October 31 in Kentucky?

What else happened on October 31, elsewhere in the country?

Was Trump featured in any other robocalls during the period portrayed?

How many robocalls were made by the campaign, and what other celebrities were featured?

Did any other campaign event or effort happen between the Trump robocall and election day?

Is there evidence that nothing else that happened after the robocall produced any value?

The chart commits the XYopia (i.e. X-Y myopia) fallacy of causal analysis. When the data analyst presents one cause and one effect, we are cued to think the cause explains the effect, but in any setting that is not a designed experiment, there are multiple causes at play. Sometimes, the more influential cause isn't the one shown in the chart.

***

Finally, let's draw out the connection between the last set of poll numbers and the election results. This shows why causal inference in observational data is such a beast.

Poll numbers are about a small number of people (500-1,000 in the case of Kentucky polls) who respond to polling. Election results are based on voters (> 2 million). An assumption made by the designer is that these polls are properly conducted, and their results are credible.

The chart above makes the claim that Trump's robocall gave McConnell 7% more votes than expected. This implies the robocall influenced at least 140,000 voters. Each such voter must fit the following criteria:

  • Was targeted by the Trump robocall
  • Was reached by the Trump robocall (phone was on, etc.)
  • Responded to the Trump robocall, by either picking up the phone or listening to the voice recording or dialing a call-back number
  • Did not previously intend to vote for McConnell
  • If reached by a pollster, would refuse to respond, or say not sure, or voting for McGrath or a third candidate
  • Had no other reason to change his/her behavior

Just take the first bullet for example. If we found a voter who switched to McConnell after October 31, and if this person was not on the robocall list, then this voter contributes to the unexpected gain in McConnell votes but weakens the case that the robocall influenced the election.

As analysts, our job is to find data to investigate all of the above. Some of these are easier to investigate. The campaign knows, for example, how many people were on the target list, and how many listened to the voice recording.


Podcast highlights

Recently, I recorded a podcast with Ryan Ray, which you can access here. The link sends you to a 14-day free trial of his newsletter, which is where he publishes his podcasts.

Kaiserfung_warroommedia

Ryan contacted me after he read my book Numbers Rule Your World (link). I was happy to learn that he enjoyed the stories, and during the podcast, he gave an example of how he applied the statistical concepts to other situations.

During the podcast, you will hear:

  • I have a line in my course syllabus that reads "after you take this class, you will not be able to look at numbers (in the media) with a straight face ever again." That's a goal of mine. And it also applies to my books.

  • Why most statisticians are skeptics

  • Figuring out the statistical conclusions is the easy part; the hardest challenge is finding a way to communicate them to a non-technical audience. I went through many drafts before I landed on the precise language used in those stories.

  • Why "correlation is not causation" is not useful practical advice
  • You can't unsee something you've already seen, and this creates hindsight bias
  • The biggest bang for the buck when improving statistical models is improving data quality

  • Some models, such as polls and election forecasts, can be thought of as thermometers measuring the mood of the respondents at the time of polling.

***

To hear the podcast, visit Ryan Ray's website.


Election visuals: three views of FiveThirtyEight's probabilistic forecasts

As anyone who is familiar with Nate Silver's forecasting of U.S. presidential elections knows, he runs a simulation that explores the space of possible scenarios. The polls that provide a baseline forecast make certain assumptions, such as who's a likely voter. Nate's model unshackles these assumptions from the polling data, exploring how the outcomes vary as these assumptions shift.

In the most recent simulation, his computer explores 40,000 scenarios, each of which predicts a split of the electoral vote, from which the winner of the election can be determined. The model's outcome is usually summarized by a winning probability, which is just the proportion of scenarios under which one candidate wins.
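Conceptually, the computation is trivial. The following toy sketch stands in for 538's far more elaborate scenario generator; the normal distribution and its parameters are inventions for illustration:

    import random

    random.seed(538)

    # Crude stand-in for the scenario generator: draw Biden's electoral votes
    # from a normal distribution with invented parameters.
    scenarios = [random.gauss(mu=330, sigma=60) for _ in range(40_000)]

    # A candidate needs at least 270 of the 538 electoral votes to win.
    wins = sum(ev >= 270 for ev in scenarios)
    print(f"Biden win probability: {wins / len(scenarios):.1%}")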

This type of forecasting was responsible for the infamous meltdown in 2016, when most of these models - Nate's being an exception - issued extremely confident predictions that Hillary Clinton would win, with 95% or higher probability. Essentially, the probability distribution collapsed to a point. This is analogous to an extremely narrow confidence band, indicating almost zero uncertainty about the event. It was as if almost all of the 40,000 scenarios predicted Clinton to be the winner.

The 538 data team has come up with various ways of visualizing the outputs of the model (link). The entire post is worth reading. Here, I'll highlight the most scientific and direct visual representation, which is the third display.

538_pdf_pair

We start by looking at the bottom of the two charts, showing the predicted electoral votes won by Democratic challenger Joe Biden in each of the 40,000 scenarios. Our attention is directed to the thick line that gives the relative chance of Biden's electoral-vote tally. This line is a smoothed summary of the columns in the background, which show the number of times the simulation produces each electoral-vote count.

The highlighted right side of the chart covers the scenarios in which Biden becomes President, that is to say, he wins at least 270 electoral votes (out of 538, doh). The faded left side represents scenarios in which Biden is defeated and Trump wins a second term.

The reason I focused on the bottom chart is that the top chart is merely a mirror image of it. Just reflect the bottom chart around the vertical line at 270 electoral votes, change the color scheme to red, and swap the annotations related to Trump and Biden, and you get the other chart. This is because the narrative has excluded third-party and write-in candidates, leaving us with a zero-sum situation.

Alternatively, one can jam both charts into one, while supplying extra labels, like this:

Redo_junkcharts_538forecastpdf_1

I prefer the denser single chart because my mind wanders away searching for extra meaning when chart elements are mirrored.

One advantage of the mirrored presentation is that the probability profiles of the potential Trump or Biden wins can be directly compared. We learn that Trump's winning margins are smaller, rarely above 150, and never above 250.

This comparison is made easier by flipping the left side of the chart onto the right side:

Redo_junkcharts_538forecastpdf_2

Those are three different visualizations using the same chart form. I'd have to run a poll to figure out which is the best. What's your opinion?


This chart shows why the PR agency for the UK government deserves a Covid-19 bonus

The Economist illustrated some interesting consumer research with this chart (link):

Economist_covidpoll

The survey by Dalia Research asked people about their satisfaction with their country's response to the coronavirus crisis. The results are reduced to the "Top 2 Boxes": the proportion of people who rated their government's response as "very well" or "somewhat well".

This dimension is laid out along the horizontal axis. The chart is a combo dot and bubble chart, arranged in rows by region of the world. Now what does the bubble size indicate?

It took me a while to find the legend as I was expecting it either in the header or the footer of the graphic. A larger bubble depicts a higher cumulative number of deaths up to June 15, 2020.

The key issue is the correlation between a country's death count and the people's evaluation of the government response.

Bivariate correlation is typically shown on a scatter plot. The following chart sets out the scatter plots in a small multiples format with each panel displaying a region of the world.

Redo_economistcovidpolling_scatter

The death tolls in the Asian countries are low relative to the other regions, and yet the people's ratings vary widely. In particular, the Japanese people are pretty hard on their government.

In Europe, the people of Greece, the Netherlands and Germany think highly of their government responses, which have suppressed deaths. The French, Spaniards and Italians are understandably unhappy. The British appear to be the most forgiving of their government, despite suffering a higher death toll than France, Spain or Italy. This speaks well of their PR operation.

Cumulative deaths should be adjusted by population size for a proper comparison across nations. When the same graphic is produced using deaths per million (shown on the right below), the general story is preserved while the pattern is clarified:

Redo_economistcovidpolling_deathspermillion_2

The right chart shows deaths per million while the left chart shows total deaths.
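The per-capita adjustment itself is one line of code. Here's a sketch with a few illustrative rows; the death and population figures are rounded placeholders:

    import pandas as pd

    # A few illustrative rows; deaths are cumulative to June 15, 2020,
    # and all figures are rounded placeholders.
    df = pd.DataFrame({
        "country": ["United Kingdom", "France", "Germany"],
        "deaths": [41_000, 29_000, 9_000],
        "population_m": [67, 67, 83],   # population in millions
    })

    # Adjust cumulative deaths by population size.
    df["deaths_per_million"] = df["deaths"] / df["population_m"]
    print(df)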

***

In the original Economist chart, what catches our attention first is the bubble size. Eventually, we notice the horizontal positioning of these bubbles. But the star of this chart ought to be the new survey data. I swapped those variables and obtained the following graphic:

Redo_economistcovidpolling_swappedvar

Instead of using bubble size, I switched to using color to illustrate the deaths-per-million metric. If ratings of the pandemic response correlate tightly with deaths per million, then we expect the color of these dots to evolve from blue on the left side to red on the right side.

The peculiar loss of correlation in the U.K. stands out. Their PR firm deserves a bonus!


Designs of two variables: map, dot plot, line chart, table

The New York Times found evidence that the richest segments of New Yorkers, presumably those with second or multiple homes, have exited the Big Apple during the early months of the pandemic. The article (link) is amply assisted by a variety of data graphics.

The first few charts represent different attempts to express the headline message. Their appearance in the same article allows us to assess the relative merits of different chart forms.

First up is the always-popular map.

Nytimes_newyorkersleft_overallmap

The advantage of a map is its ease of comprehension. We can immediately see which neighborhoods experienced the greatest exodus. Clearly, Manhattan has cleared out a lot more than the outer boroughs.

The limitation of the map is also in view. With the color gradient dedicated to the proportions of residents gone on May 1st, there isn't room to express which neighborhoods are richer. We have to rely on outside knowledge to make the correlation ourselves.

The second attempt is a dot plot.

Nytimes_newyorksleft_percentathome

We may have to take a moment to digest the horizontal axis. It's not time moving left to right but income percentiles. The poorest neighborhoods are to the left and the richest to the right. I'm assuming that these percentiles describe the distribution of median incomes in neighborhoods. Typically, when we see income percentiles, they are based on households, regardless of neighborhoods. (The former are equal-sized segments, unlike the latter.)

This data graphic has the reverse features of the map. It does a great job correlating the drop in proportion of residents at home with the income distribution but it does not convey any spatial information. The message is clear: The residents in the top 10% of New York neighborhoods are much more likely to have left town.

In the following chart, I attempted a different labeling of both axes. It removes the need for readers to mentally convert "at home" to "not at home," and "90th percentile" to "top 10%."

Redo_nyt_newyorkerslefttown

The third attempt to convey the income-exit relationship is the most successful in my mind. This is a line chart, with time on the horizontal axis.

Nyt_newyorkersleft_percenthomebyincome

The addition of lines relegates the dots to the background. The lines show the trend more clearly. If directly translated from the dot plot, this line chart should have 100 lines, one for each percentile. However, the closeness of the top two lines suggests that no meaningful difference in behavior exists between the 20th and 80th percentiles. This can be conveyed to readers through a short note. Instead of displaying all 100 percentiles, the line chart selectively includes only the 99th, 95th, 90th, 80th and 20th percentiles. This is a design choice that adds by subtraction.

Along the time axis, the line chart provides more granularity than either the map or the dot plot. The exit occurred roughly over the last two weeks of March and the first week of April. The start coincided with New York's stay-at-home advisory.

This third chart is a statistical graphic. It does not bring out the raw data but features aggregated and smoothed data designed to reveal a key message.

I encourage you to also study the annotated table later in the article. It shows the power of a well-designed table.

[P.S. 6/4/2020. On the book blog, I have just published a post about the underlying surveillance data for this type of analysis.]