Sep 28, 2021

In the prior post about Canadian elections, I suggested that designers expand beyond plots of one variable at a time. Today, I look at a project by DataWrapper on the German elections which happened this week. Thanks to long-time blog supporter Antonio for submitting the chart.

The following is the centerpiece of Lisa's work:

CDU/CSU is Angela Merkel's party, represented by the black color. The chart answers one question only: did polls correctly predict election results?

The time period from 1994 to 2021 covers eight consecutive elections (counting the one this week). There are eight vertical blocks on the chart representing each administration. The right vertical edge of each block coincides with an election. The chart is best understood as the superposition of two time series.

You can trace the first time series by following a step function - let your eyes follow the flat lines between elections. This dataset shows the popular vote won by the party at each election, with the value updated after each election. The last vertical block represents an election that has not yet happened when this chart was created. As explained in the footnote, Lisa took the average poll result for the last month leading up to the 2021 election - in the context of this chart, she made the assumption that this cycle of polls will be 100% accurate.

The second time series corresponds to the ragged edges of the gray and black areas. If you ignore the colors, and the flat lines, you'll discover that the ragged edges form a contiguous data series. This line encodes the average popularity of the CDU/CSU party according to election polls.

Thus, the area between the step function and the ragged line measures the gap between polls and election day results. When the polls underestimate the actual outcome, the area is colored gray; when the polls are over-optimistic, the area is colored black. In the last completed election of 2017, Merkel's party underperformed relative to the polls. In fact, the polls in the entire period between the 2013 and 2017 uniformly painted a rosier picture for CDU/CSU than actually happened.

The last vertical block is interpreted a little differently. Since the reference level is the last month of polls (rather than the actual popular vote), the abundance of black indicates that Merkel's party has been suffering from declining poll numbers on the approach of this week's election.

***

The picture shown above seems to indicate that these polls are not particularly good. It appears they have limited ability to self-correct within each election cycle. Aside from the 1998-2002 period, the area colors seldom changed within each cycle. That means if the first polling average overestimated the party's popularity, then all subsequent polling averages were also optimistic. (The original post focused on a single pollster, which exacerbates this issue. Compare the following chart with the above, and you'll find even fewer color changes within cycle here:

Each pollster may be systematically biased but the poll aggregate is less so.)

Here's the chart for SDP, which is CDU/CSU's biggest opponent, and likely winner of this week's election:

Overall, this chart has similar features as the CDU/CSU chart. The most recent polls seem to favor the SPD - the pink area indicates that the older polls of this cycle underestimates the last month's poll result.

Both these parties are in long-term decline, with popularity dropping from the 40% range in the 1990s to the 20% range in the 2020s.

One smaller party that seems to have gained followers is the Green party:

The excess of dark green, however, does not augur well for this election.

Sep 20, 2021

Stephen Taylor reached out to me about his work to visualize Canadian elections data. I took a look. I appreciate the labor of love behind this project.

He led with a streamgraph, which presents a quick overview of relative party strengths over time.

I am no Canadian election expert, and I did a bare minimum of research in writing this blog. From this chart, I learn that:

• the Canadians have an irregular election schedule
• The two dominant parties are Liberals and Conservatives. The Liberals currently hold just less than half of the seats. The Conservatives have more than half of the seats not held by Liberals
• The Conservative party (maybe) rebranded as "progressive conservative" for several decades. The Reform/Alliance party was (maybe) a splinter movement within the Conservatives as well.
• Since the "width" of the entire stream increased over time, I'm guessing the number of seats has expanded

That's quite a bit of information obtained at a glance. This shows the power of data visualization. Notice Stephen didn't even have to include a "how to read this" box.

The streamgraph form has its limitations.

The feature that makes it more attractive than an area chart is its middle anchoring, resulting in a form of symmetry. The same feature produces erroneous intuition - the red patch draws out a declining trend; the reader must fight the urge to interpret the lines and focus on the areas.

The breadcrumbs are well hidden. The legend below discloses that the Green Party holds 3 seats currently. The party has never held enough seats to appear on the streamgraph though.

The bars showing proportions in the legend is a very nice touch. (The numbers appear messed up - I have to ask Stephen whether the seats shown are current values, or some kind of historical average.) I am a big fan of informative legends.

***

The next featured chart is a dot plot of polling results since 2020.

One can see a three-tier system: the two main parties, then the NDP (yellow) is the clear majority of the minority, and finally you have a host of parties that don't poll over 10%.

It looks like the polls are favoring the Conservatives over the Liberals in this election but it may be an election-day toss-up.

The purple dots represent "PPC" which is a party not found elsewhere on the page.

This chart is clear as crystal because of the structure of the underlying data. It just amazes me that the polls are so highly correlated. For example, across all these polls, the NDP has never once polled better than either the Liberals or the Conservatives, and in addition, it has never polled worse than any of the small parties.

What I'd like to see is a chart that merges the two datasets, addressing the question of how well these polls predicted the actual election outcomes.

***

The project goes very deep as Stephen provides charts for individual "ridings" (perhaps similar to U.S. precincts).

Here we see population pyramids for Vancouver Center, versus British Columbia (Province), versus Canada.

This riding has a large surplus of younger people in their twenties and thirties. Be careful about the changing scales though. The relative difference in proportions are more drastic than visually displayed because the maximum values (5%) on the Province and Canada charts are half that on the Riding chart (10%). Imagine squashing the Province and Canada charts to half their widths.

Analyses of income and rent/own status are also provided.

This part of the dashboard exhibits a problem common in most dashboards - they present each dimension of the data separately and miss out on the more interesting stuff: the correlation between dimensions. Do people in their twenties and thirties favor specific parties? Do richer people vote for certain parties?

***

The riding-level maps are the least polished part of the site. This is where I'm looking for a "how to read it" box.

It took me a while to realize that the colors represent the parties. If I haven't come in from the front page, I'd have been totally lost.

Next, I got confused by the use of the word "poll". Clicking on any of the subdivisions bring up details of an actual race, with party colors, candidates and a donut chart showing proportions. The title gives a "poll id" and the name of the riding in parentheses. Since the poll id changes as I mouse over different subdivisions, I'm wondering whether a "poll" is the term for a subdivision of a riding. A quick wiki search indicates otherwise.

My best guess is the subdivisions are indicated by the numbers.

Back to the donut charts, I prefer a different sorting of the candidates. For this chart, the two most logical orderings are (a) order by overall popularity of the parties, fixed for all ridings and (b) order by popularity of the candidate, variable for each riding.

The map shown above gives the winner in each subdivision. This type of visualization dumps a lot of information. Stephen tackles this issue by offering a small multiples view of each party. Here is the Liberals in Vancouver.

Again, we encounter ambiguity about the color scheme. Liberals have been associated with a red color but we are faced with abundant yellow. After clicking on the other parties, you get the idea that he has switched to a divergent continuous color scale (red - yellow - green). Is red or green the higher value? (The answer is red.)

I'd suggest using a gray scale for these charts. The hardest decision is going to be the encoding between values and shading. Should each gray scale be different for each riding and each party?

If I were to take a guess, Stephen must have spent weeks if not months creating these maps (depending on whether he's full-time or part-time). What he has published here is a great start. Fine-tuning the issues I've mentioned may take more weeks or months more.

****

Stephen is brave and smart to send this project for review. For one thing, he's got some free consulting. More importantly, we should always send work around for feedback; other readers can tell us where our blind spots are.

Tongue in cheek but a master stroke

Sep 17, 2021

Andrew jumped on the Benford bandwagon to do a tongue-in-cheek analysis of numbers in Hollywood movies (link). The key graphic is this:

Benford's Law is frequently invoked to prove (or disprove) fraud with numbers by examining the distribution of first digits. Andrew extracted movies that contain numbers in their names - mostly but not always sequences of movies with sequels. The above histogram (gray columns) are the number of movies with specific first digits. The red line is the expected number if Benford's Law holds. As typical of such analysis, the histogram is closely aligned with the red line, and therefore, he did not find any fraud.

I'll blog about my reservations about Benford-style analysis on the book blog later - one quick point is: as with any statistical analysis, we should say there is no statistical evidence of fraud (more precisely, of the kind of fraud that can be discovered using Benford's Law), which is different from saying there is no fraud.

***

Andrew also showed a small-multiples chart that breaks up the above chart by movie groups. I excerpted the top left section of the chart below:

The genius in this graphic is easily missed.

Notice that the red lines (which are the expected values if Benford Law holds) appear identical on every single plot. And then notice that the lines don't represent the same values.

It's great to have the red lines look the same everywhere because they represent the immutable Benford reference. Because the number of movies is so small, he's plotting counts instead of proportions. If you let the software decide on the best y-axis range for each plot, the red lines will look different on different charts!

You can find the trick in the R code from Gelman's blog.

First, the maximum value of each plot is set to the total number of observations. Then, the expected Benford proportions are converted into expected Benford counts. The first Benford count is then shown against an axis topping out at the total count, and thus, relatively, what we are seeing are the Benford proportions. Thus, every red line looks the same despite holding different values.

This is a master stroke.

A little stitch here, a great graphic is knitted

Sep 15, 2021

The Wall Street Journal used the following graphic to compare hurricanes Ida and Katrina (link to paywalled article).

This graphic illustrates the power of visual communications. Readers can learn a lot from it.

The paths of the storms can be compared. The geographical locations of the landfalls are shown. The strengthening of wind speeds as the hurricanes moved toward Louisiana is also displayed. Ida is clearly a lesser storm than Katrina: its wind speed never reached Category 5, and is generally lower at comparable time points.

The greatest feature of the WSJ graphic is how the designer stitches the two plots into one graphic. The anchors are two time points: when each storm attained enough wind speed to be classified as a hurricane (indicated by open dots), and when each storm made landfall in Louisiana. It is this little-noticed feature that makes it so easy to place each plot in context of the other.

Bravo!

Visually displaying multipliers

Sep 08, 2021

As I'm preparing a blog about another real-world study of Covid-19 vaccines, I came across the following chart (the chart title is mine).

As background, this is the trend in Covid-19 cases in the U.K. in the last couple of months, courtesy of OurWorldinData.org.

The React-1 Study sends swab kits to randomly selected people in England in order to assess the prevalence of Covid-19. Every month, there is a new round of returned swabs that are tested for Covid-19. This measurement method captures asymptomatic cases although it probably missed severe and hospitalized cases. Despite having some shortcomings, this is a far better way to measure cases than the hotch-potch assembling of variable-quality data submitted by different jurisdictions that has become the dominant source of our data.

Rounds 12 and 13 captured an inflection point in the pandemic in England. The period marked the beginning of the end of the belief that widespread vaccination will end the pandemic.

The chart I excerpted up top broke the data down by age groups. The column heights represent the estimated prevalence of Covid-19 during each round - also, described precisely in the paper as "swab positivity." Based on the study's design, one may generalize the prevalence to the population at large. About 1.5% of those aged 13-24 in England are estimated to have Covid-19 around the time of Round 13 (roughly early July).

The researchers came to the following conclusion:

We show that the third wave of infections in England was being driven primarily by the Delta variant in younger, unvaccinated people. This focus of infection offers considerable scope for interventions to reduce transmission among younger people, with knock-on benefits across the entire population... In our data, the highest prevalence of infection was among 12 to 24 year olds, raising the prospect that vaccinating more of this group by extending the UK programme to those aged 12 to 17 years could substantially reduce transmission potential in the autumn when levels of social mixing increase

***

Raise your hand if the graphics software you prefer dictates at least one default behavior you can't stand. I'm sure most hands are up in the air. No matter how much you love the software, there is always something the developer likes that you don't.

The first thing I did with today's chart is to get rid of all such default details.

For me, the bottom chart is cleaner and more inviting.

***

The researchers wanted readers to think in terms of Round 3 numbers as multiples of Round 2 numbers. In the text, they use statements such as:

weighted prevalence in round 13 was nine-fold higher in 13-17 year olds at 1.56% (1.25%, 1.95%) compared with 0.16% (0.08%, 0.31%) in round 12

It's not easy to perceive a nine-fold jump from the paired column chart, even though this chart form is better than several others. I added some subtle divisions inside each orange column in order to facilitate this task:

I have recommended this before. I'm co-opting pictograms in constructing the column chart.

An alternative is to plot everything on an index scale although one would have to drop the prevalence numbers.

***

The chart requires an additional piece of context to interpret properly. I added each age group's share of the population below the chart - just to illustrate this point, not to recommend it as a best practice.

The researchers concluded that their data supported vaccinating 13-17 year olds because that group experienced the highest multiplier from Round 12 to Round 13. Notice that the 13-17 year old age group represents only 6 percent of England's population, and is the least populous age group shown on the chart.

The neighboring 18-24 age group experienced a 4.5 times jump in prevalence in Round 13 so this age group is doing much better than 13-17 year olds, right? Not really.

While the same infection rate was found in both age groups during this period, the slightly older age group accounted for 50% more cases -- and that's due to the larger share of population.

A similar calculation shows that while the infection rate of people under 24 is about 3 times higher than that of those 25 and over, both age groups suffered over 175,000 infections during the Round 3 time period (the difference between groups was < 4,000).  So I don't agree that focusing on 13-17 year olds gives England the biggest bang for the buck: while they are the most likely to get infected, their cases account for only 14% of all infections. Almost half of the infections are in people 25 and over.

Working hard at clarity

Sep 02, 2021

As I am preparing another blog post about the pandemic, I came across the following data graphic, recently produced by the CDC for a vaccine advisory board meeting:

This is not an example of effective visual communications.

***

For one thing, readers are directed to scour the footnotes to figure out what's going on. If we ignore those for the moment, we see clusters of bubbles that have remained pretty stable from December 2020 to August 2021. The data concern some measure of Americans' intent to take the COVID-19 vaccine. That much we know.

There may have been a bit of an upward trend between January and May, although if you were shown the clusters for December, February and April, you'd think the trend's been pretty flat.

***

But those colors? What could they represent? You'd surely have to fish this one out of the footnotes. Specifically, this obtuse sentence: "Surveys with multiple time points are shown with the same color bubble for each time point." I had to read it several times. I think it simply means "Color represents the pollster."

Then it adds: "Surveys with only one time point are shown in gray." which simply means "All pollsters who have only one entry in the dataset are grouped together and shown in gray."

Another problem with this chart is over-plotting. Look at the July cluster. It's impossible to tell how many polls were conducted in July because the circles pile on top of one another.

***

The appearance of the flat trend is a result of two unfortunate decisions made by the designer. If I retained the chart form, I'd have produced something that looks like this:

The first design choice is to expand the vertical axis to range from 0% to 100%. This effectively squeezes all the bubbles into a small range.

The second design choice is to enlarge the bubbles causing copious amount of overlapping.

In particular, this decision blows up the Pew poll (big pink bubble) that contained 10 times the sample size of most of the other polls. The Pew outcome actually came in at 70% but the top of the pink bubble extends to over 80%. Because of this, the outlier poll of December 2020 - which surprisingly printed the highest number of all polls in the entire time window - no longer looks special.

***

Now, let's see what else we can do to enhance this chart.

I don't like how bubble size is used to encode the sample size. It creates a weird sensation for anyone who's familiar with sampling errors, and confidence regions. The Pew poll with 10 times the sample size is the most reliable poll of them all. Reliability means the error bars around the Pew poll outcome is the smallest of them all. I tend to think of the area around a point estimate as showing the sampling error so the Pew poll would be a dot, showing the high precision of that estimate.

But that won't work because larger bubbles catch more of the reader's attention. So, in the following version, all dots have the same size. I encode reliability in the opacity of the color. The darker dots are polls that are more reliable, that have larger sample sizes.

Two of the pollsters have more frequent polling than others. In this next version, I highlighted those two, which reveals the trend better.