Bubble charts, ratios and proportionality

A recent article in the Wall Street Journal about a challenger to the dominant weedkiller, Roundup, contains a nice selection of graphics. (Dicamba is the up-and-comer.)

Wsj_roundup_img1


The change in usage of three brands of weedkillers is rendered as a small-multiples of choropleth maps. This graphic displays geographical and time changes simultaneously.

The staircase chart shows that weeds have become resistant to Roundup over time. This is considered a weakness of the Roundup business.

***

In this post, my focus is on the chart at the bottom, which shows complaints about Dicamba by state in 2019. This is a bubble chart, with the bubbles sorted along the horizontal axis by the acreage of farmland by state.

Wsj_roundup_img2

Below left is a more standard version of such a chart, in which the bubbles are allowed to overlap. (I only included the bubbles that were labeled in the original chart).

Redo_roundupwsj0

The WSJ’s twist is to use the vertical spacing to avoid overlapping bubbles. The vertical axis serves a design prerogative and does not encode data.

I’m going to stick with the more traditional overlapping bubbles here, since I’m getting at a different matter.

***

The question being addressed by this chart is: which states have the most serious Dicamba problem, as revealed by the frequency of complaints? The designer recognizes that the amount of farmland matters: the more acres, the more complaints one should expect.

Let's consider directly computing the number of complaints per million acres.
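As a rough illustration of that normalization (the complaint counts and acreages below are made up, not the WSJ's figures), the calculation is just a division:

```python
# Illustrative sketch: normalize complaint counts by farmland acreage.
# The numbers below are made up for demonstration, not the WSJ figures.
complaints = {"Illinois": 700, "Iowa": 250, "Minnesota": 260, "Arkansas": 200}
acres_millions = {"Illinois": 27, "Iowa": 30, "Minnesota": 26, "Arkansas": 14}

complaints_per_million_acres = {
    state: complaints[state] / acres_millions[state]
    for state in complaints
}

for state, rate in sorted(complaints_per_million_acres.items(),
                          key=lambda kv: kv[1], reverse=True):
    print(f"{state}: {rate:.1f} complaints per million acres")
```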

The resulting chart (shown below right) – while retaining the design – gives a wholly different feeling. Arkansas now owns the largest bubble even though it has the least acreage among the included states. The huge Illinois bubble is still large but is no longer a loner.

Redo_dicambacomplaints1

Now return to the original design for a moment (the chart on the left). In theory, this should work in the following manner: if complaints grow purely as a function of acreage, then the bubbles should grow proportionally from left to right. The trouble is that proportional areas are not as easily detected as proportional lengths.

The pair of charts below depict made-up data in which all states have 30 complaints for each million acres of farmland. It’s not intuitive that the bubbles on the left chart are growing proportionally.

Redo_dicambacomplaints2

Now if you look at the right chart, which shows the relative metric of complaints per million acres, it’s impossible not to notice that all bubbles are the same size.
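One reason proportional areas are hard to judge is that a bubble's radius must grow only as the square root of the value if the area is to encode the value. Below is a minimal matplotlib sketch of area-proportional bubbles, using the same made-up scenario of 30 complaints per million acres everywhere; the scale factor is arbitrary.

```python
# Sketch: bubbles whose AREA is proportional to the value being encoded.
# Made-up data: every state has exactly 30 complaints per million acres,
# so complaint counts grow linearly with acreage.
import matplotlib.pyplot as plt

acres = [10, 15, 20, 25, 30]               # millions of acres (made up)
complaints = [30 * a for a in acres]       # 30 complaints per million acres

# matplotlib's scatter takes `s` as marker area (points^2), so passing the
# value (times a scale factor) keeps area, not radius, proportional to it.
plt.scatter(acres, [1] * len(acres), s=[c * 0.5 for c in complaints])
plt.xlabel("Farmland (millions of acres)")
plt.yticks([])
plt.title("Equal complaint rates: areas grow linearly, radii grow as sqrt")
plt.show()
```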


Revisiting global car sales

We looked at the following chart in the previous post. The data concern the growth rates of car sales in different regions of the world over time.

Cnbc zh global car sales

Here is a different visualization of the same data.

Redo_cnbc_globalcarsales

Well, it's not quite the same data. I divided the global growth rate from the original chart by four to yield an approximation of the true global average. (The reason for this is explained in the post on the original chart, below.)

The chart emphasizes how each region helped or hurt global growth. It also features the trend in growth within each region.

This Excel chart looks standard but gets everything wrong

The following CNBC chart (link) shows the trend of global car sales by region (or so we think).

Cnbc zh global car sales

This type of chart is quite common in finance/business circles, and has the fingerprint of Excel. After examining it, I nominate it for the Hall of Shame.

***

The chart has three major components vying for our attention: (1) the stacked columns, (2) the yellow line, and (3) the big red dashed arrow.

The easiest to interpret is the yellow line, which is labeled "Total" in the legend. It displays the annual growth rate of car sales around the globe. The data consist of annual percentage changes in car sales, so the slope of the yellow line represents a change of change, which is not particularly useful.

The big red arrow is making the point that the projected decline in global car sales in 2019 will return the world to the slowdown of 2008-9 after almost a decade of growth.

The stacked columns appear to provide a breakdown of the global growth rate by region. Look at them carefully, though, and you'll soon learn that the visual form has hopelessly mangled the data.

Cnbc_globalcarsales_2006

What is the growth rate for Chinese car sales in 2006? Is it 2.5%, the top edge of China's section of the column? Is it between 1.5% and 2.5%, the extent of China's section? The answer is neither. Because of the stacking, China's growth rate is actually the height of the relevant section, that is to say, 1 percent. So the labels on the vertical axis are not directly useful for reading regional growth rates for most sections of the chart.
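To make the stacking issue concrete, here is a small sketch of how the plotted edges of a stacked column relate to the underlying values; the rates are approximate readings of the 2006 column and the stacking order is assumed.

```python
# Sketch of how a stacked column is assembled. Because sections sit on top of
# one another, the plotted top edge of a section is a running total, not that
# region's growth rate. Rates below are approximate readings of the 2006
# column; the stacking order is assumed.
regions = [("Western Europe", 0.1), ("Rest of the World", 1.0), ("China", 1.0)]

bottom = 0.0
for name, rate in regions:                 # positive sections stack upward
    top = bottom + rate
    print(f"{name}: plotted from {bottom:.2f}% to {top:.2f}%, "
          f"actual growth rate {rate:.2f}%")
    bottom = top

# Negative sections (e.g., United States at -0.25%) stack downward from zero,
# so the column's overall height is not the sum of the regional rates either.
```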

Can we read the vertical axis as the global growth rate? That's not proper either. The different markets are not equal in size, so growth rates cannot be aggregated by simple summing; they must be weighted by relative size.

The negative growth rates present another problem. Even if we agree to sum growth rates ignoring relative market sizes, we still can't get directly to the global growth rate. We would have to take the total of the positive rates and subtract the total of the negative rates.  

***

At this point, you may begin to question everything you thought you knew about this chart. Remember the yellow line, which we thought measures the global growth rate? Take a look at the 2006 column again.

The global growth rate is depicted as 2 percent. And yet every region experienced a growth rate below 2 percent! No matter how you aggregate the regions, the world average cannot exceed the growth rate of every single region.

For 2006, the regional growth rates are: China, 1%; Rest of the World, 1%; Western Europe, 0.1%; United States, -0.25%. A simple sum of those four rates yields roughly 2% (1.85%, to be exact), which is what the yellow line shows.

But this number must be divided by four. If we give the four regions equal weight, each is worth a quarter of the total, so the overall average is the sum of the growth rates, each weighted by 1/4, which comes to roughly 0.5%. [In reality, the weight of each region should be scaled to reflect its market size.]
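Spelling out the arithmetic (the regional rates are the approximate 2006 readings; the market-size weights are hypothetical placeholders, not actual sales shares):

```python
# Worked example: simple sum vs. equal-weight average vs. size-weighted average.
# Regional growth rates are approximate readings from the 2006 column.
rates = {"China": 1.0, "Rest of the World": 1.0,
         "Western Europe": 0.1, "United States": -0.25}

simple_sum = sum(rates.values())                 # what the yellow line plots
equal_weight_avg = simple_sum / len(rates)       # each region weighted 1/4

# Hypothetical market-size weights (must sum to 1); the real weights would
# come from each region's share of global unit sales.
weights = {"China": 0.30, "Rest of the World": 0.35,
           "Western Europe": 0.20, "United States": 0.15}
weighted_avg = sum(rates[r] * weights[r] for r in rates)

print(f"simple sum:        {simple_sum:.2f}%")
print(f"equal-weight avg:  {equal_weight_avg:.2f}%")
print(f"size-weighted avg: {weighted_avg:.2f}%")
```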

***

tl;dr: The stacked column chart with a line overlay not only fails to communicate the contents of the car sales data but also invites misinterpretation.

I discussed several serious problems of this chart form: 

  • stacking the columns makes it hard to learn the regional data

  • the trend by region takes serious effort to decipher

  • column stacking promotes reading meaning into the height of the column, but the total height is meaningless (because of the negative sections), while the net height (positive minus negative) also misleads due to the presumption of equal weighting

  • the yellow line shows the sum of the regional data, which is four times the global growth rate that it purports to represent

 

***

PS. [12/4/2019: New post up with a different visualization.]


Graph literacy, in a sense

Ben Jones tweeted out this chart, which has an unusual feature:

Malefemaleliteracyrates

What's unusual is that time runs in both directions. Usually, the rule is that time runs left to right (except, of course, in right-to-left cultures). Here, the purple area chart follows that convention while the yellow area chart inverts it.

On the one hand, this is quite cute. Lines meeting in the middle. Converging. I get it.

On the other hand, every time a designer defies conventions, the reader has to recognize it, and to rationalize it.

In this particular graphic, I'm not convinced. There are only four numbers. The trend on either side looks linear, so the story is simple. Why complicate it with unusual visual design?

Here is an entirely conventional bumps-like chart that tells the story:

Redo_literacyratebygender

I've done a couple of things here that might be considered controversial.

First, I completely straightened out the lines. I don't see what additional precision would bring to the chart.

Second, despite having just four numbers, I added the year 1996 and vertical gridlines indicating decades. A Tufte purist will surely object.

***

Related blog post: "The Return on Effort in Data Graphics" (link)


Does this chart tell the sordid tale of TI's decline?

The Hustle has an interesting article on the demise of the TI calculator, which is popular in business circles. The article uses this bar chart:

Hustle_ti_calculator_chart

From a Trifecta Checkup perspective, this is a Type DV chart. (See this guide to the Trifecta Checkup.)

The chart addresses a nice question: is the TI graphing calculator a victim of new technologies?

The visual design is marred by the use of the calculator images. The images add nothing to our understanding and create potential for confusion. Here is a version without the images for comparison.

Redo_junkcharts_hustlet1calc

The gridlines are placed to reveal the steepness of the decline. The sales in 2019 will likely be half those of 2014.

What about the Data? This would have been straightforward if the revenues shown were sales of the TI calculator. But according to the subtitle, the data include a whole lot more than calculators: it's the "other revenues" category in the financial reports of Texas Instruments, which makes the TI calculators.

It requires a leap of faith to believe this data. It is entirely possible that TI calculator sales increased while total "other revenues" decreased! The decline of the TI calculator could also be more drastic than shown here. We simply don't have enough data to say for sure.

 

P.S. [10/3/2019] Fixed TI.

The windy path to the Rugby World Cup

When I first saw the following chart, I wondered whether it was really that challenging for these eight teams to get into the Rugby World Cup, currently being played in Japan:

1920px-2019_Rugby_World_Cup_Qualifying_Process_Diagram.svg

Another visualization of the process conveys a similar message. Both of these are uploaded to Wikipedia.

Rugby_World_Cup_2019_Qualification_illustrated_v2

(This one hasn't been updated and still contains blank entries.)

***

What are some of the key messages one would want the dataviz to deliver?

  • For the eight countries that got in (not automatically), track their paths to the World Cup. How many competitions did they have to play?
  • For those countries that failed to qualify, track their paths to the point that they were stopped. How many competitions did they play?
  • What is the structure of the qualification rounds? (These are organized regionally, in addition to certain playoffs across regions.)
  • How many countries had a chance to win one of the eight spots?
  • Within each competition, how many teams participated? Did the winner immediately qualify, or face yet another hurdle? Were the losers immediately eliminated, or were they offered another chance?

Here's my take on this chart:

Rugby_path_to_world_cup_sm

 


Tennis greats at the top of their game

The following chart of world No. 1 tennis players looks pretty, but the payoff from spending time to understand it isn't high enough. The light colors against the tennis-net backdrop don't work as intended. The annotation is well done, and it's always neat to tuck a legend inside the text.

Tableautennisnumberones

The original is found at Tableau Public (link).

The topic of the analysis appears to be the ages at which tennis players attained world #1 ranking. Here are the male players visualized differently:

Redo_junkcharts_no1tennisplayers

Some players like Jimmy Connors and Federer have second springs after dominating the game in their late twenties. It's relatively rare for players to get to #1 after 30.


Choosing between individuals and aggregates

Friend/reader Thomas B. alerted me to this paper that describes some of the key chart forms used by cancer researchers.

It strikes me that many of the "new" charts plot granular data at the individual level. This heatmap showing gene expressions has one column per patient:

Jnci_genemap

This so-called swimmer plot shows one bar per patient:

Jnci_swimlanes

This spider plot shows the progression of individual patients over time. Key events are marked with symbols.

Jnci_spaghetti

These chart forms are distinguished from other ones that plot aggregated statistics: statistical averages, medians, subgroup averages, and so on.

One obvious limitation of such charts is their lack of scalability. The number of patients, the variability of the metric, and the timing of trends all drive up the amount of messiness.

I am left wondering what Question is being addressed by these plots. If we are concerned about treatment of an individual patient, then showing each line by itself would be clearer. If we are interested in the average trends of patients, then a chart that plots the overall average, or subgroup averages would be more accurate. If the interpretation of the individual's trend requires comparing with similar patients, then showing that individual's line against the subgroup average would be preferred.
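As a sketch of that last option, here is a minimal plot of one individual's trajectory against the subgroup average; all of the data below are made up for illustration.

```python
# Minimal sketch: plot one patient's trajectory against the subgroup average,
# rather than spaghetti lines for every patient. All data below are made up.
import matplotlib.pyplot as plt

weeks = [0, 4, 8, 12, 16]
patient = [100, 85, 70, 72, 60]                 # one patient's tumor metric
subgroup = [
    [100, 90, 80, 75, 70],
    [100, 95, 88, 82, 78],
    [100, 80, 72, 65, 60],
]
subgroup_avg = [sum(vals) / len(vals) for vals in zip(*subgroup)]

plt.plot(weeks, subgroup_avg, color="gray", linewidth=3, label="Subgroup average")
plt.plot(weeks, patient, color="crimson", marker="o", label="Individual patient")
plt.xlabel("Weeks on treatment")
plt.ylabel("Tumor burden (% of baseline)")
plt.legend()
plt.show()
```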

When shown these charts of individual lines, readers are tempted to play the statistician - without using appropriate tools! Readers draw aggregate conclusions, performing the aggregation in their heads.

The authors of the paper note: "Spider plots only provide good visual qualitative assessment but do not allow for formal statistical inference." I agree with the second part. The first part is a fallacy - if the visual qualitative assessment is good enough, then no formal inference is necessary! The same argument is often made when people say they don't need advanced analysis because their simple analysis is "directionally accurate". When is something "directionally inaccurate"? How would one know?

Reference: Chia, Gedye, et al., "Current and Evolving Methods to Visualize Biological Data in Cancer Research," JNCI, 2016, 108(8). (link)

***

Meteorologists, whom I featured in the previous post, also have their own spider-like chart for hurricanes. They call it a spaghetti map:

Dorian_spaghetti

Compare this to the "cone of uncertainty" map that was featured in the prior post:

AL052019_5day_cone_with_line_and_wind

These two charts build upon the same dataset. The cone map, as we discussed, shows the range of probable paths of the storm center, based on all simulations of all acceptable models for projection. The spaghetti map shows selected individual simulations. Each line is the most likely trajectory of the storm center as predicted by a single simulation from a single model.

The problem is that each predictive model has its own historical accuracy (known as "skill"), so the lines carry different levels of importance. Further, it's not immediately clear whether all possible lines are drawn, so a reader drawing conclusions about, say, the envelope containing x percent of these lines is likely to be fooled. Eyeballing the "cone" that contains x percent of the lines is not trivial either. We naturally drift toward aggregate statistical conclusions without the benefit of appropriate tools.

Plots of individuals should be used to address the specific problem of assessing individuals.


As Dorian confounds meteorologists, we keep our minds clear on hurricane graphics, and discover correlation as our friend

As Hurricane Dorian threatens the southeastern coast of the U.S., forecasters are fretting about the lack of consensus among the various models used to predict the storm’s trajectory. The uncertainty of these models, as reflected in graphical displays, has been a controversial issue in the visualization community for some time.

Let’s start by reviewing a visual design that has captivated meteorologists in recent years, something known as the cone map.

Charley_oldconemap

If asked to explain this map, most of us would read the line through the middle of the cone as the path of the center of the storm, the “cone” as the areas near the storm center that will be affected, and the warmer colors (red, orange) as indicating higher levels of impact. [Note: this is an older design of this type of map, circa the 2000s.]

The above interpretation is coherent and feasible. Nevertheless, the data used to make the map are forward-looking, not historical. One can still hold on to the same interpretation by substituting projected impact for historical measurements of impact. As such, the “warmer” regions are projected to suffer worse damage from the storm than the “cooler” (yellow) regions.

After I restore the text that was removed from the map (see below), you may notice the color legend, which discloses that the colors on the map encode probabilities, not storm intensity. The text further explains that the chart shows the most probable path of the center of the storm, while the coloring shows the probability that the storm center will reach specific areas.

Charley_oldconemap

***

When reading a data graphic, we rarely look first for text about how to read the chart. In the case of the cone map, those who didn’t seek out the instructions may form one of these misunderstandings:

  1. For someone living in the yellow-shaded areas, the map does not say that the impact of the storm is projected to be lighter; it says that the center of the storm has a lower chance of passing right through. If, however, the storm does pay a visit, the winds will still reach hurricane strength.
  2. For someone living outside the cone, the map does not say that the storm will definitely bypass you; it says that the chance of a direct hit is below the threshold needed to show up on the cone map. The threshold is set to attain 66% accuracy: the actual paths of storms are expected to stay inside the cone two out of three times. (A sketch of such a containment calculation follows this list.)
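Here is a toy sketch of the two-thirds containment idea: given simulated cross-track errors at one forecast horizon, find the half-width that contains about 66% of them. The actual NHC cone is sized from historical forecast errors, so this is only an illustration of the principle, not the agency's method.

```python
# Sketch: given simulated cross-track errors (miles from the forecast center
# line) at one forecast horizon, find the half-width that contains ~66% of
# simulations. Errors below are randomly generated, not real forecast data.
import random

random.seed(0)
cross_track_errors = [random.gauss(0, 50) for _ in range(1000)]  # miles

abs_errors = sorted(abs(e) for e in cross_track_errors)
idx = int(0.66 * len(abs_errors)) - 1
cone_half_width = abs_errors[idx]   # ~66% of simulated centers fall inside

print(f"Cone half-width at this horizon: {cone_half_width:.0f} miles")
```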

Adding to the confusion, other designers have produced cone maps in which color is encoding projections of wind speeds. Here is the one for Dorian.

AL052019_wind_probs_64_F120

This map displays essentially what we thought the first cone map was showing.

One way to differentiate the two maps is to roll time forward, and imagine what the maps should look like after the storm has passed through. In the wind-speed map (shown below right), we will see a cone of damage, with warmer colors indicating regions that experienced stronger winds.

Projectedactualwinds_irma

In the storm-center map (below right), we should see a single curve, showing the exact trajectory of the center of the storm. In other words, the cone of uncertainty dissipates over time, just like the storm itself.

Projectedactualstormcenter_irma

 

After scientists learned that readers were misinterpreting the cone maps, they started to issue warnings and also re-designed the cone map. The cone map now comes with a black-box health warning right up top. Also, in the storm-center cone map, color is no longer used. The National Hurricane Center even made a YouTube video pointing out the dos and don'ts of using the cone map.

AL052019_5day_cone_with_line_and_wind

***

The consequence of misreading the cone map isn’t as devastating as it’s made out to be, because the two readings are correlated. Since wind speeds are likely to be stronger nearer the center of the storm, a region that has a low chance of a direct hit is also likely to experience lower average wind speeds than regions closer to the projected center of the storm’s path.

Alberto Cairo has written often about these maps, and in his upcoming book, How Charts Lie, there is a nice section about his work with colleagues at the University of Miami on improving public understanding of these hurricane graphics. I highly recommend Cairo’s book.

P.S. [9/5/2019] Alberto also put out a post about the hurricane cone map.

Women workers taken for a loop or four

I was drawn to the following chart in Business Insider because of the calendar metaphor. (The accompanying article is here.)

Businessinsider_payday

Sometimes, the calendar helps readers grasp concepts faster but I'm afraid the usage here slows us down.

The underlying data consist of just four numbers: the wage gaps across race and gender groups in the U.S., considered simply from an aggregate median personal income perspective. The analyst adopts the median annual salary of a white male worker as the baseline. Then s/he imputes the number of extra days that other workers must work to attain the same level of income. For example, the median Asian female worker must work 64 extra days (at her daily salary level) to match the white male's annual pay, while the median Hispanic female worker must work 324 extra days.
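A sketch of that imputation (the baseline is the roughly $46,000 median mentioned below; the other salaries are made up, and the calendar treats every day as a workday):

```python
# Sketch of the "extra days" imputation: how many additional days must a
# worker earning `annual_salary` work (at her daily rate) to match the
# baseline worker's annual pay? The baseline is the ~$46,000 median cited in
# the post; the other salaries below are made up for illustration.
BASELINE = 46_000          # median annual salary, white male worker
DAYS_PER_YEAR = 365        # the calendar treats every day as a workday

def extra_days_to_match(annual_salary, baseline=BASELINE):
    daily_rate = annual_salary / DAYS_PER_YEAR
    return baseline / daily_rate - DAYS_PER_YEAR

for group, salary in {"Group A (made up)": 38_500,
                      "Group B (made up)": 24_400}.items():
    print(f"{group}: about {extra_days_to_match(salary):.0f} extra days")
```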

There are a host of reasons why the calendar metaphor backfired.

Firstly, it draws attention to an uncomfortable detail of the analysis: the calculation papers over the fact that weekends and public holidays are counted as workdays. The coloring of the boxes compounds this issue. (And the designer got confused and slipped up when applying the purple color for Hispanic women.)

Secondly, the calendar focuses on Year 2 while Year 1 lurks in the background: white men also had to work a full year to earn that income (roughly $46,000 in 2017, according to the Census Bureau).

Thirdly, the calendar view exposes another sore point around the underlying analysis. In reality, the white male workers are continuing to earn wages during Year 2.

The realism of the calendar clashes with the hypothetical nature of the analysis.

***

One can just use a bar chart, comparing the number of extra days needed. The calendar design can be considered a set of overlapping bars, wrapped around the shape of a calendar.

The staid bars, however, do not bring to life the extra toil - the message that these women have to work longer to get the same amount of pay. This led me to a different metaphor: the white men get to the destination in a straight line, while the women must go around loops (extra days) before reaching the same endpoint.

Redo_businessinsider_racegenderpaygap

While the above is a rough sketch, I made sure that the total length of the lines including the loops roughly matches the total number of days the women needed to work to earn $46,000.

***

The above discussion focuses solely on the V(isual) corner of the Trifecta Checkup, but this data visualization is also interesting from the D(ata) perspective. Statisticians won't like such a simple analysis that ignores, among other things, the different mix of jobs and industries underlying these aggregate pay figures.

Now go to my other post on the sister (book) blog for a discussion of the underlying analysis.