« November 2016 | Main | January 2017 »

Chopped legs, and abridged analyses

Reader Glenn T. was not impressed by the graphical talent on display in the following column chart (and others) in a Monkey Cage post in the Washington Post:


Not starting column charts at zero is like having one's legs chopped off. Here's an animated gif to show what's taking place: (you may need to click on it to see the animation)


Since all four numbers show up on the chart itself, there is no need to consult the vertical axis.

I wish they used a structured color coding to help fast comprehension of the key points.


These authors focus their attention on the effect of the "black or white cue" but the other effect of Trump supporters vs. non-supporters is many times as big.

Notice that on average 56% of Trump supporters in this study oppose mortgage assistance while 25% of non Trump supporters oppose it - a gap of about 30%.

If we are to interpret the roughly +/- 5% swing attributed to black/white cues as "racist" behavior on the part of Trump supporters, then the +/- 3% swing on the part of non-Trump supporters in the other direction should be regarded as a kind of "reverse racist" behavior. No?

So from this experiment, one should not conclude that Trump voters are racist, which is what the authors are implying. Trump voters have many reasons to oppose mortgage assistance, and racist reaction to pictures of black and white people has only a small part of play in it.


The reporting of the experimental results irks me in other ways.

The headline claimed that "we showed Trump voters photos of black and white Americans." That is a less than accurate description of the experiment and subsequent analysis. The authors removed all non-white Trump voters from the analysis, so they are only talking about white Trump voters.

Also, I really, really dislike the following line:

When we control for age, income, sex, education, party identification, ideology, whether the respondent was unemployed, and perceptions of the national economy — other factors that might shape attitudes about mortgage relief — our results were the same.                        

Those are eight variables they looked into for which they provided zero details. If they investigated "interaction" effects, only of pairs of variables, that would add another 28 dimensions for which they provided zero information.

The claim that "our results were the same" tells me nothing! It is hard for me to imagine that the set of 8+28 variables described above yielded exactly zero insights.

Even if there were no additional insights, I would still like to see the more sophisticated analysis that controls for all those variables that, as they admitted, shape attitudes about mortgage relief. After all, the results are "the same" so the researcher should be indifferent between the simple and the sophisticated analyses.

In the old days of printed paper, I can understand why journal editors are reluctant to print all those analyses. In the Internet age, we should put those analyses online, providing a link to supplementary materials for those who want to dig deeper.


On average, 56 percent of white Trump voters oppose mortgage relief. Add another 3-5 percent (rounding error) if they were cued with an image of a black person. The trouble here is that 90% of the white Trump voting respondents could have been unaffected by the racial cue and the result still holds.

While the effect may be "statistically significant" (implied but not stated by the authors), it represents a small shift in the average attitude. The fact that the "average person" responded to the racial cue does not imply that most people responded to it.

The last two issues I raised here are not specific to this particular study. They are prevalent in the reporting of psychological experiments.


Is this chart rotten?

Some students pointed me to a FiveThirtyEight article about Rotten Tomatoes scores that contain the following chart: (link to original)


This is a chart that makes my head spin. Too much is going on, and all the variables in the plot are tangled with each other. Even after looking at it for a while, I still don't understand how the author looked at the above and drew this conclusion:

"Movies that end up in the top tier miss a step ahead of their release, mediocre movies stumble, and the bottom tiers fall down an elevator shaft."

(Here is the article. It's a great concept but a bit disappointing analysis coming from Nate Silver's site. I have written features for them before so I know they ask good questions. Maybe they should apply the same level of rigor in editing feature writers to editing staff writers.)

Story within story, bar within bar

This Wall Street Journal offering caught my eye.


It's the unusual way of displaying proportions.

Your first impression is to interpret the graphic as a bar chart. But it really is a bar within a bar: the crux of the matter - gender balance - is embedded in individual bars.

Instead of pie charts or stacked bar charts, we see  stacked columns within each bar.

I see what the designer is attempting to accomplish. The first message is the sharp decline in gender equality at higher job titles. The next message is the sharp drop in the frequency of higher job titles.

This chart is a variant of the "Marimekko" chart (beloved by management consultants), also called the mosaic chart. The only difference being how the distribution of jobs in the work force is coded.

The Marimekko is easier to understand:


A key advantage of this version is to be found in the thin columns.

Here is another way to visualize this data, drawing attention to the gender gap.


In the other versions, the reader must do subtractions to figure out the size of the gaps.

Round things, square things

The following chart traces the flow of funds into AI (artificial intelligence) startups.


I found it on this webpage and it is attributed to Financial Times.

Here, I apply the self-sufficiency test to show that the semicircles are playing no role in the visualization. When the numbers are removed, readers cannot understand the data at all. So the visual elements are toothless.


Actually, it's worse. The data got encoded in the diameters of the semicircles, but not the areas. Thus, anyone courageously computing the ratios of the areas finds their effort frustrated.

Here is a different view that preserves the layout:


The two data series in the original chart show the current round of funding and the total funds raised. In the junkcharts version, I decided to compare the new funds versus the previously-raised funds so that the total area represents the total funds raised.

Sorting out the data, and creating the head-shake manual

Yesterday's post attracted a few good comments.

Several readers don't like the data used in the NAEP score chart. The authors labeled the metric "gain in NAEP scale scores" which I interpreted to be "gain scores," a popular way of evaluating educational outcomes. A gain score is the change in test score between (typically consecutive) years. I also interpreted the label "2000-2009" as the average of eight gain scores, in other words, the average year-on-year change in test scores during those 10 years.

After thinking about what reader mankoff wrote, which prompted me to download the raw data, I realized that the designer did not compute gain scores. "2000-2009" really means the difference between the 2009 score and the 2000 score, ignoring all values between those end points. So mankoff is correct in saying that the 2009 number was used in both "2000-2009" and "2009-2015" computations.

This treatment immediately raises concerns. Why is a 10-year period compared to a 7-year period?

Andrew prefers to see the raw scores ("scale scores") instead of relative values. Here is the corresponding chart:


I placed a line at 2009, just to see if there is a reason for that year to be a special year. (I don't think so.) The advantage of plotting raw scores is that it is easier to interpret. As Andrew said, less abstraction. It also soothes the nerves of those who are startled that the lines for white students appear at the bottom of the chart of gain scores.

I suppose the reason why the original designer chose to use score differentials is to highlight their message concerning change in scores. One can nitpick that their message isn't particularly cogent because if you look at 8th grade math or reading scores, comparing 2009 and 2015, there appeared to be negligible change, and yet between those end-points, the scores did spike and then drop back to the 2009 level.

One way to mitigate the confusion that mankoff encountered in interpreting my gain-score graphic is to use "informative" labels, rather than "uninformative" labels.


Instead of saying the vertical axis plots "gain scores" or "change in scores," directly label one end as "no progress" and the other end as "more progress."

Everything on this chart is progress over time, and the stalling of progress is their message. This chart requires more upfront learning, after which the message jumps out. The chart of raw scores shown above has almost no perceptive overhead but the message has to be teased out. I prefer the chart of raw scores in this case.


Let me now address another objection, which pops up every time I convert a bar chart to a line chart (a type of Bumps chart, which has been called slope graphs by Tufte followers). The objection is that the line chart causes readers to see a trend when there isn't one.

So let me make the case one more time.

Start with the original column chart. If you want to know that Hispanic students have seen progress in their 4th grade math scores grind to a halt, you have to shake your head involuntarily in the following manner:


(Notice how the legend interferes with your line of sight.)

By the time you finish interpreting this graphic, you would have shaken your head in all of the following directions:


Now, I am a scavenger. I collect all these lines and rearrange them into four panels of charts. That becomes the chart I showed in yesterday's post. All I have done is to bring to the surface the involuntary motions readers were undertaking. I didn't invent any trends.