One of my students analyzed the following Economist chart for her homework.
I was looking for it online, and found an interactive version that is a bit different (link). Here are three screenshots from the online version, for the years 2009, 2013 and 2018. The first and last snapshots correspond to the years depicted in the print version.
The online version is the self-sufficiency test for the print version. In testing self-sufficiency, we want to see if the visual elements (i.e. the circular sectors on the print version) pull their own weight. The quick answer is no. The reader can't tell how much in sales each sector represents, nor reliably estimate the relative scale of print versus ebook (pink/red vs. yellow/orange) or the year-to-year growth rates.
As usual, when we see the entire data set printed on the chart itself, it is a giveaway that the visual elements are mere ornaments.
The online version does not have labels unless you hover over the hemispheres. But again it is a challenge to learn anything from the picture.
In the Trifecta checkup, this is a Type V chart.
This particular dataset is made for the bumps-style chart:
This NYT graphic published on the eve of the Senate elections represents the best of data visualization: it carries its message with a punch.
The link to the web page is here. The graphic proudly occupied the front page of the print edition on Tuesday.
This graphic is not clichéd. The typical price of such novelty is that the chart has to come with a reader's manual. The beauty of this one is that the required manual is compact:
If you stick to the above, you will do fine.
If you start thinking the height of the area is the chance of winning, you run into trouble.
To turn this into the other style, draw a line through the 50-percent level, erase everything below 50, and then switch from line to area.
On the far right, where it says 75%, you can see that it sits precisely halfway between 50 and 100 percent. So the new chart breaks the start-at-zero rule for area charts.
Except... this is an ingenious violation of that rule. Like I said, if you can get your head around the idea that the area maps to lack of competitiveness (or, the amount of lead the leader has, regardless of who's leading), and suppress the urge to interpret the areas as the chance of winning, then the axis starting at 50 percent is not a problem. (I'm assuming that most of these races are in essence two-horse races. If there are more than two viable candidates, this particular chart form doesn't work.)
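To make the leader-regardless-of-party reading concrete, here is a minimal sketch in Python. The probability series is illustrative, not the actual NYT polling data:

```python
# Sketch of the mapping the chart uses. In a two-horse race, the area
# encodes the leader's standing, regardless of which party leads.
dem_win_prob = [90, 85, 70, 40, 25]  # hypothetical Democratic win probabilities (%)

# The chart plots the leader's probability, whoever the leader is:
leader_prob = [max(p, 100 - p) for p in dem_win_prob]

# The height of the shaded area above the 50-percent baseline is the lead:
lead_above_50 = [p - 50 for p in leader_prob]

print(leader_prob)    # the quantity the area chart displays
print(lead_above_50)  # the "lack of competitiveness" reading
```

Notice that the last two hypothetical points, 40 and 25, map to leads for the other party; the area simply changes color while the height keeps its meaning.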
The payoff is a very compact chart that shows a lot of data in a small space. The NH race was a lock for the Democrats at the start, but the lead kept dwindling, so that on the eve of the election it had been cut in half. Even halved, the chance of a Democratic win is still 75 percent.
Iowa and Colorado both flipped from a Democratic to a Republican lead around the middle of September.
When the visualization is driven well, the readers have an effortless ride.
I am mystified by the intention behind this chart, published in NYT Magazine (Sept 14, 2014).
It is not a data visualization since the circles were not placed to scale. The 650 and 660 should have been further to the right on a horizontal time scale. And if we were to take the radial time axis literally, the 390 circle would be closest to the center.
It is not a work of art. It doesn’t look particularly appealing. Sometimes, designers are inspired by imagery. The accompanying article concerns windshield wipers, and I’m not seeing the imagery.
The arrangement of the circles actually interferes with the reader's comprehension. Here is a straightforward version of the data as a column chart.
Now, let's turn it on its side, with time running vertically instead of horizontally (the convention).
Then, we invert convention once again by making the vertical axis run in reverse, so that time flows from top to bottom instead of bottom to top.
Finally, distort the frequency axis, replace the bars with circles, and you have essentially replicated the original.
The point is that each step obscures the pattern a little more. In this case, following conventions makes a better chart.
I have a pet peeve about presenting partial data next to complete data, even if it is labeled correctly. On this chart, the number 390 cannot be compared against any of the other numbers because we are not even halfway into the decade of the 2010s. Instead of plotting the total number of patents per decade, it would have been more useful to plot the number of patents per year in each decade: 43, 26, 65, 41, etc. For the 2010s, I am assuming they have data for 3.5 years.
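The per-year adjustment is simple arithmetic. A sketch, assuming the per-year figures above come from dividing each decade's total by the years of data available:

```python
# Per-year patent rates: decade totals divided by years of data.
# Totals for the complete decades are implied by the per-year figures
# quoted above (43, 26, 65, 41 correspond to totals of 430, 260, 650, 410).
complete_decade_totals = [430, 260, 650, 410]
per_year = [t / 10 for t in complete_decade_totals]  # [43, 26, 65, 41]

# The 2010s total of 390 covers only about 3.5 years of data:
per_year_2010s = 390 / 3.5  # roughly 111 patents per year
```

The adjusted rate makes the 2010s comparable to the complete decades, which the raw total of 390 does not.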
A simple column chart looks like this:
The per-year view shows that the 2010s are unusual. Of course, I should add a footnote to the chart to make it clear that we only have partial data for the 2010s, and that the assumption behind the averaging is that the pace of patents will remain the same, on average, for the remainder of the decade.
In the Trifecta Checkup, this is Type DV.
The chart on the top is the published one, depicting the quite dramatic flattening of the growth in average spending over recent years -- average being the total spend divided by the number of Medicare recipients. The other point of the story is that the decline is unexpected, in the literal sense that the Congressional Budget Office planners did not project its magnitude. (The planners did take the projections down over time, so they projected the direction correctly.)
Meanwhile, Cairo asked for a chart of total spend, and Kevin Quealy obliged with the chart shown at the bottom. It shows almost straight line growth.
Cairo's point is that the average does not give the full picture, and we should aim to "show all the relevant data".
I want to follow that line of thinking further.
My first reaction is Cairo did not say "show all the data", he said "show the relevant data". That is a crucial difference. For complex social problems like Medicare, and in general, for "Big Data", it is not wise to show all the data. Pick out the data of interest, and focus on those.
A second reaction. How can "relevance" be defined? Doesn't it depend on what the question is? Doesn't it depend on the interests and persuasion of the chart designer (or reader)? One of the key messages I wish to impart in my book Numbersense (link) is that reasonable people using uncontroversial statistical methods to analyze the same dataset can come to different, even opposite, conclusions.
Statistical analysis is concerned with figuring out what is relevant and what isn't. This is no different from Nate Silver's distinction between signal and noise. Noise is not just what is bad but also what is irrelevant.
In practice, you present what is relevant to your story. Someone else will do the same. The particular parts of the data that support each story may be different. The two sides have to engage each other, and debate which story has a greater chance of being close to the truth. If the "truth" can be verified in the future, the debate is more easily settled.
Unfortunately, there is no universal standard of relevance.
Going back to the NYT story. The chart on total Medicare spending is not as useful as it may seem. This is because an aggregate metric like this for a social phenomenon is influenced by a multitude of factors. Clearly, population growth is a notable factor here. When they use the word "real", I don't know if this means actualized (as opposed to projected), or "in real terms" (that is, inflation adjusted). If not the latter, the value of money would be another factor affecting our interpretation of the lines.
Without some reference levels for population and value of money, it is hard to interpret whether the straight-line growth implies higher or lower spending intensity. For the second chart, I suggest plotting the growth in the number of Medicare recipients. I believe one of the goals of the Affordable Care Act is to reduce the ranks of the uninsured so a direct depiction of this result is interesting.
The average spend can be thought of as population-adjusted. It is a more interpretable number -- but as Cairo pointed out, it is also narrow in scope. This is a tradeoff inherent in all of statistics. To grow understanding, we narrow the scope; but as we focus, we lose the big picture. So, we compile a set of focal points to paint a fuller picture.
The New York Times Upshot team came up with a dataviz that is worth your time. This is a set of maps that gives a perspective on migration patterns within the US. The metric being portrayed is the birthplace of current residents of each state.
Here is the chart for California:
I see a few smart ideas, starting with the little map on the bottom left. It serves multiple functions. It is a legend mapping colors to the four regions of the US. It is a visual guide to the definition of those regions. It is an interactive tool for selecting states. Readers might remember the use of a pie chart as a legend in my remake of one of the Wikipedia pie charts (link).
The aggregation up to regions is what really makes this chart work. This aggregation reduces the number of pieces from about 50 to about 10.
They also did a great job with the axes and gridlines. Most of the data labels are hidden, but the most important numbers are retained. These include the proportion of residents who were born in their home state, the proportion born outside the U.S., and any state(s) contributing a significant portion of residents. In the California example, we see that the proportion of Midwest-born people living in California has declined a lot over time.
Users can interactively hover over the gridlines to uncover the data labels.
As you scroll through the states, there are some recurring patterns.
Some states clearly have become more desirable over time. Georgia, for instance, has seen strong in-migration (colored pieces) especially from non-Southern states:
This pattern is repeated in other southeastern states, including Virginia, North Carolina and Tennessee.
By contrast, some states are not getting the migrants. As a result, the share of residents born in the home state has increased over time. The Midwestern states have this problem. For instance, Minnesota:
I also find a few states with special features. Nevada has always been a state of migrants:
Wyoming, on the other hand, has become popular with migrants over time, but the composition has shifted away from the Midwest states.
I'd have preferred presenting the charts in clusters based on patterns.
I haven't been able to figure out the multi-color spaghetti. I think the undulations are purely for aesthetic reasons.
One way to read the chart, then, is to first see three big patches (light grey for born in current state; white patch for born in other U.S. states; dark gray for born outside the U.S.). Within the white patch, we are looking for the shift between the colors (i.e. regions).
Vox published this chart:
This sort of chart is, unfortunately, quite common in business circles. Just about the only thing one can read readily from this chart is the overall growth in the plug-in vehicle market (the heights of the columns).
To fix this chart, start subtracting. First, we can condense the monthly data to quarterly:
This version is a bit less busy but there are still too many colors, and too many things to look at.
Next, we can condense the makes of the vehicles and focus on the manufacturers:
This version is even less busy and more readable. We can now see that Chevrolet, Nissan, Toyota, Ford and Tesla are the five biggest manufacturers in this category. All the small brands have been aggregated into an "Others" category. The stacked column chart still makes it hard to know what's going on with each individual brand's share, other than the one brand situated at the bottom of the stack.
Next, we switch to a line chart:
This shows the growth in the overall market, as well as several interesting developments:
A smoothed version of the line chart is even more readable:
Graphics is a discipline that often rewards subtracting. Less is more.
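The condensing steps can be sketched with pandas. The table, numbers, and column names below are hypothetical, not Vox's actual data:

```python
import pandas as pd

# Hypothetical monthly plug-in sales by make (illustrative numbers only).
df = pd.DataFrame({
    "month": pd.to_datetime(["2013-01-15", "2013-02-15", "2013-04-15", "2013-01-15"]),
    "manufacturer": ["Chevrolet", "Chevrolet", "Chevrolet", "Nissan"],
    "make": ["Volt", "Volt", "Volt", "Leaf"],
    "units": [1100, 1600, 1400, 650],
})

# Step 1: condense the monthly data to quarterly.
df["quarter"] = df["month"].dt.to_period("Q")

# Step 2: roll the individual makes up to manufacturers (a fuller version
# would also lump the small brands into an "Others" category).
quarterly = df.groupby(["quarter", "manufacturer"])["units"].sum().unstack()

# Step 3: plot one line per manufacturer instead of stacked columns,
# e.g. quarterly.plot() with matplotlib installed.
```

Each step throws away detail that the stacked monthly columns forced the reader to wade through, which is the subtraction the discussion above advocates.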
In the above discussion, I focused on the Visual aspect of the Trifecta Checkup. This dataset is really difficult to interpret, and I'd not want to visualize it directly.
The real question we are after is to assess which manufacturer is leading the pack in plug-in vehicles.
There are a number of obstacles in our path. Different makes launched at different times, and it takes many months for a new make to establish itself in the market. Thus, comparing a make that just launched with another that has been in the market for twelve months is a problem.
Also, makes are of different vehicle types: compacts, SUVs, sedans, etc. More expensive vehicles will have fewer sales whether they are plug-ins or not.
Thirdly, population grows over time. The analyst would need to establish growth that is above the level of population growth.
It's a nice small-multiples setup with two tabs, one showing the states in order of descending spend and the other, alphabetical.
In the article itself, they excerpt the top of the chart containing the states that have suspiciously high per-patient spend.
Several types of comparisons are facilitated: comparison over time within each state, comparison of each state against the national average, comparison of trend across states, and comparison of state to state given the year.
The first comparison is simple as it happens inside each chart component.
The second type of comparison is enabled by the orange line being replicated on every component. (I'd have removed the columns from the first component as they are both redundant and potentially confusing, although I suspect that the designer may have needed them for technical reasons.)
The third type of comparison is also relatively easy. Just look at the shape of the columns from one component to the next.
The fourth type of comparison is where the challenge lies for any small-multiples construction. It is also the secret of this chart. If you mouse over any year in any component, every component highlights that particular year's data, so that one can easily make state-by-state comparisons. Like this for 2008:
You see that every chart now shows 2008 on the horizontal axis and the data label is the amount for 2008. The respective columns are given a different color. Of course, if this is the most important comparison, then the dimensions should be switched around so that this particular set of comparisons occurs within a chart component--but obviously, this is a minor comparison so it gets minor billing.
I love to see this type of thoughtfulness! This is an example of using interactivity in a smart way, to enhance the user experience.
The Boston subway charts I featured before also introduce interactivity in a smart way. Make sure you read that post.
Also, I have a few comments about the data analysis on the sister blog.
Announcement: I'm giving a free public lecture on telling and finding stories via data visualization at NYU on 7/15/2014. More information and registration here.
The Economist states the obvious: the current World Cup is atypically high-scoring (or poorly defended, for anyone who's never been bothered by goal counts). They dubiously dub it the Brazil effect (link).
Perhaps in a sly vote of dissent, the graphic designer came up with this effort:
(Thanks to Arati for the tip.)
The list of problems with this chart is long but let's start with the absence of the host country and the absence of the current tournament, both conspiring against our ability to find an answer to the posed question: did Brazil make them do it?
It turns out that without 2014 on the chart, the only other year in which Brazil hosted a tournament was 1950. But 1950 is not even comparable to the modern era. In 1950, there was no knock-out stage. There were four groups in the group stage, but of unequal sizes: two groups of four, one group of three, and one group of two. Then, four teams were selected to play a round-robin final stage. This format is so different from today's that I find it silly to try to place them on the same chart.
These data simply provide no clue as to whether there is a Brazil effect.
The chosen design is a homework assignment for the fastidious reader. The histogram plots the absolute number of drawn matches. The number of matches played has tripled from 16 to 48 over those years so the absolute counts are highly misleading. It's worse than nothing because the accompanying article wants to make the point that we are seeing fewer draws this World Cup compared to the past. The visual presents exactly the opposite message! (Hint: Trifecta Checkup)
Unless, that is, you realize this is a homework assignment: take the row of numbers listed below the Cup years and compute the proportion of draws yourself. BYOC (Bring Your Own Calculator). Now, pay attention, because you want to use the numbers in parentheses (the number of matches), not the first number (the number of teams).
Further, don't get too distracted by the typos: in both 1982 and 1994, there were 24 teams playing, not 16 or 32. The number of matches (52 in each case) is correctly stated.
Wait, the designer provides the proportions at the bottom of the chart, via this device:
As usual, the bubble chart does a poor job conveying the data. I deliberately cropped out the data labels to demonstrate that the bubble element cannot stand on its own. This element fails my self-sufficiency test.
I find the legend challenging as well. The presentation should be flipped: look at the proportion of ties within each round, instead of looking at the overall proportion of ties and then breaking those ties down by round.
The so-called "knockout round" has taken many formats over the years. In the early years, there were often two round-robin stages, followed by a smaller knockout round. Presumably the second round-robin stage has been classified as the "knockout stage".
Also notice the footnote, stating that third-place games are excluded from the histogram. This is exactly how I would do it too, because the third-place match is a dead rubber, in which no rational team would want to play extra time and a penalty shootout.
The trouble is inconsistency. The number of matches shown underneath the chart includes that third-place match so the homework assignment above actually has a further wrinkle: subtract one from the numbers in parentheses. The designer gets caught in this booby trap. The computed proportion of draws displayed at the bottom of the chart includes the third-place match, at odds with the histogram.
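The homework, with the third-place wrinkle included, amounts to the following. The counts here are illustrative, not values read off the Economist chart:

```python
# Reader's homework: turn the histogram's absolute draw counts into
# proportions. Illustrative counts only.
matches_listed = 64       # the number in parentheses under a Cup year
draws_in_histogram = 14   # read off the histogram bar

# The listed match count includes the third-place game, which the
# histogram excludes -- hence the subtract-one wrinkle:
proportion_of_draws = draws_in_histogram / (matches_listed - 1)
print(round(proportion_of_draws, 3))
```

Dividing by the unadjusted match count is the booby trap the designer fell into at the bottom of the chart.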
Here is a revised version of the chart:
A few observations are in order:
Another reason for separate treatment is that the knockout stage has not started yet in 2014 when this chart was published. Instead of removing all of 2014, as the Economist did, I can include the group stage for 2014 but exclude 2014 from the knockout round analysis.
In the Trifecta Checkup, this is Type DV. The data do not address the question being posed, and the visual conveys the wrong impression.
Finally, there is one glaring gap in all of this. Some time ago (the football fans can fill in the exact timing), FIFA decided to award three points for a win instead of two. This was a deliberate effort to increase the point differential between winning and drawing, supposedly to reduce the chance of ties. Any time-series exploration of the frequency of ties would clearly have to look into this issue.
A graphic illustrating how Americans spend their time is a perfect foil to make the important case that the reader's time is a scarce resource. I wrote about this at the ASA forum in 2011 (link).
The visual form is a treemap displaying the results of the recently released Time Use Survey (link to pdf).
What does the designer want us to learn from this chart?
What jumps out first is the importance of various activities, starting with sleep, then work, TV, leisure/sports, etc.
If you read the legend, you'll notice that the colors mean something. The blue activities take up more time in 2013 compared to 2003. Herein, we encounter the first design hiccup.
The size of the blocks (which codes the absolute amount) and the color of the blocks (which codes the relative change in the amount) compete for our attention. According to Bill Cleveland's research, size is perceived more strongly than color. Thus, the wrong element wins.
Next, if we have time on our hands, we might read the data labels. Each block has two labels, the absolute values for 2003 and for 2013. In this, the designer is giving an arithmetic test. The reader is asked to compute the change in time spent in his or her head.
It appears that the designer's key message is "Aging Americans sleep more, work less", with the subtitle "TV remains No.1 hobby".
Now compare the treemap to this set of "boring" bar charts.
This visualization of the same data appears in WSJ online in lieu of the treemap. Here, the point of the article is made clear; the reader need not struggle with mental gymnastics.
(One can grumble about the red-green color-blindness blindness but otherwise, the graphic is pretty good.)
When I see this sort of data, I like to make a Bumps chart. So here it is:
The labeling of the smaller categories poses a challenge because the lines are so close together. However, those numbers are so small that none of the changes would be considered statistically significant.
From a statistical/data perspective, a very important question must be raised. What is the error bar around these estimates? Is there anything meaningful about an observed difference of fewer than 10 minutes?
Amusingly, the ATUS press release (link to pdf) has a technical note that warns us about the reliability of estimates, but nowhere in the press release can one actually find the value of the standard error, a confidence interval, etc. After emailing them, I did get the information promptly. The standard error of one estimate is roughly 0.025-0.05 hours, which means that the standard error of a difference is roughly 0.05-0.1 hours, which means that a confidence interval around any estimated difference is roughly 0.1-0.2 hours, or 6-12 minutes.
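The interval arithmetic can be sketched as follows. Note that the exact rule for the standard error of a difference of two independent estimates is root-sum-of-squares, about 1.4 times a single SE, so doubling it is a conservative round-up:

```python
import math

# Standard error of a single estimate, per the ATUS technical note.
se_low, se_high = 0.025, 0.05  # in hours

# SE of a difference of two independent estimates: sqrt(se1^2 + se2^2),
# i.e. sqrt(2), roughly 1.4, times a single SE.
se_diff = [math.sqrt(2) * se for se in (se_low, se_high)]

# Approximate 95% confidence half-width: about 2 standard errors,
# converted to minutes (60 minutes per hour).
half_width_min = [2 * se * 60 for se in se_diff]  # roughly 4.2 to 8.5 minutes
```

Either way, differences of only a few minutes between 2003 and 2013 fall within the margin of error.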
Except for the top three categories, it's hard to know whether the reported differences are anything more than sampling variability.
A further problem with the data is their detachment from reality. There are two layers of averaging going on, once over the population and once over time. In reality, not everyone does each of these things every day. This dataset is really only interesting to statisticians.
So, in a Trifecta Checkup, the treemap is a Type DV and the bar chart is a Type D.