@apollo_0 on Twitter asks me to comment on this, by Scientific Britain:
Here's my comment:
A few weeks ago, the New York Times Upshot team published a set of charts exploring the relationship between school quality, home prices and commute times in different regions of the country. The following is the chart for the New York/New Jersey region. (The article and complete data visualization is here.)
This chart is primarily a scatter plot of home prices against school quality, which is represented by average test scores. The designer wants to explore the decision to live in the so-called central city versus the suburbs, hence the centering of the chart around New York City. Further, the colors of the dots represent average commute times, divided into two broad categories (under/over 30 minutes). The dots also have different sizes, which I presume measure the populations of each district (but there is no legend for this).
This data visualization has generated some negative reviews, and so has the underlying analysis. In a related post on the sister blog, I discuss the underlying statistical issues. For this post, I focus on the data visualization.
One positive about this chart is that the designer has a very focused question in mind - the choice between living in the central city and living in the suburbs. The line scatter has the effect of highlighting this particular question.
Boy, those lines are puzzling.
Each line connects New York City to a specific school district. The slope of the line is, nominally, the trade-off between home price and school quality. The slope is the change in home prices for each unit shift in school quality. But these lines don't really measure that tradeoff because the slopes span too wide a range.
The average person should have a relatively fixed home-price-to-school-quality trade-off. If we could estimate this average trade-off, it should be represented by a single slope (with a small cone of error around it). The wide range of slopes actually undermines this chart, as it demonstrates that there are many other variables that factor into the decision. Other factors are causing the average trade-off coefficient to vary so widely.
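The slope arithmetic can be made concrete with a minimal sketch. All the figures below are invented for illustration; none come from the Upshot data.

```python
# Sketch of the trade-off slope between two districts, using invented numbers.

def tradeoff_slope(price_a, grade_a, price_b, grade_b):
    """Change in median home price per one grade-level gain in school quality."""
    return (price_b - price_a) / (grade_b - grade_a)

# Hypothetical central city: median price $500k, school quality at the average
nyc = (500_000, 0.0)
# Hypothetical suburb: cheaper homes, schools one grade level ahead
suburb = (420_000, 1.0)

slope = tradeoff_slope(*nyc, *suburb)
print(slope)  # -80000.0: $80k LESS per grade level gained
```

A negative slope like this one is the down-and-to-the-right pattern that dominates the New York chart: cheaper homes and better schools at once, which is no trade-off at all.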
The line scatter is confusing for a different reason. It reminds readers of a flight route map. For example:
The first instinct may be to interpret the locations on the home-price-school-quality plot as geographical. Such misinterpretation is reinforced by the third factor being commute time.
Additionally, on an interactive chart, it is typical to hide the data labels behind mouseovers or clicks. I like the fact that the designer identifies some interesting locales by name without requiring a click. However, one slight oversight is the absence of data labels for NYC. There is nothing to click on to reveal the commute/population/etc. data for central cities.
In the sister blog post, I mentioned another difficulty - most of the neighborhoods are situated to the right and below New York City, challenging the notion of a "trade-off" between home price and school quality. It appears as if most people can spend less on housing and also send kids to better schools by moving out of NYC.
In the New York region, commute times may be the stronger factor relative to school quality. Perhaps families chose NYC because they value shorter commute times more than better school quality. Or, perhaps the improvement in school quality is not sufficient to overcome the negative of a much longer commute. The effect of commute times is hard to discern on the scatter plot as it is coded into the colors.
A more subtle issue can be seen when comparing San Francisco and Boston regions:
One key insight is that San Francisco homes are on average twice as expensive as Boston homes. Also, the variability of home prices is much higher in San Francisco. By using the same vertical scale on both charts, the designer makes this insight clear.
But what about the horizontal scale? There isn't any explanation of this grade-level scale. The central cities sit close to the average grade level in each chart, so it seems that each region is individually centered. Otherwise, I'd expect to see more variability in the horizontal positions of the dots across regions.
If one scale is fixed across regions, and the other scale is adapted to each region, then we shouldn't compare the slopes across regions. The fact that the lines are generally steeper in the San Francisco chart may be an artifact of the way the scales are treated.
Finally, I'd recommend aggregating the data, and not plotting individual school districts. The obsession with magnifying little details is a Big Data disease. On a chart like this, users are encouraged to click on individual districts and make inferences. However, as I discussed in the sister blog (link), most of the differences in school quality shown on these charts are not statistically meaningful (whereas the differences on the home-price scale are definitely notable).
If you haven't already, see this related post on my sister blog for a discussion of the data analysis.
I sketched out this blog post right before the Super Bowl - and was really worked up as I happened to be flying into Atlanta right after they won (well, according to any of our favorite "prediction engines," the Falcons had a 95%+ chance of winning it all a minute from the end of the 4th quarter!). What I'd give to be in the Super Bowl-winning city the day after the victory!
Maybe next year. I didn't feel like publishing about Super Bowl graphics while the wound was so raw. But now is the moment.
The following chart came from the Orange County Register in the run-up to the Super Bowl. (The bobble-head quarterbacks also came from the OCR.) The original article is here.
The choice of a set of dot plots is inspired. The dot plot is one of those under-utilized chart types - for comparing two or three objects along a series of metrics, it has to be one of the most effective charts.
To understand this type of design, readers have to collect three pieces of information: first, recognize which color or shape of dot represents which object being compared; second, understand the direction of the axis; third, recognize that the distance between paired dots encodes the size of the difference between the two objects.
The first task is easy enough here as red stands for Atlanta and blue for New England - those being the team colors.
The second task is deceptively simple. It appears that a ranking scale is used for all metrics with the top ("1st") shown on the left side and the bottom ("32nd") shown on the right. Thus, all 32 teams in the NFL are lined up left to right (i.e. best to worst).
Now, focus your attention on the "Interceptions Caught" metric, third row from the bottom. The designer indicated "Fewest" on the left and "Most" on the right. For those who don't know American football, an "interception caught" is a good defensive play; it means your defensive player grabs a ball thrown by the opposing team (usually their quarterback), causing a turnover. Therefore, the more interceptions caught, the better your defense is playing.
Glancing back at the chart, you learn that on the "Interceptions Caught" metric, the worst team is shown on the left while the best team is shown on the right. The same reversal happened with "Fumbles Lost" (fewest is best), "Penalties" (fewest is best), and "Points Allowed per Game" (fewest is best). For four of nine metrics, right is best while for the other five, left is best.
The third task is the most complicated. A ranking scale always has the weakness that a gap of one rank does not yield information on how important the gap is. It's a complicated decision to select what type of scale to use in a chart like this, and in this post, I shall ignore this issue, and focus on a visual makeover.
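That weakness is easy to demonstrate with made-up numbers: equal rank gaps can conceal very unequal value gaps. The points-per-game figures below are invented.

```python
# Invented points-per-game values for three teams. A rank-only chart like
# OCR's shows the ranks but loses the size of the underlying gaps.
ppg = {"A": 30.1, "B": 30.0, "C": 22.0}

ranked = sorted(ppg, key=ppg.get, reverse=True)
ranks = {team: i + 1 for i, team in enumerate(ranked)}

print(ranks)                               # {'A': 1, 'B': 2, 'C': 3}
print(round(ppg["A"] - ppg["B"], 1))       # 0.1  (rank gap of 1)
print(round(ppg["B"] - ppg["C"], 1))       # 8.0  (also a rank gap of 1)
```

Both pairs are one rank apart, yet one gap is trivial and the other is huge.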
I find the nine arrays of 32 squares, essentially the grid system, much too insistent, elevating information that belongs to the background. So one of the first fixes is to soften the grid system, and the labeling of the axes.
In addition, given the meaningless nature of the rank number (as mentioned above), I removed those numbers and used team logos instead. The locations on the axes are sufficient to convey the relative ranks of the two teams against the field of 32.
Most importantly, the directions of all metrics are now oriented in such a way that moving left is always getting better.
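One way to implement this reorientation: keep every metric on a 1-to-32 rank scale, and flip the rank for metrics where fewer is better. The set of flipped metrics follows the "fewest is best" metrics named in this post; the flipping rule itself is my sketch, not the designer's code.

```python
# Hypothetical re-orientation so that a lower displayed rank always means
# "better". Metrics where a LOWER raw count is better get flipped.
N_TEAMS = 32
LOWER_IS_BETTER = {"Fumbles Lost", "Penalties", "Points Allowed per Game"}

def display_rank(metric, rank_by_most):
    """rank_by_most: 1 = team with the highest raw count of this metric."""
    if metric in LOWER_IS_BETTER:
        return N_TEAMS + 1 - rank_by_most   # flip so the fewest ranks first
    return rank_by_most

print(display_rank("Interceptions Caught", 1))   # 1: most interceptions = best
print(display_rank("Penalties", 32))             # 1: fewest penalties = best
```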
While using logos for sports teams is natural, I ended up replacing those, as the size of the dots is such that the logos are illegible anyway.
The above makeover retains the original order of metrics. But to help readers address the key question of this chart - which team is better, the designer should arrange the metrics in a more helpful way. For example, in the following version, the metrics are subdivided into three sections: the ones for which New England is significantly better, the ones for which Atlanta is much better, and the rest for which both teams are competitive with each other.
In the Trifecta checkup (link), I speak of the need to align your visual choices with the question you are trying to address with the chart. This is a nice case study of strengthening that Q-V alignment.
This ABC News chart seems to have taken over the top of my Twitter feed, so I had better comment on it.
Someone at ABC News tried really hard to dress up the numbers. The viz is obviously rigged - Obama's bar at 79% should be about double the length of Trump's at 40%, but it's not even close!
In the Numbersense book (Chapter 1), I played the role of the Devious Admissions Officer who wants to game the college rankings. Let me play the role of the young-gun dataviz analyst, who has submitted the following chart to the higher-ups:
I just found out the boss blew a fuse after seeing my chart. The co-workers shot me dirty looks, saying without saying: "you broke it, you fix it!"
How do I clean up this mess?
Let me try the eye-shift trick.
The solid colors draw attention to themselves, and longer bars usually indicate higher or better values, so the quick reader may think that Obama is the worst and Trump the best at ... well, "Favorability on taking office," as the added title suggests.
Next, let's apply the foot-chop technique. This fits nicely on a stacked bar chart.
I wantonly drop 20% of dissenters from every President's data. Such grade inflation actually makes everyone look better, a win-win-win-win-win-win-win proposition. While the unfavorables for Trump no longer look so menacing, I am still far from happy as, with so much red concentrated at the bottom of the chart, eyes are focused on the unsightly "yuge" red bar, and it is showing Trump with 50% disapproval.
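The distortion can be quantified with a toy calculation that compresses the chop and the final restretching into one step. The percentages are illustrative, loosely based on the figures quoted in this post.

```python
# Toy illustration of the "foot-chop": remove a fixed 20-point slice of
# dissenters from every bar, then restretch the bar to full length.

def chopped_shares(favorable, unfavorable, chop=20):
    unf = unfavorable - chop                 # drop 20 points of dissenters
    total = favorable + unf
    # restretch so the bar again fills 100% of its length
    return round(100 * favorable / total, 1), round(100 * unf / total, 1)

print(chopped_shares(79, 21))  # an Obama-like bar: (98.8, 1.2)
print(chopped_shares(40, 50))  # a Trump-like bar: (57.1, 42.9)
```

Notice the distortion is not uniform: the bar that started near 80% favorable now looks nearly unanimous, while the bar that started at 40% only improves modestly - which is exactly why a single chop wasn't enough for our devious analyst.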
I desperately need the white section of the last bar to trump its red section. It requires the foot-ankle-knee-thigh treatment - the whole leg.
Now, a design issue rears its head. With such an aggressive cut, there would be no red left in any of the other bars.
I could apply two cuts, a less aggressive cut at the top and a more aggressive cut at the bottom.
The Presidents neatly break up into two groups, the top three Democrats, and the bottom four Republicans. It's always convenient to have an excuse for treating some data differently from others.
Then, I notice that the difference between Clinton and GW Bush is immaterial (68% versus 65%), making it awkward to apply different cuts to the two neighbors. No problem, I make three cuts.
The chart is getting better and better! Two, three, why not make it five cuts? I am intent on making the last red section as tiny as possible but I can't chop more off the right side of GHW Bush or Reagan without giving away my secret sauce.
The final step is to stretch each bar to the right length. Mission accomplished.
This chart will surely win me some admiration. Just one lingering issue: Trump's red section is still the longest of the group. It's time for the logo trick. You see, the right ends of the last two bars can be naturally shortened.
The logo did it.
Faking charts can take as much effort as making accurate ones.
The ABC News chart encompasses five different scales. For every President, some percentage of dissenters were removed from the chart. The amount of distortion ranges from 15% to 47% of respondents.
The story started with an editor of The Spectator, who went on twitter to make the claim that the map on the right is better than someone else's map on the left:
The raw data are percentages based on counts of voters, so the scale is continuous. In general, we discretize continuous data in order to improve comprehension. Discretizing means we lose granularity, which is often a good thing. The binary map on the left takes discretization to its logical extreme: every district is classified as either Brexit (more than 50% in favor) or Bremain (50% or more opposed). The map on the right uses six groups (three subgroups of Brexit and three subgroups of Bremain).
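Both discretizations are easy to sketch. The bin edges below are my guesses at a typical scheme, not the published cut-points.

```python
import bisect

def binary_class(leave_share):
    """Spectator-style binary map: Brexit if more than 50% voted Leave."""
    return "Brexit" if leave_share > 50 else "Bremain"

# A six-group scheme: three shades on each side of 50%. These cut-points
# are illustrative guesses, not the published ones.
EDGES = [40, 45, 50, 55, 60]
LABELS = ["Bremain 60%+", "Bremain 55-60", "Bremain 50-55",
          "Brexit 50-55", "Brexit 55-60", "Brexit 60%+"]

def six_group(leave_share):
    return LABELS[bisect.bisect_left(EDGES, leave_share)]

print(binary_class(50.1), six_group(50.1))   # both maps say Brexit, barely
print(binary_class(73.0), six_group(73.0))   # the six-group map adds intensity
```

The point of the comparison: the binary map collapses a 50.1% district and a 73% district into the same color, while the six-group map separates them.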
Then we deal with mapping of numbers to colors. The difference between these two maps is the use of hues versus shades. The binary map uses two hues, which is probably most people's choice since we are representing two poles. The map on the right uses multiple shades of one hue. Alternatively, Alberto favors a "diverging" color scheme in which we use three shades of two hues.
The editor of The Spectator claims that his map is more "true to the data." In my view, his statement applies in these two senses: the higher granularity in the scaling, and also, the fact that there is only one data series ("share of vote for Brexit") and therefore only one color.
The second point relates to polarity of the scale. I wrote about this issue before - related to a satisfaction survey designed (not too well) by SurveyMonkey, one of the major online survey software services. In that case, I suggested that they use a bipolar instead of unipolar scale. I'd rather describe my mood as somewhat dissatisfied instead of a little bit satisfied.
I agree with Alberto here in favor of bipolarity. It's quite natural to underline the Brexit/Bremain divide.
Given what I just said, why complain about the binary map?
We agree with the editor that higher granularity improves comprehension. We just don't agree on how to add granularity. Alberto tells his readers he likes the New York Times version:
This is substantively the same map as The Spectator's, except for 8 groups instead of 6, and two hues instead of one.
Curiously enough, I gave basically the same advice to the Times regarding their maps showing U.S. Presidential primary results. I noted that their use of two hues with no shades in the Democratic race obscures the fact that none of the Democratic primaries was a winner-take-all contest. Adding shading based on delegate votes would make the map more "truthful."
That said, I don't believe that the two improvements by the Times are sufficient. Notice that the Brexit referendum is one-person, one-vote. Thus, all of the maps above have a built-in distortion as the sizes of the regions are based on (distorted) map areas, rather than populations. For instance, the area around London is heavily Bremain but appears very small on this map.
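A toy calculation shows how much the weighting matters. All numbers below are invented; they are not the actual referendum figures.

```python
# Why a geographic map distorts a one-person-one-vote result: weight each
# region's Leave share by area (what the map shows) vs. by population
# (what the vote counts). All numbers are invented.

regions = [
    # (name, area_km2, population, leave_share_pct)
    ("Dense metro (Bremain)",  1_600, 8_500_000, 40.0),
    ("Rural area (Brexit)",   20_000, 1_500_000, 58.0),
]

def weighted_share(regions, weight_index):
    total_w = sum(r[weight_index] for r in regions)
    return round(sum(r[weight_index] * r[3] for r in regions) / total_w, 1)

print(weighted_share(regions, 1))  # area-weighted: 56.7, looks strongly Leave
print(weighted_share(regions, 2))  # population-weighted: 42.7, Remain-leaning
```

The map's visual impression tracks the first number; the election result tracks the second.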
The Guardian has a cartogram (again, courtesy of Alberto's post) which addresses this problem. Note that there is a price to pay: the shape of Great Britain is barely recognizable. But the outsized influence of London is properly acknowledged.
This one has two hues and four shades. For me, it is most "truthful" because the sizes of the colored regions are properly mapped to the vote proportions.
A friend asked me to comment on the following chart:
Specifically, he points out the challenge of trying to convey both absolute and relative metrics for a given data series.
This chart presents projections of growth in the U.S. mobile display advertising market. It is specifically pointing out that the programmatic segment of this market is growing rapidly (visualized as the black columns).
The blue and red lines then make a mess of the situation. Even though both of these lines express percentages, they refer to different scales. The red line represents growth rates while the blue line represents share of market.
Both of these metrics are relative metrics useful for interpreting the trend. The growth rates (red) interpret the dollar values on the basis of past values while the market shares (blue) interpret the dollar values on the basis of the total market.
It is rarely a good idea to have many scales on the same canvas. Focus on the blue line for a moment: the values it depicts almost double from one end to the other, yet the line appears much too gentle.
In the makeover, I expressed everything in the same scale (billions of dollars). I used side-by-side charts (small multiples) to isolate each trend that is found in the data. I allow readers to look at each individual segment of the market, and then examine how the individual trends affect the total market.
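Converting the two relative metrics back to dollars is simple arithmetic. Here is a sketch with hypothetical figures for a single year (not the actual market projections):

```python
# Put all three series on one scale: billions of dollars.
# Figures are hypothetical.
total_market = 20.0          # total mobile display market, $ billions
programmatic_share = 0.55    # the blue line: share of market
prior_programmatic = 8.0     # last year's programmatic spend, $ billions

programmatic = round(total_market * programmatic_share, 2)            # dollars
growth_rate = round((programmatic - prior_programmatic)
                    / prior_programmatic, 3)                          # red line
non_programmatic = round(total_market - programmatic, 2)

print(programmatic, non_programmatic, growth_rate)
```

Once the shares and growth rates are re-expressed in dollars, the small-multiples panels can all use the same vertical axis.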
One might argue that the stacked column chart by itself is sufficient. If there is a severe space limitation, I'd let go of the other two panels. However, having those panels makes the messages easier to obtain. This is particularly true of the steady growth assumption behind the programmatic spending trend (the orange columns).
Reader Jeffrey S. saw this graphic inside a Dec 2 tweet from the National Weather Service (NWS) in Phoenix, Arizona.
In a Trifecta checkup (link), I'd classify this as Type QV.
The problems with the visual design are numerous and legendary. The column chart where the heights of the columns are not proportional to the data. The unnecessary 3D effect. The lack of self-sufficiency (link). The distracting gridlines. The confusion of year labels that do not increment from left to right.
The more hidden but more serious issue with this chart is the framing of the question. The main message of the original chart is that the last two years have been the hottest two years in a long time. But it is difficult for readers to know if the differences of less than one degree from the first to the last column are meaningful since we are not shown the variability of the time series.
The green line asserts that 1981 to 2010 represents the "normal". It is unclear why that period is normal and the years 2011-2015 are abnormal. Maybe they are using the word "normal" in a purely technical sense to mean "average." If so, it is better just to say "average."
For this data, I prefer to see the entire time series from 1981 to 2015, which allows readers to judge the variability as well as the trending of the average temperatures. In the following chart, I also label the five years with the highest average temperatures.
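Selecting the five years to label is a one-liner once the series is in hand. The sketch below uses randomly generated stand-ins for the Phoenix temperatures, not the NWS data.

```python
import random

# Pick the five hottest years to label on the full 1981-2015 series.
# Temperatures here are random stand-ins for the actual Phoenix averages.
random.seed(0)
temps = {year: round(74 + random.uniform(-1.5, 2.5), 1)
         for year in range(1981, 2016)}

top5 = sorted(temps, key=temps.get, reverse=True)[:5]
print(sorted(top5))  # the five years to annotate on the chart
```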
It's gratifying to live through the incredible rise of statistics as a discipline. In a recent report by the American Statistical Association (ASA), we learned that enrollment at all levels (bachelor, master and doctorate) has exploded in the last 5-10 years, as "Big Data" gathers momentum.
But my sense of pride takes a hit while looking at the charts that appear in the report. These graphs demonstrate again the hegemony of Excel defaults in the world of data visualization.
Here are all five charts organized in a panel:
Chart #5 (bottom right) catches the eye because it is the only chart with two lines instead of three. You then flip to the prior page to find the legend, which tells you the red line is Bachelor and the green line is PhD. That seems wrong, unless biostats departments do not give out master's degrees.
This is confirmed by chart #2, where we find the blue line (Master) hugging zero.
Presumably the designer removed the blue line from chart #5 because the low counts mean that it fluctuates wildly between 0 and 100 percent and so disrupts the visual design. But the designer forgets to tell readers why the blue line is missing.
It turns out the article itself contradicts all of the above:
For biostatistics degrees, for which NCES started providing data specifically in 1992, master’s degrees track the overall increase from 2010–2014 at 47%...The number of undergraduate degrees in biostatistics remains below 30.
In other words, the legend is mislabeled. The blue line represents Bachelor while the red line, Master. (The error was noticed after the print edition went out because the online version has the correct legend.)
There is another mystery. Charts #2, #3, and #5, all dealing with biostats, have time starting from 1992, while Charts #1 and #4 start from 1987. The charts aren't lined up in a way that would allow comparisons across time.
Similarly, the vertical scale of each chart is different (aside from Charts #3 and #4). This design choice impairs comparison across charts.
In the article, it is explained that 1992 was when the agency started collecting data about biostatistics degrees. Between 1987 and 1992, were there no biostatistics majors? Were biostatistics majors lumped into the counts of statistics majors? It's hard to tell.
While Excel is a powerful tool that has served our community well, its flexibility is often a source of errors. The remedy to this problem is to invest ample time in overriding pretty much every default decision in the system.
This chart, a reproduction of Chart #1 above, was entirely produced in Excel.
The following chart caught my eye when it appeared in the Wall Street Journal this month:
This is a laborious design; much sweat has been poured into it. It's a chart that requires the reader to spend time learning to read.
A major difficulty for any visualization of this dataset is keeping track of the two time scales. One scale, depicted horizontally, traces the dates of Fed meetings. These meetings seem to occur four times a year except in 2012. The other time scale is encoded in the colors, explained above the chart. This is the outlook by each Fed committee member of when he/she expects a rate hike to occur.
I find it challenging to understand the time scale in discrete colors. Given that time has an order, my expectation is that the colors should be ordered. Adding to this mess is the correlation between the two time scales. As time treads on, certain predictions become infeasible.
Part of the problem is the unexplained vertical scale. Eventually, I realize each cell is a committee member, and there are 19 members, although two or three routinely fail to submit their outlook in any given meeting.
Contrary to expectation, I don't think one can read across a row to see how a particular member changes his/her view over time; it appears the cells within each column were sorted to keep the patches of color contiguous, which would break any row-by-member correspondence.
After this struggle, all I wanted was some learning from this dataset. Here is what I came up with:
There is actually little of interest in the data. The most salient point is that a shift in view occurred back in September 2012 when enough members pushed back the year of rate hike that the median view moved from 2014 to 2015. Thereafter, there is a decidedly muted climb in support for the 2015 view.
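The "median view" is just the median of the members' submitted years. A sketch with invented outlooks for the 19 committee members:

```python
import statistics

# Each member submits the year in which they expect the first rate hike.
# The before/after outlooks below are invented to mimic the Sept 2012 shift.
before = [2014] * 10 + [2015] * 9     # hypothetical pre-meeting outlooks
after = [2014] * 8 + [2015] * 11      # a few members push their view back

print(statistics.median(before))  # 2014
print(statistics.median(after))   # 2015
```

With 19 members, moving just a couple of outlooks across the boundary is enough to shift the median a full year, which is the one salient event in this dataset.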
This is an example in which plotting elemental data backfires. Raw data is the sanctuary of the incurious.
First, I saw Alberto tweet his design for the Wall Street Journal (below is the English version):
The yellow space is the size of the smallest "livable" apartment in Hong Kong, known as the "mosquito" apartment. Livability is defined by the real estate developers.
If you've lived in a tropical area like Hong Kong, you'll understand the obsession with mosquitoes. The itching for days! The sneaky little things that suck your blood!
In Manhattan, we seem to prefer the term "shoebox apartment." By comparison, it's not as scary. It's larger in size, too.
The graphic is fantastic as it offers comparisons to everyday spaces, like a NYC parking space and a basketball court, whose proportions many Americans know by feel.
This chart leads me down an unexpected path. I found a set of very powerful photos, commissioned by a humanitarian association in Hong Kong. Overwhelming. Here's one:
Yes, that is the entire living space for this family. All of forty square feet.
This article describes the project, as well as links to a number of other equally astounding photos.
These photos are unfair competition for any graphic designer.
Finally, I came across an inspiring, ingenious design. Gary Chang, an architect in Hong Kong, created his own apartment (344 square feet, almost nine times the size of the space in the photo, and twice as large as the mosquito apartment) in an amazing, space-saving design.
Through a series of movable walls, and beds, his apartment can be configured in 24 different ways. This is a small multiples layout!
Here is an article about his achievement, together with a video tour of his home. Not to be missed. It is the definition of making something out of nothing.
Here is a little graphic describing certain transformations: