Via Twitter, Antonio Rinaldi sent me the following chart, which accompanied a New York Times piece about the CPI (inflation index). The article concerns a very important topic--that many middle- to lower-income households have barely any savings left after spending on necessities--and only touches upon the issue raised by this chart: the official CPI is an average of prices over a basket of goods, and there is much variability in the price changes across different categories of goods.
I cover this subject in much greater detail in Chapter 7 of Numbersense (link). There are many reasons why the official inflation rate seems to diverge from our own experiences. One reason is that we tend to notice and worry about price increases, but we overlook or take for granted price decreases. In the book, I cover the fascinating subject of the psychology of remembering prices. Obviously, this is a subject of utmost importance if we are to use surveys to understand perceived prices.
The price of an unbranded T-shirt has stayed flat, or may even have declined, over the last few decades. Meanwhile, the chart reveals that phones and accessories, computers and televisions have all enjoyed deflation over the last decade. Actually, much of this "deflation" is due to a controversial adjustment known as "hedonics," which attributes part of any price change to product or technology improvements. So if you pay the same price for an HDTV today as you did for a standard-definition TV in the past, then in effect the price you paid today is lower than the price you paid in the past.
That adjustment is reasonable only to a certain extent. For instance, my cell phone company stuffs my plan with hundreds of unused and unusable minutes, so on a per-minute basis, I am sure prices have come down substantially; on a per-used-minute basis, I'm not so sure.
Let's get to what we care about on this blog... the visual. There is one big puzzle embedded in this chart. Look at the line for televisions. It dipped below -100 percent! Like Antonio, many readers should be scratching their heads--did the price of televisions go negative? Did the hedonic adjustment go bonkers?
As an aside, I don't like the current NYT convention of hiding too many axis labels. What period of time is this chart depicting? You'd only find out by reading the label of the vertical axis! I mentioned something similar the other day.
The key to understanding a chart like this is to learn what is being plotted. The first instinct is to assume it shows the change in prices over time. A quick glance at the vertical-axis label corrects that misunderstanding. It reads: "Change in prices relative to a 23% increase in price for all items, 2005-2014".
This label is doing a lot of work--probably too much for its inconspicuous location and unbolded, uncolored status.
Readers have to know that the official CPI is a weighted average of changes in prices of a specified basket of goods. Some but not all of the components are being graphed.
Then readers have to understand that there is an index of an index. The prices of each "item" (i.e. category or component of the CPI) are indexed to 1984 levels. So the television index is first re-indexed to 2005 as the baseline. This establishes a growth trajectory for televisions. But this is not what is being depicted.
Here is what the chart would have looked like if we plotted the growth of the television index (red), the apparel index and the all-items index (blue).
The blue line reflects the 23% average increase in prices in that 10-year period. Notice that the red line does not exhibit any weirdness--television prices have gone down by 90 percent. It's not negative.
What the designer tried to do is to index this data another time. Think of pulling the blue line down to the horizontal axis, and then see what happens to the gray and red lines.
*** Now, even this index on an index should not present a mathematical curiosity. If all items moved to 1.23 while apparel moved to 1.10, you might compute 110%/123%, which is roughly 0.89. You'd say the apparel index got about 90% of the way to where the all-items index went. Similarly for TVs, you would compute 10%/123%, which is about 0.08. That would say the TV index ended up at 8% of where the all-items index landed.
That still doesn't yield -100%. The clue here is that the baseline is zero percent, not 100, not 1.0. If an item moved in sync with all items, its trajectory would have been horizontal at zero percent. That means the second index is not a division but a subtraction. So for TVs, it's -90% - 23% = -113%. For apparel, it's +10% - 23% = -13%.
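The two candidate computations are easy to verify with a quick calculation. This is a sketch of my reverse-engineered formulas, using approximate values read off the chart, not the NYT's published methodology:

```python
# Reverse-engineering the chart's second layer of indexing.
# Price changes over 2005-2014 (approximate readings off the chart).
all_items = 0.23    # all-items CPI rose 23%
apparel   = 0.10    # apparel index rose 10%
tv        = -0.90   # television index fell 90%

def subtract_index(item, base):
    """Subtraction version: item change minus all-items change."""
    return item - base

def divide_index(item, base):
    """Division version: ratio of the two growth indices."""
    return (1 + item) / (1 + base)

# Only subtraction reproduces the below--100% dip seen on the chart.
print(round(subtract_index(tv, all_items), 2))       # -1.13, i.e. -113%
print(round(subtract_index(apparel, all_items), 2))  # -0.13, i.e. -13%
print(round(divide_index(tv, all_items), 2))         # 0.08
print(round(divide_index(apparel, all_items), 2))    # 0.89
```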
Even though I reverse-engineered the chart, I don't understand the reason for using subtraction rather than division for the second layer of indexing. It's strange to me to add or subtract two indices that have different baseline quantities.
Here is the same chart but using division:
I usually avoid telescoping indices. They are more trouble than they're worth. Here is an old post on the same subject.
Back in 2009, I wrote about a failed attempt to visualize regional dialects in the U.S. (link). The raw data came from Bert Vaux's surveys. I recently came across some fantastic maps based on the same data. Here's one:
These maps are very pleasing to look at, and also very effective at showing the data. We learn that Americans use three major words to describe what others might call "soft drinks". The regional contrast is the point of the raw data, and Joshua Katz, who created these maps while a grad student at North Carolina State, did wonders with the data. (Looks like Katz has been hired by the New York Times.)
What more evidence do we need that effective data visualization brings data alive... the corollary being bad data visualization takes the life out of data!
Look at the side by side comparisons of two ways to visualize the same data. This is the "soft drinks" question:
And this is the "caramel" question:
The set of maps referred to in the 2009 post can be found here.
Now, the maps on the left are more faithful to the data (at the zip-code level), while Katz applies smoothing liberally to achieve the pleasing effect.
Katz has a poster describing the methodology -- at each location on the map, he averages the closest data. This is why the white areas on the left-side maps disappear from Katz's maps.
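As a rough sketch of the idea (with hypothetical coordinates and answers, not Vaux's actual data or Katz's exact kernel method), one can assign every map location the most common answer among its nearest survey points:

```python
# Sketch of nearest-neighbor smoothing over survey points (illustrative data).
from collections import Counter
from math import hypot

# Hypothetical (x, y, answer) survey responses.
responses = [
    (0.0, 0.0, "soda"), (0.2, 0.1, "soda"), (1.0, 1.0, "pop"),
    (1.1, 0.9, "pop"), (2.0, 0.0, "coke"), (2.1, 0.2, "coke"),
]

def smoothed_answer(x, y, k=3):
    """Return the most common answer among the k nearest survey points."""
    nearest = sorted(responses, key=lambda r: hypot(r[0] - x, r[1] - y))[:k]
    return Counter(r[2] for r in nearest).most_common(1)[0][0]

# Even a location with no direct respondents gets a predicted answer,
# which is why the white (no-data) areas vanish from the smoothed maps.
print(smoothed_answer(0.1, 0.05))  # soda
```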
The dot notation on the left-side maps has a major deficiency, in that it is a binary element: the dot is either present or absent. We lose the granularity of how strongly the responses lean toward that answer. This may be why, in both examples, several of the heaviest patches on Katz's maps correspond to relatively sparse regions on the left-side maps.
Katz also tells us that his maps use only part of the data. For each point on his maps, he only uses the most frequent answer; in reality, there is a proportion of respondents for each of the available choices. Dropping the other responses is not a big deal if the responses are highly concentrated on the top choice, but if the responses are evenly split, say between the top two choices, then using only the top choice presents a problem.
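To see why, compare two hypothetical zip codes (invented proportions):

```python
# Why plotting only the top answer can mislead (invented proportions).
concentrated = {"soda": 0.85, "pop": 0.10, "coke": 0.05}
split        = {"soda": 0.36, "pop": 0.34, "coke": 0.30}

def top_answer(props):
    """The only thing the smoothed map keeps: the plurality choice."""
    return max(props, key=props.get)

def margin(props):
    """What the map drops: the gap between the top two choices."""
    ranked = sorted(props.values(), reverse=True)
    return ranked[0] - ranked[1]

# Both locations get the same color on the map...
print(top_answer(concentrated), top_answer(split))  # soda soda

# ...even though one is a landslide and the other a near tie.
print(round(margin(concentrated), 2), round(margin(split), 2))  # 0.75 0.02
```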
I think they tried to simplify the scale but ended up making a mess.
Tufte preaches getting rid of all unnecessary ink but sometimes, you go overboard.
I had a tough time understanding the scale of this chart. In particular, it is hard to figure out what the numbers at the top of the chart represent, since all six data labels fall in the middle of the chart, below the middle gridline. There is no vertical axis, and not enough gridlines to easily see what levels the three white lines represent. Another question is whether the vertical axis starts at zero.
So I tried drawing in reference lines (first mentally but eventually I needed them physically):
After this, it still took a few minutes to see that the gridlines sit at 25, 50 and 75 percent, so this chart actually does start at zero. Without the axis labels, there is no way to tell! The numbers near the top of the chart are in the seventies.
I am convinced now that the individual charts share the same vertical scale. (Sometimes, putting charts on different scales is preferred, as is the case here.)
To summarize, a number of design elements were taken out of these charts:
Each of these tactics, if done separately, is a best practice. All three together create a barrier to comprehension.
Finally, I should note the hybrid dot and line plot utilized here. It's a clever idea. The lines only appear when there is a rather large swing from one data point to the next, and it neatly draws attention to where the big shifts are.
Carl Bialik used to be the Numbers Guy at the Wall Street Journal; he's now with FiveThirtyEight. Apparently, he left a huge void. John Eppley sent me this set of charts via Twitter.
This chart about Citibike is very disappointing.
Using the Trifecta Checkup, I first notice that it addresses a stale question and produces a stale answer. The caption below the chart says "the peak times ... seem to be around 9 am and 6 pm." What a shock!
I sense a degree of meekness in using "seem to be". There is not much to inspire confidence in the data: rather than the full statistics, which you'd think someone at Citibike has, the chart is based on "a two-day sample last autumn". The number of days is less concerning than whether those two autumn days are representative of the year. Curious readers might want to know what data was collected, how it was collected, and the sample size.
Finally, the graph makes a mess of the data. While the black line appears to be data-rich, it is not. In fact, the blue dots might as well be randomly scattered and connected. As you can see from the annotations below, the scale of the chart makes no sense.
Plus, the execution is sloppy, with a missing data label.
The next chart is not much better.
The biggest howler is the choice of pie charts to illustrate three numbers that are not that different.
But I have to say the chart raises more questions than it answers. I am not an expert in pregnancy, but doesn't a pregnant woman's weight include the weight of the baby she's carrying? So the more weight the woman gains, on average, the heavier her baby. What a shock!
The last and maybe the least is this chart about basketball players in the playoffs.
It's the dreaded bubble chart. The players are arranged in a perplexing order. I wonder if there is a natural numbering system for basketball positions (center = #1, etc.), like there is in soccer. Even if there is such a natural numbering system, I still question the decision to confound that system with a complicated ranking of current-year playoff players against all-time players.
Above all, the question being asked is uninteresting, and so the chart is uninformative. A more interesting question to me is whether the best players are playing in this year's playoff. To answer this question, the designer should be comparing only currently active players, and showing the all-time ranks of those players who are playing in the playoffs versus those who aren't.
Reading Alberto Cairo’s fabulous book, The Functional Art, feels like reading my own work. It’s staggering how closely aligned our sensibilities are, notwithstanding our disparate backgrounds, he a data journalist by training, and I a statistician. We probably can finish each other’s sentences—and did at this recent Analytically Speaking webcast (link to clip).
Cairo currently teaches data visualization at the University of Miami; this is after a distinguished career as a data/visual journalist, having won many awards.
The Functional Art is divided into halves, which can be read independently.
The front part is a terrific overview of data visualization concepts. Cairo’s interest is in principles, rather than recipes. The field of data visualization has developed separately under three academic disciplines: design, computer science, and statistics. Inevitably, the work products contain contradictions and much re-invention. Cairo achieves a synthesis of these schools of thought, and this book is the clarion call for more work on unifying the key intellectual threads of the field.
The second half contains a series of interviews with industry luminaries. This section is a unique contribution to the literature, offering a glance behind the scenes of the craft. Practitioners will find these short pieces illuminating and profitable. It is often a long journey to arrive at the graphic in print. The selection of designers emphasizes mainstream media outlets, although the interviewees have wide-ranging views.
Included in these pages are plenty of published data graphics, frequently work that Cairo produced while working for the Brazilian publication, Epoca. These graphics are elaborate and ambitious, and nicely reproduced in color images. They reward detailed study, with attention to composition, narrative structure, chart types, selection of statistics, etc.
There are plenty of books on the market about how to do graphics (Dona Wong, Naomi Robbins, Nathan Yau come to mind.) Cairo’s book is not about doing, but about thinking about charts. Trust me, time spent thinking about charts will make your charts much better.
I will now describe some sections of the book that particularly hold my interest:
In Chapter 3, Cairo explains the “visualization wheel,” a nice way to visualize the decisions that designers make when creating charts. Each decision is presented as a trade-off between two extremes. For example, a chart can be “light” or “dense.” This axis evokes Tufte’s data-ink ratio. Devices such as this wheel are useful for integrating the diverse viewpoints that coexist in our field. Frequently, these trade-off decisions are made implicitly—but they can really benefit from explicit consideration.
Figure 4.11 is one of the Epoca charts narrating a Brazilian election. Just recently, I linked to Cairo’s blog post about a similar chart. In both, a spider (radar) plot features prominently. On the same chart, you’ll find a nice demonstration of the small-multiples principle. I applaud the publisher of Epoca for supporting such deep data graphics.
Chapter 8 is invaluable in documenting the chart-making process. Trial and error is a key element of this process. Here, Cairo shows some of the earlier drafts of projects that eventually went to publication. This material is similar to what Kevin Quealy shows at his ChartNThings blog about New York Times graphics.
Chapter 9 is one of the more mature discussions of interactive graphics I have seen. Too often, interactivity is reduced to a feature that is layered onto any dataset. It should rightfully be seen as a problem of design.
Figure 10.1 is not strictly speaking a “data” graphic but I love John Grimwade’s visual explanation of the “transatlantic superhighway”.
I highlighted the columns for 1993 and 1996. Visually, the height of one column is twice that of the other column. And yet the axis labels tell us that the difference is 65% versus 62.5%.
The reason for the start-at-zero rule is to avoid exaggerating meaningless differences.
To judge whether a change is meaningful in time-series data like this, we have to use history to understand the general variability in college enrollment rates. Based on the roughly 20 years of data shown, the college enrollment rate hovers between 60 and 70 percent. There is no data between 0 and 60 percent; those are irrelevant values for this series. This is why starting at zero is counterproductive.
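The degree of exaggeration is easy to quantify. Here is a quick check, using the 65% and 62.5% labels and a hypothetical truncated axis starting at 60%:

```python
# How a truncated axis exaggerates a small difference (illustrative numbers).
col_a, col_b = 65.0, 62.5   # enrollment rates read off the chart
axis_start = 60.0           # hypothetical truncated-axis baseline

true_ratio = col_a / col_b                                    # what the data say
apparent_ratio = (col_a - axis_start) / (col_b - axis_start)  # what the eye sees

print(round(true_ratio, 2))      # 1.04 -- a 4% difference
print(round(apparent_ratio, 1))  # 2.0  -- one column looks twice as tall
```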
Here is the line chart starting at zero:
This display has the unintended effect of squashing meaningful changes over time by inserting a lot of empty space below the line.
For those who don't use an iPhone, what you are staring at is the new keyboard. Is the SHIFT key on or off?
Most of us who use the iPhone can't tell you either. It's been confusing and exasperating.
The answer is when the SHIFT key is gray, it is off. When the SHIFT key is white, as shown in the following image, it is ON.
This design plays games with our heads. We see all the white letter keys, and none of them are pressed, so we assume white keys are not pressed. This is especially annoying when we are entering names into a text box. Typically, the app developer saves us a keystroke by pre-pressing the SHIFT key. But when we see a white SHIFT key, our heads tell us it is not pressed, so our fingers press it to turn it gray, and then we learn that we just turned off the SHIFT key.
Here's the issue. Even after months of using this keyboard, and capitalizing words daily, I still haven't gotten used to it. I keep getting confused and frustrated. The knowledge in my head just won't go away.
This is not a rant. This is a lesson for graphics designers.
Reader and tipster Chris P. found this "death spiral" chart dizzying (link).
It's one of those charts that has conceptual appeal but does not do the data justice. As the name implies, the designer has a strong message, that the arctic sea ice volume has dramatically declined over time. This message is there in the chart but the reader has to work hard to find it.
Why doesn't this spider chart work? We can be more precise.
A big problem is the lack of scalability. This chart looks different every year. If you add an extra year to the chart, you either have to increase the density of the years or you have to drop the earliest year.
Years are not circular or periodic so the metaphor doesn't quite work.
Axis labeling is also awkward. Because of the polar coordinates, the axes are radiating so the numbers run up toward the top but run down toward the bottom.
This specific instance of the spider chart benefits from well-behaved data: the between-year variability is much lower than the within-year variability. As a result, the lines don't cross each other much. If the variability from year to year fluctuated more, we would have seen a bunch of noodles.
This is a pity because the designer did very well in aligning two corners of the Trifecta Checkup, namely what is the question and what does the data show? It is a great idea to control for month of year, and look at year to year changes. (A more typical view would be to look at month to month changes and plot one line per year.)
This is an example of a chart that does well on one side of the checkup but fails because the graph isn't in tune with the data or the question being addressed.
Whenever I see a spider chart, I want to unroll the spiral and see if a line chart is better. Thus:
The dramatic decrease in Arctic ice volume (no matter the month) is clear as day. You can actually read off the magnitude of the drop. (Try doing that in the spider chart, say between 1978 and 1995.)
This chart still has issues, namely too many colors. One can color the lines by season of the year, like this:
Or switch to a small-multiples set up with three lines per chart and one chart per season.
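The unrolling itself is just a reshaping of the data. Here is a sketch with made-up numbers standing in for the ice-volume series (the month-to-season mapping is my own assumption):

```python
# Reshaping (year, month, volume) records so each month becomes one line
# in a line chart -- made-up numbers standing in for the ice-volume data.
from collections import defaultdict

records = [  # (year, month, volume in 1000 km^3), illustrative only
    (1980, 3, 32.9), (1980, 9, 17.2),
    (1995, 3, 30.1), (1995, 9, 13.5),
    (2012, 3, 27.0), (2012, 9, 3.6),
]

SEASONS = {12: "winter", 1: "winter", 2: "winter",
           3: "spring", 4: "spring", 5: "spring",
           6: "summer", 7: "summer", 8: "summer",
           9: "autumn", 10: "autumn", 11: "autumn"}

lines = defaultdict(list)   # one line per month: month -> [(year, volume), ...]
for year, month, volume in records:
    lines[month].append((year, volume))

# Each line is drawn against a common year axis; grouping the twelve
# monthly lines by SEASONS[month] gives the small-multiples version.
print(lines[9])  # [(1980, 17.2), (1995, 13.5), (2012, 3.6)] -- a plain decline
```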
The seasonal arrangement is not arbitrary. You can see the effect of season by looking at side by side boxplots:
The pattern is UP-DOWN-DOWN-UP.
In fact, a side-by-side boxplot of the data provides a very informative look:
The monthly series is obscured in this view, absorbed into the vertical variability, which we can see is quite stable. The idea of controlling for month is to make it irrelevant. This view emphasizes the year-on-year decline of the entire distribution.
If you're worried about dropping too much information, the data can be grouped by season, as before, in a small-multiples setup like this:
Regardless of season, the trend is down.
PS. Alberto reminds me of his post about one example of a spider chart (radar chart) that works. Here's the link. It works because the graphical element is more in tune with the data. While the ice cap data has a linear trend over time, the voting data is all about differences in distribution. Also, the designer is expecting readers to care about the high-level pattern, not about the specifics.