
Rightsizing: the graph edition

Reader Alex C. alerted me to this sensible note from Allan Reese, complaining about a piece of marketing by a software vendor (Aptech Systems), shown here:

[Image: 3dburntime]

This graph purportedly demonstrates the power of 3-dimensional plots, presumably a feature of the software GAUSSplot. As Reese pointed out, it rather unfortunately demonstrates the weaknesses of 3D plots, and does so amusingly well.

Reese: "Apart from a Daliesque charm as abstract art, I can see nothing professional or commendable in this graph."

Yes, nearly every design decision grates. Reese noted that the vendor calls the chart a "3D contour plot," and yet we see a paired column chart arranged in an L-shape. The diskettes indicate the average of the three column values, which serves only to obstruct our view of the underlying data. The legend, a palette of colors, plays the redundant role of gridlines. The diskettes take on various colors, yet the legend label shows them in orange. The two identical axis labels run in opposite directions, with a third running horizontally atop the color legend. The title announces a comparison of natural and synthetic fabrics, which explains how the eight fabrics were divided into two groups but is easily missed. Judging from the orientation of the fabric categories, we surmise that the chart designer typically reads from top to bottom, right to left.

What most disturbs Reese--and surely anyone to whom it is pointed out--is the optical illusion created by the use of three different shades of gray for the three panels. The chart literally creates two dueling images: the cube with rainbow strips crawling all over it, and the corner of two walls with rainbow columns sticking up from the floor. The second image, the intended one, is unstable because it would be hard to devise a lighting scheme that renders one wall dark while the other is lit.


[Image: Redo_3dburntime]

The following 2D dot plot has no razzle-dazzle but makes the point.

I realized late that the fabrics were grouped into synthetic and natural, and decided simply to box the synthetic ones. One could certainly make this a two-panel chart, with the synthetics on the left and the naturals on the right.

The use of only three samples is highly questionable. The chart shows that the only reasonable conclusion is that Acrylic and Nylon have higher burn times than the rest; with so few samples, it is hard to tell whether the remaining fabrics truly differ.


There may be situations where a 3D chart is preferable to a 2D chart, but this set of data is certainly not one of them. The software may in fact produce great 3D charts, but this particular chart does not show off the software as the marketers may have hoped. One of the designer's most important tasks is to examine the structure of the data and "right-size" the chart; throwing in extra dimensions is often counterproductive.




A not-so-wonderful Bumps chart

The Trifecta checkup requires us to align all three aspects to make a great chart. It is sometimes the case that a wise choice has been made regarding the type of chart, but the other elements are missing. Reader Parker S. sent in an example of such a chart.

This chart created by ESPN illustrates the evolution of the "power ranking" of the San Diego Chargers football team within each 18-week-long season and across multiple years.

[Image: Espn_ranking]

The bumps chart was invented for exactly this type of ranking-over-time data. And in fact, we are looking at a bumps chart.

But it comes with lots of distractions: the multiple colors (instead of year labels), the dots, the legends, the year selector, and no foregrounding of the current season.

Parker couldn't figure out the practical question this chart is supposed to answer (the top corner of the Trifecta).

It seems to me that the more interesting question is how different teams fare from week to week within a given season, rather than how one team fared from week to week over consecutive seasons.

In fact, one of the secrets of the Bumps chart -- the reason why it feels far less cluttered than it has the right to be -- is that no two data points will overlap, that is, for any given week, only one team occupies any particular rank. This simple rule is violated when the same team's rank across multiple seasons is plotted, and thus the chart feels very busy.
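This non-overlap rule can be stated as a tiny invariant: within any single week, the ranks form a permutation of 1 through the number of teams, so no two lines can ever pass through the same point. A minimal sketch, with entirely made-up rankings (team names and numbers are illustrative only):

```python
# Made-up power rankings for five teams over four weeks
# (illustrative only, not real ESPN data).
rankings = {
    "Chargers": [3, 5, 4, 2],
    "Raiders":  [5, 4, 5, 5],
    "Chiefs":   [1, 1, 2, 3],
    "Broncos":  [4, 3, 3, 4],
    "Colts":    [2, 2, 1, 1],
}

n_teams = len(rankings)
n_weeks = 4

# In a single-season bumps chart, each week's ranks are a permutation
# of 1..n_teams, so no two lines share a (week, rank) point.
for week in range(n_weeks):
    ranks = sorted(team_ranks[week] for team_ranks in rankings.values())
    assert ranks == list(range(1, n_teams + 1))

# Plot several seasons of the SAME team instead, and this guarantee
# vanishes: two seasons can hit the same rank in the same week.
```

Stack multiple seasons of one team and the invariant no longer holds, which is exactly why the ESPN chart feels so busy.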


It proved impossible to find a source of ESPN power rankings that has all teams for a given season. However, I found something similar at CBS Sportsline, a competitor. Here is their version of the ranking chart:


They got the practical question right but severely under-utilized the form. We can see how the Chargers season is going but have no ability to compare them to other teams.

We can start with the question of visualizing how the Chargers and their AFC West compatriots are doing relative to the rest of the league:


The AFC West is a mediocre division this season, with all four teams in the middle of the pack, none in the top quarter of the table. The Chargers started high, plunged and are recovering while the Oakland Raiders have improved over the course of the season.

The Bumps chart is more powerful when the full set of data is plotted, and when the lines are highlighted with reference to the question being answered. Are AFC teams or NFC teams doing better?


The next one highlights the teams that earned the largest change in ranking from week 1 to week 10. The background (gray lines) consists of those teams whose rankings in Week 10 were within 5 places of their initial rankings.


The practical question might be whether Week 1 rankings are a good predictor of Week 10 rankings. The following chart shows that most teams in the top quartile remain there (except San Diego which is coming back, and Dallas which could be coming back too), the bottom-quartile teams also tend to remain there, while not surprisingly, the middle teams don't tend to stay in the middle. The color scheme should be reversed if one wants to highlight the dispersion of the rankings of these middle teams by Week 10.



Making charts beautiful without adding unneeded bits

Reader Dave S. sent me to some very pretty pictures, published in Wired.

[Image: Wired_311_1]

This chart, which shows the distribution of types of 311 calls in New York City by hour of day, is tops in aesthetics. Rarely have I seen a prettier chart.


The problem: no insights.

When you look at this chart, what message are you getting? Furthermore, what message are you getting that is informative, that is, not obvious?

The fact that there are few complaints in the wee hours is obvious.

The fact that "noise" complaints dominate in the night-time hours is obvious.

The fact that complaints about "street lights" happen during the day is obvious.

There are a few not-so-obvious features: that few people call about rodents is surprising; that "chlorofluorocarbon recovery" is a relatively frequent source of complaint is surprising (what is it anyway?); that people call to complain about "property taxes" is surprising; that few moan about taxi drivers is surprising.

But in all these cases, there are no interesting intraday patterns, so there is no need to show the time-of-day dimension. The message can be made more striking by doing away with it.

The challenge to the "artistic school" of charting is whether they can make clear charts look appetizing without adding extraneous details.


What is seasonal adjustment and why is it used?

Like reader Chris P., I'm underwhelmed by this Wall Street Journal chart showing car sales by the top 5 brands during the past 5 years.


It's a stacked column chart for which I have never found a good use. While it contains a lot of data, readers can truly comprehend only the lowest series (in this case, GM) and the total. For any of the other series, we can never be sure whether the number went up or down because it depends on whether the cumulative total of the series below it went up or down and by how much. We have a lot of ink, a lot of data but almost no information.
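A toy calculation (all numbers invented) shows why the middle series is unreadable: a band can shrink even while its top edge rises, because the edge tracks the cumulative total of everything stacked below it:

```python
# Invented sales for three brands over two months, stacked in this order.
gm     = [100, 130]   # bottom series: readable directly off the axis
ford   = [80, 70]     # middle series: actually FELL...
toyota = [60, 65]

# ...but the top edge of Ford's band is the cumulative total GM + Ford,
# which ROSE, so the eye can easily read a decline as an increase.
ford_band_top = [g + f for g, f in zip(gm, ford)]
assert ford[1] < ford[0]                         # Ford down: 80 -> 70
assert ford_band_top[1] > ford_band_top[0]       # band top up: 180 -> 200
```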


Chris pointed out that several WSJ readers have complained about seasonal effects: car sales are not even throughout the year, and so a chart like the one above may encourage readers to make month-on-month comparisons, rather than year-on-year comparisons.

Their point is valid but misplaced. If you look at the GM data over time (the dark red bits), it is clear that the seasonal effect has already been removed, as the trend is rather flat within any given year. What they plotted is the so-called SAAR (seasonally adjusted annual rate). Think of SAAR as the "run rate" for annual car sales. Divide the number by 12 and you get the run rate at the monthly level. Now remove the seasonal adjustment, and you get the actual sales for that month.
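To make the arithmetic concrete, here is a sketch with invented numbers (the SAAR value and seasonal factor are hypothetical):

```python
# Toy SAAR arithmetic (all numbers invented).
saar = 12_000_000              # seasonally adjusted annual rate of car sales
monthly_run_rate = saar / 12   # run rate at the monthly level

# Seasonal factor = unadjusted / adjusted; suppose this particular month
# typically runs 15% above trend (a hypothetical value).
seasonal_factor = 1.15

# Undo the adjustment to recover the actual sales for the month.
actual_monthly_sales = monthly_run_rate * seasonal_factor
```

So a reported SAAR of 12 million would correspond to roughly 1.15 million cars actually sold in that above-trend month.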

[Image: Trifecta]

From the point of view of the Trifecta checkup, I have no problems with the business question or the data, but I don't like the graphical construct.

To illustrate the point further, I'm switching to a different data set with a similar structure (I can't find a complete data set for the car sales SAAR). As reader Matthew F. pointed out in his comment on my previous post, the housing starts series published by the Census Bureau is also computed as SAAR. I just need to substitute car brand for region of country, and cars sold for housing starts.

[Image: Redo_saar_1]

Now, what is seasonal adjustment and why do statisticians play with the data?

In the panel on the right, focus on the top row of charts, which plot the unadjusted data. I have the housing starts separated by region, and within each region, I plotted the annual trend, one line for each year. (I smoothed the lines to bring out the seasonal pattern.)

What you see is that almost every line is an inverted U. This means that no matter what year, and what region, housing starts peak during the summer and ebb during the winter.

So if you compare the June starts with the October starts, it is a given that the October number will be lower. Reporting a drop from June to October is therefore meaningless. What is meaningful is whether this year's drop is unusually large or unusually small; to assess that, we have to know the average historical drop between June and October.

Statisticians are looking for explanations for why housing starts vary from month to month. Some of the change is due to the persistent seasonal pattern. Some of the change is due to economic factors or other factors. The reason for seasonal adjustments is to get rid of the persistent seasonal pattern, or put differently, to focus attention on other factors deemed more interesting.
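As a rough sketch of the idea (official agencies use far more elaborate procedures, such as X-13ARIMA-SEATS; the data below are invented), a seasonal factor for each calendar month can be estimated as the ratio of that month's historical average to the overall average, and then divided out:

```python
# Invented monthly housing starts for two years, peaking each summer.
starts = [60, 65, 80, 95, 110, 120, 125, 115, 100, 85, 70, 62,
          55, 60, 76, 90, 105, 116, 120, 110, 96, 80, 66, 58]
n_years = len(starts) // 12

overall_mean = sum(starts) / len(starts)

# Seasonal factor for each calendar month: that month's average
# across years, relative to the overall average.
factors = []
for m in range(12):
    month_mean = sum(starts[m + 12 * y] for y in range(n_years)) / n_years
    factors.append(month_mean / overall_mean)

# Adjusted series: divide out the persistent seasonal pattern; what
# remains reflects trend and other, more interesting, factors.
adjusted = [x / factors[i % 12] for i, x in enumerate(starts)]
```

The adjusted series comes out much flatter than the raw one; the inverted U largely disappears, just as in the bottom row of charts.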

The bottom row of charts above contains the seasonally adjusted data (I have used the monthly rather than annual rates to make it directly comparable to the unadjusted numbers.)  Notice that the inverted U shape has pretty much disappeared everywhere.

Comparing the lines for 2008 for the South region (first column) is instructive. The unadjusted line shows October starts below June starts, in the familiar inverted-U shape. But was it just a seasonal pattern, or was there something else driving the numbers down? The bottom line shows clearly that, after accounting for the seasonality, housing starts were trending down the entire year in 2008, so indeed something else was going on.

[PS. Contrast this with 2003 when the unadjusted data show the usual inverted U shape but we learn that housing starts actually increased over that year relative to the average year.]


I think people have major problems with this because they think of each number (in the bottom row of charts) as an estimate of the housing starts for that month. And they would be right to object -- we cannot take the seasonally adjusted monthly figure as an estimate of the true monthly housing starts.

However, statisticians invented seasonal adjustments for a different purpose. If you are a policy maker, you would like to know if the housing market is healthy or not. There are some factors you can't control, for example, the fact that construction companies are more active in the summer than in the winter. But such factors affect the trend in a major way: every winter, housing starts decline. This means that a decline in housing starts during the winter is not necessarily an indication of a weak housing market. However, if the winter decline is steeper than in a typical year, then the housing market must have weakened. The purpose of seasonal adjustment is to remove the seasonal effect so that the policy maker can see what is happening to housing starts (beyond seasonal effects). To use the adjustment properly, we should look at comparisons, not the individual numbers.


[Image: Redo_saar_2b]

The next set of charts displays the unadjusted (top) and the adjusted (bottom) time series of housing starts from 2000 to 2009.

In the top chart, we see the inverted Us again. All this up-and-down action distracts us from seeing whether housing starts have improved or deteriorated in each region and period of time.

In the bottom chart, we can clearly see that the South had the greatest run-up prior to 2005, and then suffered a severe contraction through 2009, ending at almost half the 2000 level. By contrast, the Northeast has seen no significant trend over the last 10 years.

The bottom chart is actually a variant of the WSJ stacked column chart; the only difference is that there is no stacking. The total housing starts across all 4 regions is not immediately visible from this chart. It is a trade-off with which I'm willing to live.


As I noted in my reply to Matthew's comment, the SAAR is just 12 times the seasonally adjusted data I plotted above, which means the SAAR chart will look exactly the same, only with a different vertical scale.


PS. As pointed out by Joe M., the original version of the chart showing the unadjusted monthly rates across the 4 regions plotted the data in the wrong order. The new version fixes this problem. Also, the year labels were off by 1.



Late-to-the-gate depression brought to you by the Census

In this space was originally intended a post about seasonal adjustments to time-series data. That now has to wait. Because I am recovering from a bout of late-to-the-gate depression: you know the feeling, having arrived at the airport just in time, you half-run, half-walk to get to the gate, only to learn that the gate has just closed, and all the strain has been for naught.

I didn't miss a flight. I was knocked over by the Census Bureau, mentally exhausted. Any of you who have processed lots of data know this feeling: just before you decide to publish your results (and thankfully, before and not after), you discover that the data you analyzed contain errors so egregious as to be nonsensical.

[Image: Census_housingstartsweb]

So I present to you the data on "new privately owned housing units started", more commonly known as "housing starts". The offending spreadsheet can be downloaded from the Census Bureau here. (Screen shot on left.)

The file contains four sets of data: annual data, raw monthly numbers (not seasonally adjusted), seasonally adjusted monthly numbers, and the seasonal adjustment factors (which is just the ratio of the unadjusted to adjusted numbers).


The shocker: the "seasonally adjusted" series is 10 times as big as the "unadjusted" series. I kid you not. In October 2000, the raw data showed 140,000 housing units started; after adjustment, we magically had 1.5 million units started.

[Image: Census_housingstartscompare]

Either I'm misreading the spreadsheet, or quality control is seriously missing at the Census Bureau.

Since the seasonal adjustment factors were provided, I tried to reconcile the two sets of numbers; perhaps a factor-of-10 adjustment would be enough. This caused more headaches.

According to the footnote, the factor is defined as "the ratio of unadjusted housing units started to the seasonally adjusted housing units started". For October 2000, this factor was given as 108, which I took to mean that the adjustment took the raw data down by about 8%.

But the digits wouldn't cooperate. Multiplying or dividing by 10 cannot resolve the fact that the seasonally adjusted "549" is larger than the unadjusted "397".
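The failed reconciliation can be expressed in a few lines of arithmetic, using the numbers quoted above (as I read them off the spreadsheet):

```python
# Numbers as quoted from the Census spreadsheet (my reading of it).
unadjusted = 397   # raw monthly housing starts
adjusted = 549     # "seasonally adjusted" figure for the same month
factor = 1.08      # footnote: ratio of unadjusted to adjusted, given as 108

implied = unadjusted / adjusted   # about 0.72, not 1.08

# No power-of-ten rescaling reconciles the implied ratio with the
# published factor.
for candidate in (implied / 10, implied, implied * 10):
    assert abs(candidate - factor) > 0.05
```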


This is the unglamorous side of doing analytics and working with data. When I recover, I will write that post about seasonal adjustments.



Good and simply better

The New York Times printed an amazing set of graphs about the election results last Sunday. The full version can be found here.


The best thing about this set of charts is recognizing that the aggregate data may obscure differences between subgroups, something I address in Chapter 3 of Numbers Rule Your World. In some cases, the charts go two layers deep, for instance Women 18-29 and Men 18-29.

The organization of the various components of the small multiples effectively splits the groups won by each party. In fact, it's very easy to read off the individual chart titles to describe which subgroups leaned to which party, and by how much.

There is so much in these charts that you can spend an entire afternoon exploring and examining the details.


I wish the charts were made simpler. It's very daunting to process the entire page. In terms of subgroups, what we really care about is the size of the drop-off from the previous election; on this chart, though, the right tail of every single chart seems about the same, so one wonders whether this was an across-the-board wipe-out by the Democrats, or whether the design decisions obscured the differences between subgroups.

I'd suggest one or more of the following simplifications:

  • Remove the historical time series, and focus on the change from the last election to this one: the criss-crosses are very distracting.
  • Don't show disaggregated charts unless there is a point, and if so, include a note saying what readers should see.


I'm not sure the second split by gender adds much to the story here.

  • Remove the line for the opposite party on each side of the divide; the other line is a mirror image anyway.
  • Or better yet, try plotting the difference (margin) between the two parties instead of two lines.
  • Remove the share-of-voters numbers: I appreciate that they're using an unbolded font to indicate this data series, but perhaps less is more.
  • Put a bold border around the particular charts that readers should pay more attention to, the ones with interesting stories, e.g. the groups in which Democrats made gains irrespective of the overall trend (Liberals, those with better financial situations).

What they have is an excellent chart; I think simplifying it a bit makes it even better.



Loss aversion and faux accuracy

[Image: Econ_geoeng]

Reader Bernie M. is not a fan of this Economist chart.

The chart was prepared by Aurora Flight Sciences, an aircraft manufacturer, and commissioned by a professor who supports maintaining a fleet to pump sulphuric acid into the stratosphere, inducing artificial cooling to counteract human-induced global warming.

The chart appears to compare many different ways of shooting the acid into the skies along two dimensions: cost and altitude.

Bernie wrote:

I find the choice of axes extremely counterintuitive. Altitude one would expect on the y-axis. And mixing up the scatter chart elements with the connected line chart doesn't really help either.

The convention regarding axes is to put the outcome variable on the vertical, and the explanatory variable on the horizontal. Thus, in this case, if the cost of a particular solution is primarily determined by the "altitude" (presumably of where the acid would be released), then the designer has followed convention. It is unfortunate that "altitude" is more intuitively put on the vertical axis, but I suspect that defying convention might cause more confusion.

On the other hand, if altitude and cost are not related to each other but two different metrics to evaluate geoengineering concepts, then Bernie's point is right on - swap the axes!


The use of connected lines for two of the solutions but not the rest is a symptom of what I have called "loss aversion": the horror of leaving some of the data on the cutting-room floor.

The only mention of altitude in the article refers to Aurora's assertion that it is sufficient to use newly designed aircraft flying at 20-25 kilometers. If that is Aurora's preferred solution, there is little reason to show all the other altitude configurations that are suboptimal.

Perhaps the designer wants to make the point that the Boeing 747 solution is inferior to the Aurora solution because Aurora could design aircraft to fly at 10-15 km at a lower cost?  If so, then the chart is very misleading in not providing a comparable cost for Boeing's solution if required to fly at 20-25 km.

When comparing different entities, it is always a bad idea to treat the entities differently. Comparison is only possible on equal footing.

In fact, I think the chart would be a lot clearer if they dropped the altitude dimension on the floor. For each solution, plot the yearly cost at the optimal altitude selected by the respective engineers. Use a bar chart. With a single dimension, it is much easier to accommodate the very long data labels.

(Now, I'd defer to the geoengineers as to whether the altitude dimension is dispensable. I don't have any expertise in this science. Judging from the Aurora red line, I'm assuming that feasible solutions exist at all altitudes, which leads me to conclude that altitude isn't all that important.)


So what is the biggest problem with this chart? It is the faux accuracy.

Given the tremendous amount of uncertainty surrounding these projected costs, one would expect very big error bars around the cost estimates. Using single dots with no error bars is hard to stomach.



Detached in time and space

A reader sent in this "pie chart" (better called a "donut chart") which summarizes the results of this survey.


My dislike of donut charts has been well documented. Click here.


What I want to discuss is the use of interactivity, a feature of this chart, but one that backfires. The underlying data is a 5-level rating of "corporate sentiment" by industry, by country, and over time. That makes 4 dimensions jostling for space on a surface. Obviously, some decisions have to be made as to which dimension to highlight and which to push to the background.

This chart highlights the 5-level ratings using the donut device. All other dimensions are well hidden by the interactive feature. Pressing on the forward/backward buttons reveals the industry dimension. Pressing on the arrow on the top left corner reveals the time dimension. Pressing on the map reveals the country dimension.

The problem with this level of detachment is that readers are obstructed from viewing multiple dimensions at once. For instance, it is very hard to understand the differences in sentiment between different industries, or between different countries, or the change in sentiment over time.


[Image: Redo_asiasentiment]

The version on the right shows, for instance, the distribution of ratings by industry for Q3 2010, for all of Asia combined. This is a rough sketch, and one would want to fix quite a few things: making the sector labels horizontal, reducing the distance between the columns, labeling rating 1 as "very positive", ordering the sectors from most positive to least positive, etc.

A chart of ratings by country (aggregating all industry sectors) would follow the same format. Similarly, one can compare ratings across countries for a given sector, replicated once for each of the 11 sectors, or ratings across industries for any given country.

For comparisons across time, I'd suggest using average ratings rather than keeping track of five proportions. This reduces a lot of clutter that does not improve readers' comprehension of the trends. A line chart would be preferred.
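Collapsing the five proportions into an average rating is simple arithmetic; here is a sketch with invented proportions (rating 1 = very positive, 5 = very negative):

```python
# Invented distributions of a 5-level rating across three quarters
# (1 = very positive ... 5 = very negative).
quarters = {
    "Q1 2010": [0.10, 0.35, 0.30, 0.20, 0.05],
    "Q2 2010": [0.15, 0.40, 0.25, 0.15, 0.05],
    "Q3 2010": [0.20, 0.40, 0.25, 0.10, 0.05],
}

averages = {}
for quarter, props in quarters.items():
    assert abs(sum(props) - 1.0) < 1e-9   # proportions must sum to 1
    # Weighted average rating: lower = more positive sentiment.
    averages[quarter] = sum((i + 1) * p for i, p in enumerate(props))
```

A single line of these averages over time replaces fifteen wobbling proportions, which is exactly the clutter reduction suggested above.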


A better way to organize the chart is to start with the types of questions that the reader is likely to want to answer. Clicking on each question (say, compare ratings across industries within a country) would reveal one of the above collections of charts.


Another improvement is to add annotations. For instance, one wonders whether the airlines colluded to all give a 2 rating. It is always a great idea to direct readers' attention to the most salient parts of a chart, especially if it contains a lot of data.