A particular genre of graphics is designed to induce awe: certain bits are allowed to stick out like a sore thumb. Via reader Andre L., and an archive of US Army medical photos and illustrations:
This is a small multiples graph designed to display the somewhat seasonal pattern of deaths due to influenza over the years. Basically, we see a U shape in almost every year; however, the height of the peak and the timing of the peak show quite a lot of variation. Further, some years exhibit more of an L shape than a U shape.
But the attention grabber here is the massive peak that occurred between 1918 and 1919. It was unusual in many ways... it was the second big peak during 1918, it occurred late in the year, and it ran into the next year's peak. The designer allowed these two components to bleed into the adjacent charts.
From the perspective of scale, readability, and cleanliness, this bit sticks out like a sore thumb! But one has to admit it is effective.
A log scale is often used to deal with data containing such outliers. But while this makes neater charts, the impact of the orders-of-magnitude difference is lost on the reader, except in her imagination.
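For readers who want to see the effect for themselves, here is a minimal matplotlib sketch (with made-up numbers, not the Army data) comparing the same outlier year on a linear and a log scale:

```python
# Minimal sketch: the same two series plotted on linear and log scales.
# The numbers are invented purely to illustrate an orders-of-magnitude spike.
import matplotlib.pyplot as plt

months = list(range(1, 13))
typical_year = [90, 80, 60, 40, 30, 25, 25, 30, 40, 55, 70, 85]
outlier_year = [90, 80, 60, 40, 30, 25, 25, 30, 250, 2500, 1800, 600]

fig, axes = plt.subplots(1, 2, figsize=(8, 3), sharex=True)
for ax, scale in zip(axes, ["linear", "log"]):
    ax.plot(months, typical_year, label="typical year")
    ax.plot(months, outlier_year, label="outlier year")
    ax.set_yscale(scale)
    ax.set_title(scale + " scale")
axes[0].legend()
plt.tight_layout()
plt.show()
```

On the log scale the spike looks tame; the reader has to do the mental arithmetic to recover just how extreme it really was.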
The first thought that came to mind after browsing through all the charts was: what a great job they have done to generate interest in food data, which has no right to be entertaining. Specifically, this is a list of things I appreciated:
An obvious effort was undertaken to extract the most thought-provoking data out of the massive amount of statistics collected by various international agencies. None of the charts is overstuffed, which is a common problem.
It would be somewhat inappropriate to use our standard tools to critique these charts. Clearly, the purpose of the designer was to draw readers into statistics that they might otherwise not care for. Moreover, the Wired culture has long traded off efficiency for aesthetics, and this showed in a graph such as this one, which is basically a line chart with two lines, plus a lot of mysterious, meaningless ornaments:
A nice use of a dual-line chart, though. It works because both data series share the same scale, so only one vertical axis is necessary, which is very subtly annotated here.
The maintenance of the same motifs across several charts is well done. (See the pages on corn, beef, and catfish.)
It would be nice if Wired were brave enough to adopt the self-sufficiency principle, i.e. a graph should not need to carry a copy of the entire data set being depicted; otherwise, a data table would suffice. The graphical construct should be self-sufficient. This rule is not often followed because of "loss aversion": there is a fear that a graph without all the data is like an orphan separated from its parents. Since, as I noted, these graphs are mostly made for awe, there is really no need to print all the underlying data. For instance, these "column"-type charts can stand on their own without the data (though adding a scale would help).
I'm not sure whether sorting the categories alphabetically in the column chart is preferable to sorting by the size of the category. A side effect of sorting alphabetically is that it spreads out the long and the short chunks, which simplifies labelling and thus reading.
Not a fan of area charts (see below). Although it is labelled properly, it is easy at first glance to focus on the orange line rather than the orange area. That would be a grave mistake: the orange line actually plots the total of the two types of fish rearing, not the aquaculture component. The chart is somewhat misleading because it is difficult to assess the growth rate of aquaculture. Much better to plot the size of both markets as two lines (either indexed or not).
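If I were to sketch that alternative, it might look like the following (the numbers are invented, not the Wired figures); indexing both series to 100 in the first year makes the growth rates directly comparable:

```python
# Sketch of the two-line alternative, with each series indexed to 100 in 1990.
# The data below are placeholders, not the actual fish-market figures.
import matplotlib.pyplot as plt

years = list(range(1990, 2008))
wild = [60, 61, 62, 62, 63, 63, 64, 64, 64, 65, 65, 65, 66, 66, 66, 67, 67, 67]
aquaculture = [10, 11, 13, 14, 16, 18, 20, 22, 25, 27, 30, 33, 36, 39, 42, 45, 48, 52]

fig, ax = plt.subplots()
for label, series in [("wild catch", wild), ("aquaculture", aquaculture)]:
    indexed = [100 * v / series[0] for v in series]   # index each series to its 1990 value
    ax.plot(years, indexed, label=label)
ax.set_ylabel("index (1990 = 100)")
ax.legend()
plt.show()
```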
Reference: "The Future of Food", Wired, Oct 20 2008.
Loss aversion manifests itself in chart-making, as it does in economics. In chart-making, loss aversion can be defined as the tendency to avoid losing data at any cost. Given a rich data set, designers often make the mistake of cramming as much data into the chart as possible. This takes Tufte's concept of maximizing the data-ink ratio to the extreme, and it often leads to awkward, muddled charts.
Gelman provided a great example of this recently. See here.
Every piece of data is given equal footing, which results in nothing standing out. The reader gasps for air.
The best evidence is the set of small multiples shown at the bottom. These give the amount of phosphorus flowing into the lake annually since 1973, as measured from four locations.
The point is that the pollution has been most serious on the northern shores, especially in recent years. Thus, the Florida plan, which focuses on the southern region, is likely to have limited impact.
The choice of vertical lines is smart, as the typical time-series connected-line chart would jump up and down crazily. A simple vertical axis marks the amounts, avoiding the temptation to print all the data. The designer realizes it is the trend, rather than individual values, that is the issue.
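A rough sketch of that vertical-line idea (with placeholder numbers, not the Times data) shows how little ink it needs:

```python
# Sketch of the vertical-line construct: one thin bar per year, no connected line,
# no printed data labels. The phosphorus amounts below are placeholders.
import matplotlib.pyplot as plt

years = list(range(1973, 2008))
phosphorus = [(y % 7) * 40 + 120 for y in years]   # stand-in for the annual measurements

fig, ax = plt.subplots()
ax.vlines(years, 0, phosphorus, linewidth=2)
ax.set_ylabel("phosphorus (tons per year)")        # a simple axis marks the amounts
plt.show()
```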
Taken together, the three components tell a good story. This is a well-executed effort. The Times once again proves itself the leader in developing sophisticated graphics.
Reference: "Florida Deal for Everglades May Help Big Sugar", New York Times, Sep 13 2008.
The cosmos of university rankings got more interesting recently with the advent of the "brain map" by Wired magazine. This new league table counts the total number of winners of five prestigious international prizes (Nobel, Fields, Lasker, Turing, Gairdner) over the past 20 years (up to 2007); the researcher found that almost all winners were affiliated with American institutions. As discussed before, the map is a difficult graphical object; it acts like a controlling boss. In this brain map, the concentration of institutions in the North American land mass causes over-crowding, forcing the designer to insert guiding lines that draw our attention in myriad directions. These lines scatter the data asunder, interfering with the primary activity of comparing universities.
The chain of dots object cannot stand by itself without an implicit structure (e.g. rows of 10). This limitation was apparent in the hits and misses chart as well. Sticking fat fingers on paper to count dots is frustrating. Simple bars allow readers to compare relative strength with less effort.
In the junkart version, we ditched the map construct completely, retaining only the east-west axis. [For lack of space (and time), I omitted the US East Coast and Washington-St. Louis.] With this small multiples presentation, one can better contrast institutions.
To help the reader comprehend the row structure, I inserted thin strikes to indicate zero awards. A limitation of the ranking method is also exposed: UC-SF has a strong medical school and, not surprisingly, it has received a fair share of Nobel (medicine), Lasker and Gairdner prizes; but for other institutions, zero Lasker and Gairdner prizes could just as well reflect a less competitive medical school, or no medical school at all!
As this report from the Department of Transportation makes clear, congestion on our roadways causes travellers to add "buffer time" to their planned journeys. So, for instance, one may have to allocate 32 minutes for a trip that would take 20 minutes in uncongested traffic if one wants to guarantee on-time arrival. The extra 12 minutes either become time spent sitting on the road or time wasted by arriving too early.
Buffer time can be applied to graphs too. Some graphs require readers to spend time fishing out the information. The chart used to illustrate travel time belongs to this category. The clock analogy fails; in fact, it confuses matters, as the hour hand just sits there serving no purpose. The buffer time between staring and comprehending is too long!
Only four numbers underlie this chart: the travel time when uncongested and the buffer time needed to guarantee on-time arrival, for 1982 and 2001. The following version gets to the point without fuss. It shows that travel time increased significantly even under uncongested traffic; worse, the buffer time multiplied.
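A version of that idea can be mocked up in a few lines; the numbers below are purely illustrative placeholders, not the DOT figures:

```python
# Stacked bars: uncongested travel time plus the buffer needed for on-time arrival,
# for the two years in the report. All values here are hypothetical placeholders.
import matplotlib.pyplot as plt

years = ["1982", "2001"]
uncongested = [18, 24]   # hypothetical uncongested travel times (minutes)
buffer_time = [3, 12]    # hypothetical buffer times (minutes)

fig, ax = plt.subplots()
ax.bar(years, uncongested, label="uncongested travel time")
ax.bar(years, buffer_time, bottom=uncongested, label="buffer time")
ax.set_ylabel("minutes")
ax.legend()
plt.show()
```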
Reducing buffer time is always good, but some buffer time may be inevitable. In the traffic analogy, eliminating all buffer time would mean lots of unused capacity. In the context of graphs, more complicated charts require more time; the key is whether the reader is rewarded for the time spent figuring out the chart.
Source: "Traffic Congestion and Reliability", Department of Transportation.
Derek C. points us to this effort by a science journalist to use graphs to help "clarify the concept of climate change". The graph on the left shows that actual greenhouse gas emissions have exceeded the level predicted by the most pessimistic climate models. The 3D bar chart on the right examines which countries have increased their emissions the most since 1990.
While the bar chart contains many of Tufte's "ducks" (not sorted by percent change, 3D, color, gridlines, self-sufficiency, etc.), it's the left chart that can be made more powerful.
The casual observer does not need to know which model led to which trajectory of predictions; the graph is vastly simplified, and the message much clearer, in the junkart version. (I only included the CDIAC data because I could not locate the EIA numbers.)
The general point here is recognizing what is foreground and what is background. Aside from gridlines, data labels, axis labels and so on, some of the data usually constitute background material, often (as in this case) used to establish comparability.
One message I got out of this chart is that these climate models have done a good job! (Now, I have no idea if part of the curve included the training period. It is curious that the predictions were very narrowly contained in the early 1990s.)
The previous two posts indicated that CNN, TWC and Intellicast had the best on-line weather forecasting accuracy, judging by the median and mean errors in predicting daily low and high temperatures over 41 days. Is it possible to differentiate among those three?
For that, we need more data, so I switched from summary statistics back to the raw data. In this new chart, the day-by-day errors were plotted. The gridlines mark errors within 5 degrees, an arbitrary guideline for acceptable versus unacceptable. The three scatters looked remarkably similar, although CNN appeared to hit the bull's eye (the middle square) with less bias (errors more evenly distributed) but not much better accuracy overall (a similar number of unacceptable errors).
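For anyone who wants to reproduce this kind of plot, here is a minimal sketch with simulated errors (the real data came from Brandon's 41-day sample, which I am not reproducing here):

```python
# Sketch of the day-by-day error scatter: error on the forecast low against error
# on the forecast high, with the +/- 5 degree "acceptable" box marked by gridlines.
# The errors are simulated stand-ins, not the actual forecast data.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
low_errors = rng.normal(-1, 3, size=41)    # simulated errors (degrees F)
high_errors = rng.normal(-1, 3, size=41)

fig, ax = plt.subplots()
ax.scatter(low_errors, high_errors)
for v in (-5, 5):                          # the arbitrary acceptability guideline
    ax.axvline(v, color="grey", linewidth=0.5)
    ax.axhline(v, color="grey", linewidth=0.5)
ax.set_xlabel("error in forecast low")
ax.set_ylabel("error in forecast high")
plt.show()
```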
In the comments of the last post on on-line weather forecasts, Hadley raised the evergreen statistical question of mean vs median. In this context, median error is unaffected by particular days in which the forecaster makes extreme errors while mean error takes into account the magnitude of every forecasting error in the sample.
Which one to use depends on the situation. Brandon, who did the original analysis, was motivated by planning a trip to an unfamiliar location. In this case, he might be better served by a lower mean error, which would imply few extremely bad forecasts.
On the other hand, if I am interested in my local weather, then I'd likely be less concerned about a few extremely bad forecasts, and more concerned that the forecast is on the money on most days. Then perhaps the median error would come into play.
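A toy example makes the distinction concrete (the numbers are made up): one disastrous miss moves the mean error a lot but barely touches the median.

```python
# Toy illustration of mean vs. median error: add one extreme miss to an otherwise
# accurate forecaster and watch the mean move while the median stays put.
import statistics

errors_typical = [-2, -1, 0, 0, 1, 1, 2]          # usually close to the mark
errors_with_blowup = [-2, -1, 0, 0, 1, 1, 15]     # the same, plus one terrible day

for errors in (errors_typical, errors_with_blowup):
    print(statistics.mean(errors), statistics.median(errors))
# mean jumps from about 0.14 to 2.0; the median stays at 0
```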
It turns out it doesn't much matter for our weather forecast data. In this new chart, I superimposed the mean error data (in black). The scatter of points was exactly as it was for median error (in red). (MSN had a particularly bad forecast for a low temperature one day, which pulled its location to the left.)
This shows further that the difference between CNN, Intellicast and The Weather Channel is negligible.
Earlier in the month, Prof. Gelman linked to Brandon's fascinating analysis of on-line weather forecasting accuracy. I have done some additional analysis of the data and the result can be visualized as follows.
I'll concentrate my comments on three observations:
CNN was the clear winner in forecasting accuracy during this period based on two criteria: its median error in forecasting daily lows, and its median error in forecasting daily highs. Moreover, both the median errors were zero, which gives us confidence about its accuracy. The Weather Channel (TWC) and Intellicast (INT) were not far behind.
The ability to forecast highs was better across the board than the ability to forecast lows (except for the BBC). I am not sure why this should be so.
Overall, our weather forecasters were much too risk-averse. Notice that the errors were heavily biased toward the lower left quadrant. A negative error on low temperatures means the predicted low is higher than the actual low; a negative error on high temperatures means the predicted high is lower than the actual high. Taking these together, we observe that the range of actual temperatures has generally been larger than the range of predicted temperatures! No one was willing to go out on a limb, so to speak, to forecast extremes.
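To make the arithmetic concrete, here is a tiny worked example using hypothetical temperatures and the sign conventions above:

```python
# Hypothetical day illustrating the "risk-averse" pattern: the forecaster hedges
# toward the middle, so the predicted range sits inside the actual range.
actual_low, actual_high = 40, 70
forecast_low, forecast_high = 44, 66

low_error = actual_low - forecast_low       # 40 - 44 = -4 (negative: predicted low too high)
high_error = forecast_high - actual_high    # 66 - 70 = -4 (negative: predicted high too low)
print(low_error, high_error)                # both negative: lower-left quadrant

print(forecast_high - forecast_low, actual_high - actual_low)   # 22 vs 30: narrower forecast range
```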
Actually, I believe this inability or unwillingness to forecast extreme values is endemic to all forecasting methodologies.
Before closing, I mention that the graph was based on a subset of Brandon's data. I only considered same-day forecasts, did not consider Unisys (because they didn't provide forecasts for lows), and also noted that there might be bias since there were breaks in the time series. Also, I retained the sign information and didn't take absolute values as Brandon did.