
Playful and exploratory

I share reader Bernard L.'s enthusiasm for this very imaginative chart, courtesy of the graphics people at NYT.  The chart captures the ebb and flow of weekly movie receipts over the last two decades.
The details that particularly interest me include:

  • The addition of area colors (on top of lines) serves to highlight box office successes; this really helps readers sort out the massive amount of data
  • Nicely spaced text (and dots) does not interfere with our reading of the chart
  • The hiding of text for less important films, plus taking advantage of interactivity to show their titles if the reader mouses over the respective areas

All of the above indicate a keen sense of foreground versus background.  Besides, the authors had the good sense to speak of inflation-adjusted box office sales; I'm tired of the movie industry proclaiming higher sales each year when ticket prices are rising, and the population is growing.
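To make the inflation point concrete, here is a minimal sketch of restating an old gross in today's prices. The figures and the use of average ticket price as the deflator are assumptions for illustration, not numbers from the NYT chart.

```python
def adjust_for_inflation(nominal, price_then, price_now):
    """Restate a nominal amount in current prices using a price index."""
    return nominal * price_now / price_then


# Hypothetical values, chosen only to show the arithmetic.
gross_1990 = 100.0   # nominal gross in millions of dollars
ticket_1990 = 4.0    # assumed average ticket price in 1990
ticket_2008 = 7.0    # assumed average ticket price in 2008

real_gross = adjust_for_inflation(gross_1990, ticket_1990, ticket_2008)
print(real_gross)  # 175.0
```

A film that grossed the same nominal amount in 2008 would therefore have sold far fewer tickets, which is why unadjusted "record" years say little.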

This is another chart where more data do not easily translate into better communication (see my guest post at Flowing Data).  While I like the playful nature of the interactive chart, it is left to the reader to discover the information buried in the data, such as the assertion in the header that Oscar-winning films typically take time to attain box-office success while many blockbusters do not Oscars make.

In this presentation, it is challenging to compare the total receipts of one film against another (doing so requires comparing oddly shaped, partially obscured areas).  It is also hard to compare across years since the data are spread out over a lot of space.

There may really be two types of graphics: the one like the example here which is a dictionary and designed for exploration; and the other kind where the designer has selected a subset of the data to make a specific point.

Reference: "The ebb and flow of movies", New York Times, Feb 23 2008.

Color scale

This map from the Economist illustrates pretty well the movement of population from middle America outwards between 2000 and 2006.  The message reaches us despite the large volume of data painted.  (The gray shadow, though, was more than a little distracting.)
The map piqued my curiosity in two areas:

How did they determine the color scale?  The average change over all counties (6.4%) was obviously used.  Standard deviation was not since the ranges of change were unequal in size.

Was within-county percent change the best criterion?  As is, an 80% drop in a 2,000-person county looks the same as an 80% drop in a 200,000-strong county.
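The arithmetic behind that concern can be sketched as follows. The two counties are hypothetical, built only from the population sizes mentioned above; they are not drawn from the Economist's data.

```python
def percent_change(before, after):
    """Percent change relative to the starting value."""
    return 100 * (after - before) / before


# Two hypothetical counties, each losing 80% of its population.
counties = {"small": (2_000, 400), "large": (200_000, 40_000)}

summary = {}
for name, (before, after) in counties.items():
    # Store (percent change, absolute change) for each county.
    summary[name] = (percent_change(before, after), after - before)
    print(name, summary[name])
```

Both counties would receive the same color on a percent-change map, even though one lost 1,600 people and the other 160,000.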

Reference: "The Great Plains drain", Economist, Jan 17 2008.

PS. I am traveling and so posting will be limited.

Ordering and grouping

The Times reported that January retail sales generally disappointed, and consumers showed a preference for discount retailers over department stores.



Taking the bar chart on the right, re-ordering by change in same-store sales, and grouping companies by type of retailer, we can present the data to match the text more closely.  The divergent performance between discount retailers and department stores is readily visible.
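The re-ordering and grouping step can be sketched in a few lines. The store names and sales changes below are placeholders, not the figures from the Times chart.

```python
from itertools import groupby

# Hypothetical (name, retailer type, same-store sales change in %) rows.
retailers = [
    ("Store A", "department", -6.0),
    ("Store B", "discount", 2.5),
    ("Store C", "department", -1.5),
    ("Store D", "discount", 0.5),
]

# Sort by retailer type first, then by change (largest first) within each type.
ordered = sorted(retailers, key=lambda r: (r[1], -r[2]))

# groupby needs its input sorted by the grouping key, which it now is.
for kind, rows in groupby(ordered, key=lambda r: r[1]):
    print(kind)
    for name, _, change in rows:
        print(f"  {name:10s} {change:+.1f}%")
```

Feeding `ordered` to a bar chart places each type of retailer in its own block, so the gap between the two groups shows up without reading the labels.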

Reference: "Weak January dashed retailers' gift-card hopes", Feb 8 2008.



Nick B., who occasionally writes about statistical graphics, found some classic chart junk from a Canadian report on the Afghan army.  Here's one example, together with the junkchart version.

Redundancy is an enemy of good graphics, and incongruous redundancy is worse.  Here, troop level is variously described as "total force size", "strength" and "army growth"; the chart on the right uses only the army concept.  The data labels ("47000 Strength"), the axis labels ("50000 Total Force Size"), and the gridlines all germinate from the five grand data points underlying the entire chart!

Another distorting feature is the use of different-sized time intervals, which we space out appropriately on the right chart.

Ultimately, the key message should be growth in the army size, not the absolute number of troops.  The slopes of the line segments encode this information.  Alternatively, a data table can be rather powerful for simple data like this:

By what is called the "end state", there would be 70% more troops than there were as of December 2007.
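Both points, spacing the time axis properly and reading growth rather than levels, come down to simple arithmetic. The sketch below uses made-up month offsets and troop counts, not the figures from the Canadian report; the last two values were picked so that the growth works out to 70%.

```python
# Hypothetical (months since start, troop count) pairs with uneven spacing.
points = [(0, 20_000), (12, 30_000), (18, 40_000), (36, 68_000)]


def monthly_slopes(pts):
    """Troops added per month over each interval, using the true interval length."""
    return [(y1 - y0) / (x1 - x0) for (x0, y0), (x1, y1) in zip(pts, pts[1:])]


def percent_growth(start, end):
    """Percent increase from start to end."""
    return 100 * (end - start) / start


slopes = monthly_slopes(points)
growth = percent_growth(points[2][1], points[-1][1])

print([round(s) for s in slopes])
print(growth)  # 70.0
```

Plotting the points at equal spacing would make the last interval look like the steepest climb; dividing by the actual number of months shows that the second interval grew faster per month.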