


The box plot is unsuited to the audience (New York Times readers), and doesn't show time trends. Hard to see this as an improved chart.

Jon Peltier

"One weakness of the above chart is the suppression of temporal information."

This makes the chart nearly useless. In most cases, the time sequence goes from the end of one whisker, through the boxes in order, and to the opposite end of the other whisker. Not at all what Tukey must have thought about quartiles and outliers.


Maybe I'm just a bit slower than most but I'm having a tough time reading those charts. Different colored boxes, dashed lines, thick solid lines ... a lot of visual cues that aren't intuitive. How am I supposed to read those charts? Is there a key I'm missing?

Jorge Camoes

Well, since Jon Peltier is commenting here, perhaps you could check out the last entry in his site: panel charts. Panel charts are just right for this type of data. I uploaded a chart with the auto market share data here. Take a look!


Jorge, the panel chart works nicely. Thanks for sharing.


Jorge's Panel Chart version has my vote for the best approach to tackle this dataset. (Though both it and most of the alternatives proposed would likely be too intimidating to most of the NYT's readers).

A couple of nits to pick with Jorge's attempt.

* First, the chart really needs a clearly labeled vertical axis. All we have is the 20% mark and that doesn't tell us what the other values are.
* Second, the horizontal axis seems to have more ticks than there are data points... The axis label (from 90-95) implies 16 points, the data set actually contains 17, and I count 18 ticks... puzzling.
* Finally, my personal taste would be to not have a separate label on the top of the chart, but rather just label one of the data sets directly. Say the G.M.

For a similar take on this challenge, see http://www.processtrends.com/images/chart_small_multiple_hor_01.gif


This has been a great discussion, and I agree that the panel chart is an attractive option to display the data.

However, it does not directly address the question posed by the article: is the U.S. market becoming "like the European market"?

The reason is that data from older periods are a distraction. To answer that question, we must compare recent U.S. market shares with recent European market shares, and also show that the U.S. market shares have shifted recently.

There is a general lesson here, which is that sometimes, it is okay to suppress the time dimension. Time is not any different from other variables; if we are willing to collapse other variables, we should be willing to collapse time as well.

Jorge Camoes

Kaiser, I am not sure we can suppress the time dimension if we include the word "becoming" in our question. We can, however, create an indicator that shows us whether the markets are becoming more similar. For example, some years ago, the three largest players in the US accounted for almost 75% of the total market. Now, they have less than 50%. Meanwhile, the European market was stable around 40%. You can add a panel to show this trend and, based on this indicator, conclude that the US market is becoming like the European market. The "total market share of the largest players in each market" becomes your indicator of similarity (this is a simple measure that can be understood even by the NYT readers...).
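
The indicator described above (the combined share of the largest players) can be sketched in a few lines of Python. This is only an illustration: the function name `top_n_share` and the share figures are hypothetical and are not the data behind the NYT chart.

```python
def top_n_share(shares, n=3):
    """Return the combined share (in percent) of the n largest players."""
    return sum(sorted(shares.values(), reverse=True)[:n])


# Hypothetical market shares in percent, for illustration only.
us_then = {"A": 35.0, "B": 24.0, "C": 15.0, "D": 8.0, "E": 6.0}
us_now = {"A": 20.0, "B": 16.0, "C": 13.0, "D": 12.0, "E": 10.0}

then_top3 = top_n_share(us_then)
now_top3 = top_n_share(us_now)
print(then_top3, now_top3)
```

Computing this value for each year and each market would give the single trend line per market that the extra panel needs.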

Jorge Camoes

Another solution for a more sophisticated audience: display the cumulative shares in each market (using a Pareto chart or something like that) and use animation to show the trend. I don't have the time to do it myself, but I am sure the visual effect can be very impressive.

(Take a look at the talk by Hans Rosling at TED (http://www.ted.com/tedtalks/tedtalksplayer.cfm?key=hans_rosling) to see how effective animation can be at showing change over the years.)

Allison Sayer

I checked out this website after seeing a mention of it in a major science journal. I am a graduate student and I have an entry level position preparing samples and data for a government scientific agency. I display data all the time, but have never had specific instruction about how to construct figures. I really appreciate the commentary on this website because it will help me to make better figures and to work on them more efficiently. Thanks!


Jorge, you can keep the word "becoming" as long as you have two time periods visible, but the two periods need not be treated as a continuum any more. They can be treated as a comparison pair.

In my work, I once turned a confusing mess of ten years of tick marks into a pair of distributions, by turning all the tick marks for the five most recent years into identical blue ticks, and the five least recent years into identical grey ticks. The forest of ticks collapsed into two overlapping distributions, to which the observer's eye could easily answer the question "are the blue ticks generally distributed higher or lower than the grey ticks?"
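
A rough Python sketch of that grouping step, assuming the data arrive as a mapping from year to a list of values. The helper name `split_by_recency` and the numbers are hypothetical, and comparing medians is just one possible stand-in for what the eye does with the blue and grey ticks.

```python
from statistics import median


def split_by_recency(values_by_year, n_recent=5):
    """Pool values into (recent, older) groups, dropping the year labels."""
    years = sorted(values_by_year)
    recent_years = set(years[-n_recent:])
    recent, older = [], []
    for year in years:
        target = recent if year in recent_years else older
        target.extend(values_by_year[year])
    return recent, older


# Hypothetical ten years of measurements, two values per year.
ticks = {1997 + i: [10 + i, 12 + i] for i in range(10)}
recent, older = split_by_recency(ticks)

# "Blue" (recent) versus "grey" (older): which sits higher overall?
print(median(recent), median(older))
```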

I could have destroyed more time information by turning the clusters of blue and grey ticks into a pair of box-and-whisker shapes, one blue and one grey, but that turned out not to be necessary: the trends were clear.


PS as it happens, I would not have been destroying any more information by making boxes and whiskers: literally every year would have been present as a minimum, a maximum, a median, or a quartile :-)
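
That claim can be checked with a short Python sketch. It assumes the quartiles are computed by linear interpolation over the data range (the "inclusive" method in Python's statistics module); other quartile conventions can land between data points. The yearly values are hypothetical.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return [min, Q1, median, Q3, max] using inclusive quartiles."""
    data = sorted(values)
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return [data[0], q1, q2, q3, data[-1]]


# Hypothetical values for five years, one value per year.
yearly = [31.0, 28.5, 30.2, 27.1, 29.4]
summary = five_number_summary(yearly)

# With exactly five points, the summary is just the sorted data.
print(summary == sorted(yearly))
```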


Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.