
One guaranteed to make Stephen Few cry

Vox published this chart:


This sort of chart is, unfortunately, quite common in business circles. About the only thing one can readily read off this chart is the overall growth in the plug-in vehicle market (the heights of the columns).

To fix this chart, start subtracting. First, we can condense the monthly data to quarterly:


This version is a bit less busy but there are still too many colors, and too many things to look at.
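For readers who want to try the monthly-to-quarterly step themselves, here is a minimal sketch in plain Python. The sales figures are made up for illustration (they are not the Vox data), and `to_quarterly` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical monthly unit sales keyed by (year, month); not the Vox data.
monthly_sales = {
    (2013, 1): 100, (2013, 2): 120, (2013, 3): 150,
    (2013, 4): 160, (2013, 5): 170, (2013, 6): 200,
}

def to_quarterly(monthly):
    """Sum monthly values into (year, quarter) buckets."""
    quarterly = defaultdict(int)
    for (year, month), units in monthly.items():
        quarter = (month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        quarterly[(year, quarter)] += units
    return dict(quarterly)

quarterly_sales = to_quarterly(monthly_sales)
print(quarterly_sales)
```

Six columns become two, which is the whole point: a third as many things to look at, with the same overall trend.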

Next, we can roll up the individual vehicle models and focus on the manufacturers:


This version is even less busy and more readable. We can now see that Chevrolet, Nissan, Toyota, Ford and Tesla are the five biggest manufacturers in this category. All the small brands have been aggregated into the "Others" category. The stacked column chart still makes it hard to know what's going on with each individual brand's share, other than the one brand situated at the bottom of the stack.
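The "keep the biggest brands, lump the rest into Others" step can be sketched as follows. This is only an illustration: the totals are invented, the "Brand X/Y/Z" names are placeholders, and `collapse_small_brands` is a hypothetical helper, not anything from the original chart's workflow.

```python
def collapse_small_brands(totals, keep=5, other_label="Others"):
    """Keep the `keep` largest brands and sum the rest under one label."""
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    collapsed = dict(ranked[:keep])
    remainder = sum(units for _, units in ranked[keep:])
    if remainder:
        collapsed[other_label] = remainder
    return collapsed

# Invented totals for illustration only; not the real sales figures.
brand_totals = {
    "Chevrolet": 500, "Nissan": 450, "Toyota": 300, "Ford": 250,
    "Tesla": 200, "Brand X": 40, "Brand Y": 25, "Brand Z": 10,
}

collapsed = collapse_small_brands(brand_totals, keep=5)
print(collapsed)
```

The choice of five is a judgment call. The aim is to keep the number of segments small enough that each one can still be read.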

Next, we switch to a line chart:


This shows the growth in the overall market, as well as several interesting developments:

  • The growth in the number of competitors in the market, especially since 2012
  • The fragmentation of the market. Before mid-2012, Chevrolet dominated the market. Since then, five or six brands have been splitting it
  • The first-to-market brands have not been able to sustain their advantage

A smoothed version of the line chart is even more readable:
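One simple way to smooth a series is a centered moving average. The sketch below assumes that method purely for illustration (the post does not say which smoother was used), and the data points are made up.

```python
def moving_average(values, window=3):
    """Centered moving average; near the edges, average whatever neighbors exist."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Made-up series with one spike in the middle.
sales = [10, 20, 60, 20, 10]
smoothed = moving_average(sales, window=3)
print(smoothed)
```

The spike is flattened, which makes the trend easier to see but also means the plotted values no longer match the reported ones.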


Graphics is a discipline that often rewards subtracting. Less is more.


In the above discussion, I focused on the Visual aspect of the Trifecta Checkup. The dataset itself, though, is really difficult to interpret, and I would not want to visualize it directly.

The real question is which manufacturer is leading the pack in plug-in vehicles.

There are a number of obstacles in our path. Different models are launched at different times, and it takes many months for a new model to establish itself in the market. Thus, comparing a model that has just launched with another that has been on the market for twelve months is a problem.
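One possible way around the launch-timing problem is to re-index each series by months since launch instead of by calendar month. The sketch below assumes that "launch" means the first month with nonzero sales. The two vehicles and their numbers are hypothetical.

```python
def months_since_launch(series):
    """Drop the leading zero-sales months so index 0 is the launch month."""
    first = next((i for i, v in enumerate(series) if v > 0), len(series))
    return series[first:]

# Hypothetical calendar-month sales; "Vehicle B" launches three months later.
calendar_sales = {
    "Vehicle A": [50, 80, 120, 150, 160, 170],
    "Vehicle B": [0, 0, 0, 40, 90, 130],
}

aligned = {name: months_since_launch(s) for name, s in calendar_sales.items()}

# Compare only over the number of months both vehicles have been on sale.
common = min(len(s) for s in aligned.values())
comparison = {name: s[:common] for name, s in aligned.items()}
print(comparison)
```

On the calendar axis, Vehicle B looks far behind. Measured from launch, the two are roughly level over their first three months.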

Also, the models are of different vehicle types: compacts, SUVs, sedans, etc. More expensive vehicles will have fewer sales whether they are plug-ins or not.

Thirdly, population grows over time. The analyst would need to establish that sales growth exceeds population growth.
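The population adjustment amounts to comparing growth in sales per capita rather than growth in raw units. A rough sketch, with round placeholder numbers that are not real sales or census figures:

```python
def growth(old, new):
    """Fractional growth from old to new (1.0 means +100%)."""
    return new / old - 1

# Placeholder figures for two consecutive years; not real data.
units_y1, pop_y1 = 10_000, 300_000_000
units_y2, pop_y2 = 20_000, 303_000_000

raw_growth = growth(units_y1, units_y2)
per_capita_growth = growth(units_y1 / pop_y1, units_y2 / pop_y2)

print(f"raw growth:        {raw_growth:.1%}")
print(f"per-capita growth: {per_capita_growth:.1%}")
```

With a market this young the adjustment is small next to the sales growth, but it matters more as the market matures and growth rates come down.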









Removing the "All" graph would make it possible to write the labels next to the other graphs (and remove the legend).


Sam: ...assuming it's not important for readers to know the total size of the market. If I were paid to produce this chart, I'd open it in Illustrator, remove the legend, and place the text labels next to the lines.


What are the most important things to show here?
1. How fast is the total market growing? (the "All" line)
2. Which manufacturers are more/less successful in participating?

Kaiser's last graph shows these, but the manufacturer results are squeezed down.

I know it's unpopular to suggest graphing one of these on the right axis (e.g. "All") and the rest on the left axis, but if I could only do one graph that's what I'd do here.


@zbicyclist but in what circumstance could you only make one graph? (the original chart takes up a fair amount of space which could be easily used to include two charts instead)

Plotting these values to separate axes on the same chart is not just unpopular, it's unpopular because it's a very bad idea.

I would much prefer to see two charts, one above the other: one for the overall, one for the individual categories.


jlbriggs: try making two charts and you'll realize there is a design choice. On one chart, it's easier to see the market share of each brand against the overall. So it's a question of what you want to emphasize. For this data set, I don't see much of interest in the individual lines, do you?


I can see an argument for showing the comparison between companies, sure.

But I see your point.

Either way, what I certainly would *not* do or condone is plotting the individual lines to a secondary axis that is scaled independently on the same chart as the overall.


The dilemma of showing the ALL line with the individual lines is created because you moved from the stacked bar to the line chart. I don't see anything wrong with stacked bars if you want to emphasize the "parts of a whole" concept, although I agree that the number of segments should be kept to a minimum, as shown in the 3rd chart. So it depends on what you want to emphasize.

Also, stacked bars work well in interactive software, where you can filter on a segment to see the trend for that segment only, or drill down to the next level of detail, such as from manufacturers to models in this case. Many people also want to see the values, in which case you wouldn't want to smooth the data.
