Dataviz Seminar and other upcoming events
Let's not mix these polarized voters as the medians run away from one another

Excel is the graveyard of charts, no!

It's true that Excel is responsible for large numbers of horrible charts. I just came across a typical example recently:

Ewolff_meanmedianincome

This figure comes from Edward Wolff's 2012 paper, "The Asset Price Meltdown and the Wealth of the Middle Class." It's got all the hallmarks of Excel defaults. It's not a pleasing object to look at.

However, it's also true that Excel can be used to make nice charts. Here is a remake:

Redo_meanmedianincome2

This chart is made almost entirely in Excel - the only edit I made outside Excel is to decompose the legend box.

It takes five minutes to make the first chart; it takes probably 30 minutes to make the second chart. That is the difference between good and bad graphics. Excel users: let that be your inspiration!

Comments

Feed You can follow this conversation by subscribing to the comment feed for this post.

Steve Mackenzie

I'm curious why you didn't do this as a simple line chart in Excel? Perhaps I'm missing something?

Ian Watkins

Steve, maybe because there is data missing for most of the years in the timeframe. There is no data included for the 1970's for example.

Ian Watkins

To add to this, a line would give a false idea of the completeness of the data.

jlbriggs

I think it's very useful to periodically remind people that Excel doesn't have to make bad charts.

I'm curious why you didn't also handle the legend issue within Excel as well?

I use Excel to create charts for work all the time, and while there are clearly a lot of shortcomings, it's a very capable tool for creating well designed, attractive, clean charts.

To accomplish the marking each series with it's label, rather than using the legend, I will generally add a data label to the last point, and populate it with the series name. Plenty of other options as well.

OTOH, I also have to wonder why you pulled in the heavy grey background and grid lines from ggplot2, which in my mind have always been as bad a default as many of Excel's choices.

:)

Kaiser

Hi all, here's how the thinking went. First, I wanted to do a line chart. Then, I realized that the years are not regularly spaced. So, I tried a dot plot. The default background is white - I often find white backgrounds too "glaring" and prefer a tint of gray. This has nothing to do with ggplot. I thought about removing the gridlines but decided against it. If I did a line chart, I'd have removed the gridlines. For a dot plot, the gridlines help judge the level shifts. Alternatively, I can fit lines to the 1990-2010 period but that is not a simple facelift in Excel.

Regular Reader

Just asking to understand the intention: Is it a design decision in the remake to not start the y axis at zero? If so does that overemphasis the gap between mean and median - or is that gap the main message that's being conveyed?

Also curious: I realise the intent was to do a super-quick Excel remake but if you had time to take it further: If the message is in fact the gap between the mean and median is there a better way you would recommend to illustrate this - i.e. either with another tool or by pushing Excel further?

Ken

A bit like being able to write nice reports in Word. Yes, you can do it, but you wonder why anyone does it after they have seen what can be done in LaTex much more easily.

Igor

I do realize that the times are a generalization/exaggeration but the second chart can be done in 5min in Excel whilst the first one takes around 30 seconds.

Kaiser

Igor: The times are a little exaggerated - but it always takes shorter if you know exactly what you want to do, but figuring it out is part of the fun!

Jessica

why did you start the y axis at 30?

is there a different way to call out or point to the missing data? i'm worried the reader won't spot this.

The comments to this entry are closed.