## Using data tables

##### May 08, 2006

Charts are supposed to elucidate data. We love charts here, but sometimes the love is misplaced. I noticed the following Economist chart by way of the Truck and Barter blog.

It's a very simple chart, with only 6 pieces of data. And yet, presenting the data in a table would have been clearer. One measure of the effectiveness of a chart is the amount of time the reader needs to locate the data. In the table, everything the reader needs requires two steps: looking up the right row and the right column. However, on the bar chart, the reader must first look up the right chart, then the right bar, and then estimate the length of the bar by referencing the axis; if the reader wants the totals, s/he must estimate three lengths and mentally add them up.

Reference: "Into the Fold", Economist, May 4 2006.

If you are interested in the actual values, the table is superior.

If you are interested in the patterns, or the differences between values, the barchart is superior.

I agree with Hadley. Charts and tables are useful for different things - it depends on what needs to be shown. Having both the chart and table covers different needs.

The real problem with the chart is that it repeats the categories. This is because it vertically stacks the two sets of data (which don't share units). I think it would be improved if they were horizontally adjacent. As in the table, this would compact the data and make comparison between income and population trends possible. But unlike the table, it would make the trends easier to "see" (as per Hadley).

Note that I'm not trying to make a general point about tables versus charts. In most situations, charts work better, and point out trends much better.

In this situation, I do not believe the chart adds anything at all. There are no "trends" to be seen with only three marginally related categories.

Another problem with this chart is the use of two variables measured on different scales. By depicting them in same-sized areas, the reader is misled about relative values.

The advantage of charts becomes really clear when there are confidence intervals involved. Comparing several point estimates with associated confidence intervals numerically is hard work, whereas a graphical display makes it trivially easy (or perhaps I should say our visual processing system makes it trivially easy). But in this case, I would categorically favour the table, regardless of the purpose.

I agree with Kaiser here. While what Hadley says is true -- if you are interested in the patterns, or the differences between values, a barchart is superior -- here there are no patterns. Those are categorical variables.

There is another advantage to using tables over figures: tables provide the actual data. Other researchers can look at the same data, analyze it in different ways, test the original paper's conclusions and perhaps draw different ones. This is how science should work.

There are no patterns in this data? Excuse me while I choke on my coffee!

From the bar chart I can immediately see which category has the smallest numbers, and which the largest. I cannot do this from the table without comparing numbers individually.

Of course, it's great to provide the raw data in a table as well, but a table does not communicate the message in the data.

Since the Economist is a weekly for the general (okay, okay, elite educated) populace, they still have a 'pretty picture quota' to fill spatially, which might be in direct conflict with good display of quantitative data.

Well, if you train your eye on the first digit of the numbers, you can pick out the max and min just as easily :)

But every data set has a max and a min, and I wouldn't call that a "pattern" or a "trend".

Your argument works for larger and more complex data sets, but unfortunately not for this one.

For this particular data set, the bar chart implies that there is a relationship between the three categories, possibly that because one is bigger, another is smaller, or that there is a gradient relationship. Bar graphs are great for changes over time, or relationships between related things, but in this case, unregistered immigrants has no actual relationship with underbanked. The table is cleaner. I think the consideration of ease of seeing the bigger/smaller numbers may be answered by the size of the table on the computer screen - in "real life" it would probably be bigger.

The bar graph to that extent gives a false impression. "Presumed similarities are not similar."

Well, running my eye down the first column of digits reveals that unregistered has the largest population ;)

There are clear trends in the data; the trends are hidden in the table, and not clearly displayed in the two charts (it's *not* a single chart). The table is already as good as it's going to get. The chart could be improved by clustering the two series together, and using two axis scales. Data labels can be added to each bar and totals incorporated into axis titles to provide exact values for any reader who desires.

I've presented sample charts here:
http://peltiertech.com/Excel/Commentary/ClusterBarTwoAxes.html