
Drugged-up American graphic

Reader Chris P. found this chart on, which is one of those sites that invite anyone to contribute graphics to it:


It looks like the designer has taken Tufte's advice of maximizing the data-ink ratio too literally. There are many, many things going on in a tight space, which leaves the reader feeling drugged-up and cloudy.

From a cosmetic standpoint, fixing the following would help a lot:

  • Make fonts 1-2 points larger in all cases, especially the text on the left-hand side
  • Use colors judiciously to stress the key data. In this version, the trends, which are more interesting, are shown in pale gray while the raw data, which are not very exciting, are shown in loud red. Just flip the gray with the red. 
  • Rethink the American flag motif: is drug abuse a uniquely American phenomenon? Should data about the American people always be accompanied by the American flag?
  • Separately present in two charts the time-series data on total arrests, and the cross-sectional data (2008)

Also, realize that by forcing the data into the 50-star configuration, one arbitrarily decides that the data should be rounded to 2-percent buckets (see right).
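To make the rounding concrete, here is a minimal Python sketch. It assumes each of the 50 stars stands for an equal share of 100 percent and that the designer rounds half up; the function names and the input values are hypothetical and are not taken from the chart.

```python
import math

STARS = 50                      # stars on the flag
POINTS_PER_STAR = 100 / STARS   # each star stands for 2 percentage points

def stars_for(percent):
    """Number of stars to fill for a percentage (assumes round-half-up)."""
    return math.floor(percent / POINTS_PER_STAR + 0.5)

def displayed_percent(percent):
    """The percentage a reader would recover by counting filled stars."""
    return stars_for(percent) * POINTS_PER_STAR

# Hypothetical inputs, chosen only to show the bucketing.
for p in (12.9, 13.0, 14.9):
    print(p, stars_for(p), displayed_percent(p))
```

Under these assumptions, values almost two points apart can land on the same star count, and the displayed figure can be off by up to one percentage point in either direction.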

And always ask the fundamental question: what makes this data tick?


As I explored the data, I noticed various arithmetic problems. For example, the arrests-by-race analysis is itself split into two parts: White, Black, Indian, and Asian add up to 100 percent, and then Hispanic/Latino and non-Hispanic/Latino add up to 100 percent. In some surveys, Hispanics are counted within whites but that doesn't seem to be the case here. The numbers just don't add up.

Also, adding the types of drugs involved does not yield the total number of arrests. Perhaps the category of "others" has been omitted without comment. Now I closed my eyes and proceeded to make a chart out of this.


The new version focuses on one insight: that certain races seem to get arrested for certain drugs. For any given drug, the relative incidence of arrests is not similar across the races. Asians and Native Americans appear to have higher proportions of people arrested for marijuana or meth while blacks are much more likely to be arrested for crack.
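For readers who want to reproduce this kind of comparison, here is a rough Python sketch of one way to compute the proportions: within each race, the share of arrests attributed to each drug. The counts, group labels, and function name are made up for illustration and are not the figures behind the chart.

```python
# Hypothetical arrest counts (race -> drug -> arrests); NOT the chart's data.
counts = {
    "Group A": {"marijuana": 60, "crack": 10, "meth": 30},
    "Group B": {"marijuana": 40, "crack": 50, "meth": 10},
}

def drug_shares(by_drug):
    """Turn one group's arrest counts into proportions that sum to 1."""
    total = sum(by_drug.values())
    return {drug: n / total for drug, n in by_drug.items()}

# One profile per group: the series that each panel of the chart would plot.
profiles = {race: drug_shares(by_drug) for race, by_drug in counts.items()}

for race, shares in profiles.items():
    # Multiply by 100 so the printed numbers are percentages, not proportions.
    print(race, {drug: round(100 * s, 1) for drug, s in shares.items()})
```

Normalizing within each group is what lets groups of very different sizes be compared on the same scale.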


You're going to need to click on the chart for the large version to see the text.

Doing this chart gives me another chance to plug the Profile chart. We deliberately connect the categorical data with lines. The lines are not meant to mean anything; they are meant to guide our eyes towards the important features of the chart.

One can sometimes superimpose all the lines onto the same plot but the canvas clogs up quickly with more lines, and then a small-multiples presentation like this one is preferred.

We have a temptation to generalize arrest data to talk about drug habits by race but if you intend to do so, bear in mind that arrests need not correlate strongly with usage.




Great critique. Can I ask what the advantages are of a profile chart over a bar chart? It seems to me that viewers expect a continuous line to imply a continuous series (time, for instance). Wouldn't a bar chart fulfill the same function without the danger of misleading viewers about the x-axis data?

Rick Wicklin

I was confused by the vertical axes on your profile chart. I think the axes you show are proportions. Multiply by 100 if you want to show percentages, to agree with the vertical label.


Chris: If you click on the profile chart archive, either from the link in the post or on the right column, you will find many explanations of why profile charts are better than bar charts. The best reason is that how one reads a bar chart is to trace a line from the top of each bar to the next bar, which is to say one's eyes trace a profile chart.

custom research papers

The people must created methods the fight against drugs. But you created a stst about drugs. It's not right.
