A promising infographic about motorcycle helmets

The New York Times graphics team shows us how to do an infographics poster the right way. They recently put up a feature showing how the repeal of helmet laws is linked to increasing motorcycle fatalities. The graphic is here.

One of the key charts is this one (second to last screen):


The graphic tells the story; no additional words are needed. (Actually, you'd have to come from the prior page to know that the white vertical line represents the year in which Florida repealed its helmet law.)

Of course, one state does not prove a trend. It appears that other states face the same situation. It would be nicer if they could start this next chart at an earlier time.


I'm surprised by how much these lines fluctuate given that the raw counts are in the hundreds.

I wonder if there is any active debate in Florida or elsewhere as it would appear that the helmet law repeal may have caused hundreds of unnecessary deaths. Have people been coming up with other explanations for the sharp rise in motorcycle fatalities involving those not wearing helmets?

Nothing to see here

Some graphics are made to inform, some to amuse, some to delight. But the following scatter plot makes one wonder why why why...


What does the designer want to say?


I saw this chart inside an infographic titled "Where in the World are the Best Schools and the Happiest Kids?", via the Cool Infographics blog. The horizontal axis is happiness and the vertical axis is average test score.

So it appears that happy kids can get the best and the worst test scores, and kids with the best test scores can be both happy and sad.

That means the happiness of kids does not depend on their test scores.

An inspired picture of Blackberry's dying inspiration

The New York Times has a splendid example of an infographic this weekend, showing the rise and fall of the Blackberry.


Notice the inspired touch of the black circles to trace the outline of Blackberry's market share. They are a guide to experiencing the chart.

I wish they had put the Palm section above Blackberry. In an area chart, the only clean section is the bottom section in which the market share is not cumulated. Given the focus on Blackberry, it's a pity readers have to perform subtractions to tease out the shares.
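To make the subtraction concrete, here is a minimal sketch in Python. The boundary values are hypothetical, not read off the Times chart. Only the bottom layer of a stacked area chart can be read directly; every other layer is the difference between two consecutive cumulative edges.

```python
def shares_from_stack(boundaries):
    """Recover each layer's share from the cumulative top edges of a stacked area chart."""
    shares = []
    previous = 0
    for top in boundaries:
        # A layer's share is its top edge minus the top edge of the layer below it.
        shares.append(top - previous)
        previous = top
    return shares


# Hypothetical top edges in percent, listed from the bottom layer to the top layer.
example_boundaries = [10, 45, 100]
example_shares = shares_from_stack(example_boundaries)
print(example_shares)  # [10, 35, 55]
```

The first layer needs no arithmetic, which is why the layer you care most about belongs at the bottom.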

I also wonder if the black circles should contain Blackberry's market share rather than the year labels.

But I enjoyed this chart. Thanks for producing it.


Use this chart at your own peril

On Twitter, Joe D. disliked the following chart on the Information is Beautiful blog:



The chart carries a long list of flaws.

The column labeled "%" is probably the most jarring. The meaning of these numbers changes with the color. When pink, they give the proportion of females; when blue, the proportion of males. As the stated purpose of the chart is to explore the male-female balance at different websites, it is a bad decision to fold two dimensions into one. While you're thinking about what I just said, what do you think the percentages in gray mean? Your guess is as good as mine.


Now, I appreciate that the designer uses a margin of error (implicitly), and separated these three sites as representing "equality", even though only one of them has the exact 50/50 split.

Wait, for Orkut (second row), it's 51 percent female, and for Foursquare, it's 52 percent male. The gender is coded in the figurines. You can check that with your magnifying glass.

It gets better.

The list of websites is ordered by increasing polarity but only within the three sections. Logically, the three "equality" sites should sit between the "matriarchy" and the "patriarchy". Pinterest and Reddit, the two most polarized sites, should stand on the edges. On the diagram shown right, I simulated a reader who wants to scan through the list of websites from the most female-oriented (Pinterest) to the most male-oriented (Reddit). It's quite the obstacle course.

Let's get to Joe D.'s issue with the chart. How many people does each figurine represent? It's quite a mouthful. Each figurine represents one percent of the unique visitors at the specific website, but only in excess of fifty percent. In effect, the Facebook figurine represents a huge number of people compared to the figurine of a less popular website like Tagged. The designer did not explain the inclusion criteria for websites.
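Here is a rough sketch of that rule in Python, as I read it from the chart. The function name is mine, and the two inputs are the Orkut and Foursquare percentages quoted earlier; whether the designer rounded the same way is an assumption.

```python
def figurine_count(majority_percent):
    """Number of figurines for a site: one per percentage point above 50 (assumed rule)."""
    return max(majority_percent - 50, 0)


orkut = figurine_count(51)       # 51 percent female
foursquare = figurine_count(52)  # 52 percent male

# The figurines suggest a 2-to-1 difference, while the underlying shares differ by 52 to 51.
visual_ratio = foursquare / orkut
print(orkut, foursquare, visual_ratio)  # 1 2 2.0
```

A one-point change in the underlying share doubles the picture, which is what happens whenever a scale is cut off at 50 instead of zero.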

If you didn't get that definition, just ignore the figurines and think of this chart as a bar chart in which the bars start at 50 percent (rather than zero as it should). A standard population pyramid appears to do a better job - just add bars to the left of the diagram and properly align the male and female sections.


As I said before, read the fine print.

Here's the fine print:

If I am not mistaken, the designer applied the gender proportions to the traffic totals to obtain the rightmost column, labeled "million more monthly female or male visitors". The trouble is one number pertains to U.S. visitors while the other pertains to worldwide traffic. By multiplying them, the designer makes an assumption: that gender ratio is equivalent inside and outside the U.S., for every website.

Just to give you a sense of scale, according to this chart, Facebook has an excess of 155 million female visitors per month. According to Comscore, the key provider of such data, Facebook had about 145 million total U.S. visitors in June 2013. It's not a small deal to mix up the geographies.
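The arithmetic behind that comparison can be sketched in a few lines of Python. This assumes, as I guessed above, that the designer computed the excess as majority visitors minus minority visitors, which works out to (2 x share - 1) x total; the function name is hypothetical.

```python
def excess_visitors(majority_share, total_visitors):
    """Majority visitors minus minority visitors, for a share between 0.5 and 1."""
    return (2 * majority_share - 1) * total_visitors


US_TOTAL_MILLIONS = 145      # Comscore U.S. figure quoted above
CHART_EXCESS_MILLIONS = 155  # excess female visitors claimed by the chart

# Even if every single U.S. visitor were female, the excess could not exceed the U.S. total.
upper_bound = excess_visitors(1.0, US_TOTAL_MILLIONS)
print(upper_bound, CHART_EXCESS_MILLIONS > upper_bound)  # 145.0 True
```

The chart's excess is larger than the entire U.S. audience, so the gender split must have been multiplied against a much larger, non-U.S. total.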

This example illustrates what I call "use at your own peril". It's like the Surgeon General's warning in restaurants in the U.S.: we warn you that drinking alcohol while pregnant could lead to birth defects, but you are free to do whatever you want with this information.


As of this writing, the original chart has thousands of Facebook likes, hundreds of shares on Linkedin and Pinterest, etc.

It appears that a lot of people are enjoying the chart more than Joe and I do.


Finally, here is a sketch of how I would plot this type of data. (U.S. traffic data from Comscore, various months of 2012, where I can find them. Comscore is a fee-based service so it is not easy to find data for the smaller sites unless you have a subscription.)


Kosara wants to rescue infographics

Robert Kosara takes us back to the 1940s, and an incredible "infographics" project by the Lawrence Livermore Laboratory. (link) Here is one of the designs:


Kosara laments:

When did information graphics turn into ‘infographics,’ and when did we lose the meticulous, well-researched, information-rich graphics for the sad waste of pixels that calls itself infographic today?

I think one of the key missing pieces is analytics. Most of today's infographics seemingly are a result of treating data as flowers to be arranged. There is little analytical thinking behind what the data mean. Incidentally, that is why the new NYU certificate is not called Certificate in Data Visualization--we wanted to emphasize the importance of analytics next to datavis.

Also, we have an elective designed for people interested in content marketing. The Livermore Lab project would fall into this category. So do annual reports for corporations, fundraising prospectuses for non-profit organizations, magazines whether commercial or membership, content for web marketing, etc.

The other problem is a kind of perversion of measurement. Because so much of this stuff is online, so many pieces are judged by click rates or bounce rates or time on page. The problem with click rates is well known. Headlines of so many online articles are written solely to create clicks. It's gotten to the point that we feel duped by the headlines.

The design may have originated in print, but in all likelihood, it is also uploaded to the Web; the interaction of readers with the online version is much easier to track than the effect of print, leading to the lazy generalization that the Web response would be "similar to" the print response. This is one of my pet peeves: bad data is worse than no data.


Highway Safety Agency goes rogue

A reader sends me to Adam Obeng, who did the dirty work deconstructing a set of charts by the U.S. National Highway Traffic Safety Administration on his blog. Here's an example of these charts:


Aside from the sneaker chart, they concocted a pop stick, a pencil, a tower of Hanoi, etc. These objects are ones I think should be evaluated as art. Adam gamely tells us that the proportions are totally off, and they are both internally and externally inconsistent.


I'll add two small points to Adam's post.

First, these charts pass my self-sufficiency test, that is to say, they did not print the entire data set (just one number here) on the page. Alas, given the distortion identified by Adam, not printing the data means everyone is free to create their own data. Herein lies the problem: there is an argument for allowing a small degree of distortion in exchange for "beauty" but these charts without any data have gone too far.

Second, see Adam's last point (the footnote). The original data is something quite convoluted: “3 out of 4 kids are not as secure in the car as they should be because their car seats are not being used correctly.” (How would they know this, I wonder.) This is a statistic about kids while the picture shows a statistic about their parents (or drivers).



Interpreting some charts about guns

Felix linked to a set of charts about guns in the U.S. (and elsewhere). The original charts, by Liz Fosslien, are found here.

I like the clean style used by Fosslien. Some of the charts are thought-provoking. Many of them may raise more questions than they answer. Here are a few that caught my eye.


A simplistic interpretation would claim that banning handguns is futile, and may even have an adverse impact on murder rate. However, this chart does not reveal the direction of causality. Did some countries ban handguns because they are reacting to higher violence? If that is the case, this chart is confirming that the countries with handgun bans are a self-selected group.



The U.S. is an outlier, both in terms of firearm ownership and firearm homicides. This makes the analysis much harder because the U.S. is really in a class of its own. It's not at all clear whether there is a positive correlation in the cluster below, and even if there is, it is dubious whether we can extend a straight line up to the U.S. dot.
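A small sketch in Python shows how much a single outlier can drive a correlation. The numbers are made up for illustration and are not Fosslien's data: five clustered points with no linear relationship, plus one extreme point standing in for the U.S. dot.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical cluster: ownership (x) and homicide rate (y) with no linear trend.
cluster_x = [1, 2, 3, 4, 5]
cluster_y = [2, 1, 3, 1, 2]

# The same cluster plus one hypothetical outlier far from everything else.
r_cluster = pearson(cluster_x, cluster_y)
r_with_outlier = pearson(cluster_x + [20], cluster_y + [10])
```

In this toy data the cluster alone has a correlation of essentially zero, while adding the one extreme point pushes it above 0.9. The outlier, not the cluster, is doing all the work.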



Fosslien is being cheeky to deny us the identity of the other outlier, the country with few firearms but an even higher death rate from intentional homicide. These scatter plots, by the way, are great for showing bivariate distributions.



I'd still prefer a line chart for this type of data but this particular paired bar chart works for me as well. The contents of this chart are a shock to me.



I just don't get this one. Why is there a fan?

Budding graphics connoisseurs from Down Under

A reader, Stephen M., who's a high school Information Technology teacher in Australia, assigned the following chart to his class as a Junk Charts style assignment. (link to original here)

We have seen racetrack charts before (e.g. here or here), and we have dual racetracks here.

Stephen's class identified the following problems with the chart:

- The group agreed this should be better called a data visualisation than an infographic

- The purpose of the 'infographic' seems to be more on the design/form, than the function of conveying an understanding of the data

- There seems to be a bit of an optical illusion with the upper circle for the US appearing larger than the lower one (we checked, it isn't)

- There are no clear labels to assist. It is an assumption that because in the heading and the figures, population is on top of donations, that the lines are the same. The class agreed that country labels would help to the left of each line start.

- No scale on the lines and where do you measure from/to (especially as the US line is a single line for a proportion of the way)

- It's too abstract and the spatial separation of the curves makes comparison difficult.


Wow, that's great critique from the 16-year-olds. They are working on ways to re-make this graphic. One good idea is to collapse the two dimensions into one: per-capita donations.
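Collapsing the two dimensions is a one-line calculation. Here is a minimal sketch in Python, with made-up country names and figures since I don't have the chart's underlying data:

```python
def per_capita(donations, population):
    """Donations divided by population, in whatever units the inputs use."""
    return donations / population


# Hypothetical figures: (donations in millions of dollars, population in millions).
countries = {
    "Country A": (300, 100),
    "Country B": (50, 10),
}

donations_per_person = {
    name: per_capita(donations, population)
    for name, (donations, population) in countries.items()
}

# Sort from the most generous per person to the least.
ranking = sorted(donations_per_person, key=donations_per_person.get, reverse=True)
print(donations_per_person)  # {'Country A': 3.0, 'Country B': 5.0}
print(ranking)               # ['Country B', 'Country A']
```

In this toy example the country with the smaller total donations comes out ahead per person, which is exactly the kind of comparison the dual racetracks make hard to see.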

Another issue with this chart is that the countries are sorted in different ways from one chart to the next. It's really difficult to compare one country to another.

It is also instructive to discuss what the key message is in this data. Why those six countries? What kinds of donations are being counted? Does the counting methodology differ by country? How comparable is the data?

Finally, is this art or is this science?

P.S. [12/2/2012] Stephen noted that another deficiency identified by the students is the lack of sourcing. Indeed, where did the data come from? They think it's the CIA Factbook.