Budding graphics connoisseurs from Down Under

A reader, Stephen M., who's a high school Information Technology teacher in Australia, assigned the following chart to his class as a Junk Charts-style assignment. (link to original here)

Behance_donation

We have seen racetrack charts before (e.g. here or here), and we have dual racetracks here.

Stephen's class identified the following problems with the chart:

- The group agreed this should be better called a data visualisation than an infographic

- The purpose of the 'infographic' seems to be more on the design/form, than the function of conveying an understanding of the data

- There seems to be a bit of an optical illusion, with the upper circle for the US appearing larger than the lower one (we checked, there isn't)

- There are no clear labels to assist. It is an assumption that because in the heading and the figures, population is on top of donations, that the lines are the same. The class agreed that country labels would help to the left of each line start.

- No scale on the lines, and where do you measure from/to (especially as the US line is a single line for a proportion of the way)?

- It's too abstract and the spatial separation of the curves makes comparison difficult.

***

Wow, that's a great critique from the 16-year-olds. They are working on ways to re-make this graphic. One good idea is to collapse the two dimensions into one: per-capita donations.
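The per-capita idea can be sketched in a few lines of Python. This is only an illustration: the country names and figures below are hypothetical placeholders, not the values from the chart, and the function name is made up for this example.

```python
def per_capita(donations, population):
    """Return donations per person for every country present in both mappings."""
    return {
        country: donations[country] / population[country]
        for country in donations
        if country in population
    }


# Hypothetical placeholder figures, NOT the numbers shown on the chart.
donations = {"Country A": 5_000_000.0, "Country B": 1_200_000.0}
population = {"Country A": 250_000_000, "Country B": 20_000_000}

# One sorted list replaces two separately ordered racetracks.
ranked = sorted(
    per_capita(donations, population).items(),
    key=lambda item: item[1],
    reverse=True,
)
for country, value in ranked:
    print(f"{country}: {value:.3f} per person")
```

With a single derived measure, the countries can be sorted once and compared directly.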

Another issue with this chart is that the countries are sorted in different ways from one chart to the next. It's really difficult to compare one country to another.

It is also instructive to discuss what the key message is in this data. Why those six countries? What kinds of donations are being counted? Does the counting methodology differ by country? How comparable is the data?

Finally, is this art or is this science?

P.S. [12/2/2012] Stephen noted that another deficiency identified by the students is the lack of sourcing. Indeed, where did the data come from? They think it's the CIA Factbook.


Guest blog: Popcorn infographics

Note: This post is by Aleksey Nozdryn-Plotnicki, who blogs at ThinkDataVis.

***

On my way to Crete recently, I was flipping through the in-flight magazine when I stumbled upon this treat. This full-page piece was about Claire Cock-Starkey’s upcoming (at the time) book, Seeing the Bigger Picture.

Thinkdatavis1

The book sells itself as “Global Infographics” and the article says it is “swapping dry words for colourful illustrated visuals”. The baby and the iPhone are pure decoration, but there are also some information graphics here at the top and the bottom which bear a closer look.

Thinkdatavis2

Above we have what at first looks innovative, but is actually a disguised bar chart. That’s fine, but:

  • Bars have been arched, challenging our ability to compare them
  • Outer bars actually have further to go, since the circumference increases with the radius. So while Japan has the lowest percentage, its bar appears to be as long as that of Norway, the largest. In fact, since the values are sorted, for the most part all bars are the same length and size.
  • The legend is far larger than the chart itself, and is what really delivers the information at all. Using that space for a larger chart and labelling the bars directly (like in a usual bar chart) might be better.
  • There is no axis with any ticks or labels
  • The chart has too many categorical colours, so knowing what any colour represents requires looking it up in the legend where the raw data is anyway.
  • Why this circular shape? I suspect it was a clock-face for time, but the decoration, presumably informing our sense of “leisure activity” has removed the clock hands, so the metaphor is weak.
  • Why does the Norway bar go only 90 degrees around? This seems equivalent to not properly scaling the Y-axis on a bar chart and leaving copious empty space above. Maybe this is meant to indicate that even the most leisurely Norwegians only have time for gardening, being a kite, and drinking at a table.
  • Consolation points, however, for taking the time to clearly state what leisure time was defined as in this data.
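The point about outer bars can be checked with a little geometry: an arc of radius r spanning an angle θ (in radians) has length rθ. The sketch below uses hypothetical radii, since the actual dimensions of the printed chart are not known.

```python
import math


def arc_length(radius, angle_degrees):
    """Length of a circular arc with the given radius and angular span."""
    return radius * math.radians(angle_degrees)


def angle_for_length(radius, length):
    """Angular span (degrees) needed for an arc of the given length."""
    return math.degrees(length / radius)


# Hypothetical radii: an inner ring at 1 unit and an outer ring at 2 units.
inner = arc_length(1.0, 90.0)
outer_angle = angle_for_length(2.0, inner)

print(f"inner arc length: {inner:.3f}")
print(f"outer ring needs only {outer_angle:.1f} degrees for the same length")
```

If the data is encoded as angle, outer bars look longer than they should; if it is encoded as length, outer bars sweep a smaller angle. Either way, the reader cannot compare arched bars on different rings reliably.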

Thinkdatavis3b

At first this looks more like a traditional bar chart, until you realise that:

  • Larger data is at the top and smaller at the bottom, so the data is tied to the blue lines on the left, rather than the visually-weighty bars on the right. Or maybe the height of the pyramid is meant to be tied to age at marriage?
  • Bars are artificially grouped and forced to be the same length, i.e. Sweden 34.3 and Germany 33.7. This leads to a “lie factor”.
  • In any event the data is so loosely encoded that it can hardly be considered encoded at all. The lines and the data are both sorted.
  • It has a non-zero baseline at roughly 20 or so, a “sin” in bar charts, though you could argue for a non-zero baseline of around 18 for marriage since you would never expect to see values below that

Ultimately, what I think we have here belongs in a genre of its own, perhaps “popcorn infographics”.  At the time of writing the one review on amazon.co.uk reads “Bought this for my 14 yr old - absolutely loves it and showed friends who were also suitably impressed. Thank you” which says a lot,  and not all negative. Perhaps there is room for popcorn infographics in this world or perhaps it’s just junk.

***

Aleksey Nozdryn-Plotnicki is an analyst/consultant and data visualisation blogger at ThinkDataVis.com. He is @alekseynp on Twitter.

 


Can information be beautiful when information doesn't exist?

Reader Steve S. sent in this article that displays nominations for the "Information is Beautiful" award (link). I see "beauty" in many of these charts but no "information". Several of these charts have appeared on our blog before.

Junkcharts_trifecta_checkup

Let's use the Trifecta checkup on these charts. (More about the Trifecta checkup here.)

 

Info_beaut_plot_lines

The topic of this chart is both tangible and interesting. As someone who loves books, I do want to know what genres of books typically win awards.

However, both the data collection and graphical design make no sense.

The data collection problem presents a huge challenge and it's easy to get wrong. The problem is how narrow a theme should be. If it's too narrow, you can imagine every book has its own set of themes. If it's too wide, each theme maps to lots of books. The challenge is how to select the themes such that they have similar "widths". For example, "death" is a very wide theme and lots of books contain it, as indicated by the black lines. "Nanny trust issues" is a very narrow theme, and only one of those books deals with this theme. When there is such a theme, is its lack of popularity due to its narrow definition or due to writers not being interested in it?

***

Info_beaut_covers

The caption of this chart said "Cover stars: Charting 50 years up until 2010, this graphic shows The Beatles to be the most covered act in living memory." If that is the message, a much simpler chart would work a lot better.

Since the height of the chart indicates the number of covers sold in that year, the real information being shown is the boom and bust cycles of the worldwide economy. So, a lot more records were sold in 2005, and then the market tanked in 2008, for example.

That's why the data analyst should think twice before plotting raw data. Most data like these should be adjusted. In this case, you could either compare artists against one another in each year (by using proportions) or you have to do a seasonal and trend adjustment. I also don't see the point of highlighting year-to-year fluctuations. Nor do I understand why only in certain years is the top-rated cover identified by name and laurel wreath.
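As a rough illustration of the proportions approach, the sketch below converts raw yearly counts into each artist's share of that year's total. The artist names and counts are hypothetical placeholders, not data from the chart.

```python
def yearly_shares(counts_by_year):
    """Convert raw counts per artist into shares of each year's total."""
    shares = {}
    for year, counts in counts_by_year.items():
        total = sum(counts.values())
        # Guard against an empty or all-zero year.
        shares[year] = (
            {artist: n / total for artist, n in counts.items()} if total else {}
        )
    return shares


# Hypothetical counts: the market halves, but relative standing is unchanged.
counts_by_year = {
    2005: {"Artist A": 30, "Artist B": 10},
    2008: {"Artist A": 15, "Artist B": 5},
}

shares = yearly_shares(counts_by_year)
for year, by_artist in shares.items():
    print(year, {artist: f"{share:.0%}" for artist, share in by_artist.items()})
```

In this made-up example the raw counts fall by half between the two years, yet each artist's share stays the same, which is the comparison the caption is actually about.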

 

***

I talked about this stream graph of 311 calls back in 2010. See the post here.

Info_beaut_311calls

***

I featured this set of infographics/pie charts back in 2011. See the post here.

Info_beaut_refugees

***

This chart is a variant of the one from the New York Times that I discussed here. I like the proper orientation on the NYT's version. The color scheme here may be slightly more attractive.

Info_beaut_trackfield

 


The coming commoditization of infographics

An email lay in my inbox with the tantalizing subject line: "How to Create Good Infographics Quickly and Cheaply?" It's a half-spam from one of the marketing sites that I signed up for a long time ago. I clicked on the link, which led me to a landing page which required yet another click to get to the real thing (link). (Now, you wonder why marketers keep putting things in your inbox!)

Easelly_walkway

The article was surprisingly sane. The author, Carrie Hill, suggests that the first thing to do is to ask "who cares?" This is the top corner of my Trifecta Checkup, asking what's the point of the chart. Some of us not so secretly hope that the answer to "who cares?" is no one.

Carrie then lists a number of resources for creating infographics "quickly and cheaply".

Easel.ly caught my eye. This website offers templates for creating infographics. You want time-series data depicted as a long, hard road ahead, you have this on the right.

You want several sections of multi-colored bubble charts, you have this theme:

Easelly_angel

 

In total, they have 15 ready-made templates that you can use to make infographics. I assume paid customers will have more.

infogr.am is another site with similar capabilities, and apparently for those with some data in hand.

***

Based on this evidence, the avalanche of infographics is not about to pass. In fact, we are going to see the same styles repetitively. It's like looking at someone's PowerPoint presentation and realizing that they are using the "Advantage" theme (one of the less ugly themes loaded by default). In the same way, we will have a long, winding road of civil rights, and a long, winding road of Argentina's economy, and a long, winding road of Moore's Law, etc.

But I have long been an advocate of drag-and-drop style interfaces for producing statistical charts. So I hope the vendors out there learn from these websites and make their products ten times better so that it is as "quick and cheap" to make nice statistical charts as it is to make infographics.

 


Infographics worthy of the name

The Guardian (via Graphic News) has put out some fantastic infographics posters, so we can't say they are all bad. This is a big collection created in anticipation of the London Olympics. Here's one illustrating the 10,000m race: (link)

Guardian_olympics_10000m

It's nice that they give an overview of the race, plus the calendar. The evolution of men's and women's times is shown on the same scale. In order to stress the improvement over time, they omitted those years in which the times did not improve (I think, although there are some mysterious omissions of data labels).

They have charts for all the different events and also in water sports, gymnastics, etc.

PS. I do not know why the women's times were omitted from some of the charts (100m, 200m, etc.). In those charts, the lines for men would be better colored blue to align with the dots on the calendar.

 


Bloomberg issues a health warning dressed up as a fast-food menu

NYC mayor Michael Bloomberg is getting mixed reviews for his proposal to ban super-sized sugary drinks. Reader John O. wasn't impressed with this graphical effort (link):

Bloom_sugarandcalories

 

The key problem: this picture is not scary at all. The reason it's not horrifying is that there is no context. People who have knowledge about healthy eating habits will get the message but that's preaching to the choir.

If you know that the recommended consumption of daily sugars for adults is roughly 20-36 grams, then you can see that one sugary drink of 12 ounces or higher would take you over the daily limit. A 64-ounce drink would give you more than 7 times what you need in a day. That's a powerful message but you won't know it from this chart. Not from the sugar cubes doubling as shadows, which is a cute, creative concept.

Also, make use of the chart-title real estate! Instead of "Sugar & Calories per Fountain Drink", say something memorable. "Fountain drinks make you fat and sick".

***

There is something else fishy about this graphic. What are the most prominent data being displayed?

You got it. They're 7, 12, 16, 32, 64. Where have we seen this type of data display?

Yup. This format is lifted from a menu in a Starbucks or a McDonald's (without prices).

Is this a health warning? Or a restaurant menu?

***

John wrote:

Also slightly confused about the slightly non-linear relationship between calories and drink size.  Maybe volume of ice is held constant...

It is in fact a proportional relationship. The confusion arises from the non-linear increase in cup size from 7 to 64 ounces. The math is roughly 11 calories per ounce, and 3g of sugar per ounce. I wonder if it is better to show those two numbers instead of the ten not-very-memorable numbers shown on the chart itself.
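To make the proportionality concrete, here is a small sketch that applies the two approximate rates quoted above (11 calories and 3g of sugar per ounce) to the five cup sizes. The outputs are rough approximations from those rounded rates, not the exact figures printed on the chart.

```python
CALORIES_PER_OUNCE = 11      # approximate rate quoted in the text
SUGAR_GRAMS_PER_OUNCE = 3    # approximate rate quoted in the text
CUP_SIZES_OZ = [7, 12, 16, 32, 64]


def drink_table(sizes_oz):
    """Return (ounces, approx. calories, approx. grams of sugar) per cup size."""
    return [
        (oz, oz * CALORIES_PER_OUNCE, oz * SUGAR_GRAMS_PER_OUNCE)
        for oz in sizes_oz
    ]


for oz, calories, sugar in drink_table(CUP_SIZES_OZ):
    print(f"{oz:>2} oz: ~{calories} calories, ~{sugar} g sugar")
```

The calories-to-ounces ratio is constant by construction; the curve only looks non-linear because the cup sizes themselves are unevenly spaced.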

***

In case you're wondering, the heights (thus areas) of the cups have no relationship with any of the data, not calories, not sugars, and not the cup size.

 

PS. John also wrote: "The soda cup graph reminds me of the chart from Pravda that Tufte cites in 'Cognitive Style of Powerpoint'. " If you know what he's talking about, please post a link to the chart. Thanks.


High-effort graphics

Jon Quinton made a chart for Cancer Research UK, which is quite an eyeful.

Cruk_1

The full infographic is here.

Below is a close-up of the key of this chart:

Cruk_2

Jc_returnoneffort

Where would this chart fall in my "return on effort matrix"? It is an extremely high-effort chart; I got tired trying to figure out what all those dimensions mean.

Is it a high-reward or a low-reward chart? It depends on why you're reading the chart. For most readers, I suspect it's low-reward.

***

In my view, the best charts are high-reward, low-effort. I'd emphasize that by effort, I mean effort by the reader. In general, the effort by the chart designer is inversely proportional to that by the reader.

In some special cases, high-effort charts may have high reward justifying the destruction of some brain cells.

Low-effort, low-reward charts are harmless.

More on the return-on-effort matrix here.

***

One simple improvement to a chart like this one is to produce separate charts for men and women. Outside academia, it seems to me almost all use cases for this chart would involve only one gender.

 


Conceptual colors, negative proportions, mysterious axes, and all that

Reader Jordan G. found a different-looking chart on visualizing.org, of which I excerpted the following:

Visualizing_colon

This part comes from the bottom right corner of an entire page of charts (link). The title of the entire project "Gaps in the U.S. Healthcare System" may give some hints as to what the designer was intending to portray. Looking at this part by itself, the reader is missing some information:

  • What do the pink, orange, dark pink colors mean?
  • What is plotted on the vertical scale, which is in percentages?
  • The horizontal axis may have something to do with distance/location. It's divided into three sections. Is it a continuous scale (say, kilometres or miles) or is it a categorical scale (large, medium, noncore)?

Visualizing_legend
The first question is answered by the legend of the post, situated on the far left. Simply by printing the labels for racial groups on this chart, the designer would have saved readers the effort to look for this information.

The second question is not addressed anywhere on the chart but most likely, the percentages represent proportions of adults over 50 years old who ever received the three types of -scopies. The mirrored nature of the vertical axis is odd. As much of the chart is above the zero-proportion line as exists below the line. What do negative proportions mean?

Because I couldn't figure out the answer to the third question, I can't interpret this chart at all. I see that the proportion of adults fluctuates from left to right for every racial group. But with what is the proportion varying? It also appears as if Asians only live in urban areas.

***

Jordan asks a question about color choice here that is worth discussing.

In this chart, the author chooses to use a color coding system to represent race (pink squares = African Americans).  I have seen other charts use “actual” colors to represent race (white = Caucasian, black = African American).  I could see some audiences taking offense at these different color representations, especially where the color chosen has been used pejoratively in the past (like “red” for American Indian, “yellow” for Asian).  What would you consider an effective and appropriate way to encode different races on visualizations?

 


The meaning of most

Megan McArdle started the war on infographics (link). And reader Danielle A. contributes this example, from KissMetrics.

Km_patience

This is one part of a big infographics poster. Needless to say, a bar chart renders this data much better:

Redo_kmpatience

The categories are sensibly sorted, and useless tinges of color removed.

***

But I want to draw attention to their conclusion:

Most participants in the survey would wait 6-10 seconds before they abandon pages.

Now we know writers of opinion pieces in the major newspapers have long lost control over the titles of their pieces. Is it true that graphic designers have ceded control over their conclusion statements?

It would appear so. The category being labeled as "most participants in the survey" accounted for 30% of the respondents. When is 30% considered "most"?

Also, surveys are typically tools for generalization so we expect conclusions about the general population of mobile users. Here, whoever wrote this conclusion timidly restricted the remark to "participants of the survey". This is probably an oversight because in other panels, they talk about x% of consumers or y% of mobile internet users. If the survey was properly designed and executed, they should be confident about the whole population, not just the sample.

Finally, nowhere on this poster can you discover which survey this data came from. We have no idea what the sample size is, nor the margin of error.
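To see why the missing sample size matters, here is a minimal sketch of the usual margin-of-error calculation for a proportion. It assumes a simple random sample and the normal approximation, and the sample size of 1,000 is purely hypothetical since the poster does not report one.

```python
import math


def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation with z = 1.96 (about 95% confidence)
    and assumes a simple random sample.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# The 30% share is from the poster; n = 1000 is a hypothetical assumption.
moe = margin_of_error(0.30, 1000)
print(f"30% plus or minus {moe:.1%}")
```

Under these assumptions the 30% category would carry a margin of error of a little under 3 percentage points; with a much smaller sample, the interval widens quickly.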


The war on infographics

Mcardle_infogrinfographic

Megan McArdle (The Atlantic) is starting a war on the infographics plague. (Here, infographics means infographics posters.) Excellent debunking, and absorbing reading.

It's a long post. Her overriding complaint is that designers of these posters do not verify their data. The "information" shown on these charts is frequently inaccurate, and the interpretation is sloppy.

In the Trifecta checkup framework, this data deficiency breaks the link between the intent of the graphic and the (inappropriate) data being displayed. (Most infographics posters also fail to find the right chart type for the data being displayed.)

While I have often raised similar complaints in the past -- and my current stance is to link to good infographics posters only (which explains their scarcity on this blog) -- one of the significant contributions of the infographics "plague" is the status-hiking of the story-telling prerogative. Unfortunately, this plague is yet another case of elevating stories above the data, which (to a lesser extent) is a complaint that Andrew Gelman and I shared about the "Freakonomics" trend. (See here, and Andrew's further comments.)

This doesn't stop McArdle from adding her own contribution to the infographics plague... the poster shown on the right.

Do yourself a favor and read her post in full. Link here.