
An inspired picture of Blackberry's dying inspiration

The New York Times has a splendid example of an infographic this weekend, showing the rise and fall of the Blackberry.


Notice the inspired touch of the black circles to trace the outline of Blackberry's market share. They are a guide to experiencing the chart.

I wish they had put the Palm section above Blackberry. In an area chart, the only clean section is the bottom section in which the market share is not cumulated. Given the focus on Blackberry, it's a pity readers have to perform subtractions to tease out the shares.
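To make the subtraction concrete, here is a minimal Python sketch. The boundary values are invented for illustration and are not read off the Times chart; `band_shares` is a hypothetical helper name. It converts the stacked upper boundaries a reader sees in an area chart into the individual shares.

```python
def band_shares(cumulative_tops):
    """Convert stacked (cumulative) upper boundaries into individual shares.

    cumulative_tops lists the top edge of each band, bottom band first.
    """
    shares = []
    previous_top = 0.0
    for top in cumulative_tops:
        # Each band's share is its top edge minus the top edge of the band below.
        shares.append(top - previous_top)
        previous_top = top
    return shares

# Made-up boundaries (percent of market) at a single point in time.
example_tops = [10.0, 45.0, 100.0]
print(band_shares(example_tops))
```

Only the first band can be read directly off the axis; every other band requires a subtraction, which is the burden readers carry when the series of interest is not at the bottom.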

I also wonder if the black circles should contain Blackberry's market share rather than the year labels.

But I enjoyed this chart. Thanks for producing it.


Seats half full or half empty

Kevin Drum shows the following graphic (link) to illustrate where the House stood on authorizing force in Syria.

What interests me is whether the semi-circle concept adds to the chart. It evokes the physical appearance of a chamber, presumably where such a debate has taken place -- although most televised hearings tend to exhibit lots of empty seats.


The half-filled circles in particular do not sit well with me.

Here is a tree map of the same data.


Notice that legend boxes are unnecessary.

A pie chart with appropriate labeling acts similarly.


A profile chart produces mixed results:


This version has the advantage of stacking the voting variable. It doesn't do a good job describing future scenarios.

Vanity heights and scary charts

Sometimes I wonder if I should just become a chart doctor. Andrew recently wrote that journals should have graphical editors. Businesses also need those, judging from this submission through Twitter (@francesdonald). Link is here.

You don't know whether to laugh or cry at this pie chart:


The author of the article complains that all the tall buildings around the world are cheats: vanity height is defined as the height above which the floors are unoccupied. The sample proportions aren't that different between countries, ranging from 13% to 19% (of the total heights). Why are they added together to make a whole?

The following boxplot illustrates both the average and the variation in vanity heights by region, and tells a more interesting story:


Recall that in a boxplot, the gray box contains the middle 50% of the data and the white line inside the box indicates the median value. UAE has a tendency to inflate the heights more while the other three regions are not much different.
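For readers who want to check a box by hand, the sketch below computes the three numbers a boxplot draws (lower quartile, median, upper quartile) with Python's standard `statistics` module. The sample values are hypothetical, not the data behind the chart above, and `box_summary` is an illustrative name. Plotting tools use slightly different quartile conventions; this sketch assumes the "inclusive" method.

```python
import statistics

def box_summary(values):
    """Return (lower quartile, median, upper quartile) for a list of numbers."""
    # quantiles with n=4 returns the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, median, q3

# Hypothetical vanity heights, as a percent of total building height.
sample = [10, 12, 13, 15, 19]
q1, median, q3 = box_summary(sample)
print(f"box spans {q1} to {q3}; median line at {median}")
```

The box runs from the lower quartile to the upper quartile, so it covers the middle half of the values, and the line inside sits at the median.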


The other graphic included in the same article is only marginally better, despite a much more attractive exterior:


This chart misrepresents the actual heights of the buildings. At first glance, I thought there must be a physical limit to the number of occupied floors since the grayed out sections are equal heights. If the decision has been made to focus on the vanity height, then just don't show the rest of the buildings.

Also, it's okay to assume a minimal intelligence on the part of readers - I mean, is there a need to repeat the "non-occupiable height" label 10 times? Similarly, the use of 10 sets of double asterisks is rather extravagant.


Lunch and talk Wednesday

I will be the luncheon speaker at INFORMS NYC on Wednesday. The talk will provide some context for my new book Numbersense (link), and discuss a few examples from the book. You can pre-register here.

INFORMS is the professional society for Operations Research and Management Science people. For some years, I have attended these regularly and learned a lot from other industry speakers.

If you decide at the last minute, you can pay the $5 extra fee on the day of the talk. Or register now.


Junk Charts is featured in an article in Harvard Business Review about data visualization. A few new reviews have appeared: CFA Institute, Flagstaff Business News.


I maintain a list of events on my book blog. Look to the right column.

The incredibly expanding male

It's a mystery to me how there are always people who ignore certain rudimentary rules of graphing data. I'm talking about such clear guidelines as:

  • Bar charts encode data in the heights of the bars -- therefore:
  • You should start each bar at height zero, and
  • You should not vary the width of the bars (unless you are introducing another dimension), and
  • You should space the bars unevenly if your measurement times are unevenly spaced.

I mean, how is it in the year 2013, the BBC shows viewers this? (tip from UK reader Clarke C.)


The chart is absurd on its face. Men did not double in height between 1871 and 1971. This chart was broadcast on the show "Breakfast", which apparently is the BBC's version of Good Morning America.

I'd just use a line chart. The figurine construct is cute but too much trouble because you have to grow the width while growing the height. If you encode data in the area, then the height is no longer proportional to the real height.
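The trade-off can be put into numbers. The sketch below is a rough illustration, assuming the figurine is scaled uniformly (width grows in step with height); the function names and the factor of 4 are hypothetical.

```python
import math

def area_ratio_if_height_encodes(value_ratio):
    """Height tracks the data, so the area grows as the square of the ratio."""
    return value_ratio ** 2

def height_ratio_if_area_encodes(value_ratio):
    """Area tracks the data, so the height grows only as the square root."""
    return math.sqrt(value_ratio)

# Hypothetical example: the second value is 4 times the first.
print(area_ratio_if_height_encodes(4.0))   # 16.0
print(height_ratio_if_area_encodes(4.0))   # 2.0
```

Either way one of the two visual cues misleads: the figurine looks sixteen times as big when the height is right, or only twice as tall when the area is right.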

Years ago, we featured something similar: how penguins evolved into humans (link). Curiously, also a gift from British media.

Use this chart at your own peril

On Twitter, Joe D. disliked the following chart on the Information is Beautiful blog:



The chart carries a long list of flaws.

The column labeled "%" is probably the most jarring. The meaning of these numbers changes with the color. When pink, they give the proportion of females; when blue, the proportion of males. As the stated purpose of the chart is to explore the male-female balance at different websites, it is a bad decision to fold two dimensions into one. While you're thinking about what I just said, what do you think the percentages in gray mean? Your guess is as good as mine.


Now, I appreciate that the designer uses a margin of error (implicitly), and separated these three sites as representing "equality", even though only one of them has the exact 50/50 split.

Wait, for Orkut (second row), it's 51 percent female, and for Foursquare, it's 52 percent male. The gender is coded in the figurines. You can check that with your magnifying glass.

It gets better.

The list of websites is ordered by increasing polarity but only within the three sections. Logically, the three "equality" sites should sit between the "matriarchy" and the "patriarchy". Pinterest and Reddit, the two most polarized sites, should stand on the edges. In the diagram shown at right, I simulated a reader who wants to scan through the list of websites from the most female-oriented (Pinterest) to the most male-oriented (Reddit). It's quite the obstacle course.

Let's get to Joe D.'s issue with the chart. How many people does each figurine represent? It's quite a mouthful. Each figurine represents one percent of the unique visitors at the specific website but only in excess of fifty percent. In effect, the Facebook figurine represents a huge number of people compared to the figurine of a less popular website like Tagged. The designer did not explain the inclusion criteria for websites.

If you didn't get that definition, just ignore the figurines and think of this chart as a bar chart in which the bars start at 50 percent (rather than zero as it should). A standard population pyramid appears to do a better job - just add bars to the left of the diagram and properly align the male and female sections.
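A small Python sketch shows what starting the bars at 50 percent does. It applies the figurine definition above to the two shares quoted earlier (Orkut at 51 percent female, Foursquare at 52 percent male); `figurine_count` is a hypothetical name, and the counts follow from the definition rather than from counting icons on the chart.

```python
def figurine_count(majority_percent):
    """Figurines drawn for a site: one per percentage point above 50."""
    return majority_percent - 50

orkut = figurine_count(51)       # 51 percent female, per the chart
foursquare = figurine_count(52)  # 52 percent male, per the chart

# Ratio of what the reader sees versus ratio of the underlying shares.
print(foursquare / orkut)
print(52 / 51)
```

Under this reading Foursquare gets twice as many figurines as Orkut although the two majority shares differ by a single percentage point, which is the usual distortion of a bar chart that does not start at zero.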


As I said before, read the fine print.

Here's the fine print:

If I am not mistaken, the designer applied the gender proportions to the traffic totals to obtain the rightmost column, labeled "million more monthly female or male visitors". The trouble is one number pertains to U.S. visitors while the other pertains to worldwide traffic. By multiplying them, the designer makes an assumption: that gender ratio is equivalent inside and outside the U.S., for every website.

Just to give you a sense of scale, according to this chart, Facebook has an excess of 155 million female visitors per month. According to Comscore, the key provider of such data, Facebook had about 145 million total U.S. visitors in June 2013. It's not a small deal to mix up the geographies.
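The mismatch can be checked with simple arithmetic. The sketch below assumes the rightmost column was computed the way described above (gender share applied to a traffic total); `excess_visitors` is a hypothetical name, and the only figures used are the two quoted in the text. The excess of one gender over the other can never be larger than the total audience, so it is enough to test the extreme case of an all-female audience.

```python
def excess_visitors(total_millions, majority_share):
    """Majority-gender visitors minus minority-gender visitors, in millions."""
    majority = total_millions * majority_share
    minority = total_millions * (1 - majority_share)
    return majority - minority

us_total = 145       # Comscore U.S. visitors, June 2013 (millions), from the text
chart_excess = 155   # excess female visitors shown on the chart (millions)

# Even if every U.S. visitor were female, the excess would equal the U.S. total.
largest_possible = excess_visitors(us_total, 1.0)
print(largest_possible < chart_excess)
```

Since even the extreme case falls short of 155 million, the chart's figure cannot be based on U.S. traffic alone; a larger, presumably worldwide, total must have been multiplied by the U.S. gender split.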

This example illustrates what I call "use at your own peril". It's like the Surgeon General's warning in restaurants in the U.S.: we warn you that drinking alcohol while pregnant could lead to birth defects, but you are free to do whatever you want with this information.


As of this writing, the original chart has thousands of Facebook likes, hundreds of shares on LinkedIn and Pinterest, etc.

It appears that a lot of people are enjoying the chart more than Joe and I do.


Finally, here is a sketch of how I would plot this type of data. (U.S. traffic data from Comscore, various months of 2012, where I can find them. Comscore is a fee-based service so it is not easy to find data for the smaller sites unless you have a subscription.)