Notice the inspired touch of the black circles to trace the outline of Blackberry's market share. They are a guide to experiencing the chart.
I wish they had put the Palm section above Blackberry. In an area chart, the only clean section is the bottom section in which the market share is not cumulated. Given the focus on Blackberry, it's a pity readers have to perform subtractions to tease out the shares.
I also wonder if the black circles should contain Blackberry's market share rather than the year labels.
But I enjoyed this chart. Thanks for producing it.
On Twitter, Joe D. disliked the following chart on the Information is Beautiful blog:
The chart carries a long list of flaws.
The column labeled "%" is probably the most jarring. The meaning of these numbers changes with the color. When pink, they give the proportion of females; when blue, the proportion of males. As the stated purpose of the chart is to explore the male-female balance at different websites, it is a bad decision to fold two dimensions into one. While you're thinking about what I just said, what do you think the percentages in gray mean? Your guess is as good as mine.
Now, I appreciate that the designer used a margin of error (implicitly) and separated these three sites as representing "equality", even though only one of them has the exact 50/50 split.
Wait, for Orkut (second row), it's 51 percent female, and for Foursquare, it's 52 percent male. The gender is coded in the figurines. You can check that with your magnifying glass.
It gets better.
The list of websites is ordered by increasing polarity but only within the three sections. Logically, the three "equality" sites should sit between the "matriarchy" and the "patriarchy". Pinterest and Reddit, the two most polarized sites, should stand on the edges. On the diagram shown right, I simulated a reader who wants to scan through the list of websites from the most female-oriented (Pinterest) to the most male-oriented (Reddit). It's quite the obstacle course.
Let's get to Joe D.'s issue with the chart. How many people does each figurine represent? It's quite a mouthful. Each figurine represents one percent of the unique visitors at the specific website, but only for the share in excess of fifty percent. In effect, the Facebook figurine represents a huge number of people compared to the figurine of a less popular website like Tagged. The designer did not explain the inclusion criteria for websites.
If you didn't get that definition, just ignore the figurines and think of this chart as a bar chart in which the bars start at 50 percent (rather than zero as it should). A standard population pyramid appears to do a better job - just add bars to the left of the diagram and properly align the male and female sections.
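To make the figurine definition and the truncated baseline concrete, here is a minimal Python sketch. The function names are my own, and the only inputs are the two shares quoted above (Orkut at 51 percent female, Foursquare at 52 percent male).

```python
def figurine_count(majority_share_pct: float) -> int:
    """One figurine per percentage point of the majority gender above 50."""
    return max(0, round(majority_share_pct - 50))


def bar_ratio(a_pct: float, b_pct: float, baseline_pct: float = 0.0) -> float:
    """Ratio of two bar lengths when both bars start at the given baseline."""
    return (a_pct - baseline_pct) / (b_pct - baseline_pct)


foursquare, orkut = 52.0, 51.0  # majority shares quoted in the post

print(figurine_count(orkut), figurine_count(foursquare))
# Bars starting at 50 percent, as the figurines effectively do
print(round(bar_ratio(foursquare, orkut, baseline_pct=50), 2))
# Bars starting at zero, as a bar chart should
print(round(bar_ratio(foursquare, orkut), 2))
```

With a baseline of 50, the Foursquare bar is twice as long as the Orkut bar, although the underlying share is only about 2 percent larger.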
As I said before, read the fine print.
Here's the fine print:
If I am not mistaken, the designer applied the gender proportions to the traffic totals to obtain the rightmost column, labeled "million more monthly female or male visitors". The trouble is one number pertains to U.S. visitors while the other pertains to worldwide traffic. By multiplying them, the designer makes an assumption: that gender ratio is equivalent inside and outside the U.S., for every website.
Just to give you a sense of scale, according to this chart, Facebook has an excess of 155 million female visitors per month. According to Comscore, the key provider of such data, Facebook had about 145 million total U.S. visitors in June 2013. It's not a small deal to mix up the geographies.
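A quick sanity check shows why the two numbers cannot describe the same population. The sketch below assumes the rightmost column was computed the way I guessed above, as total traffic times the difference between the majority and minority shares. The function names are mine; the two inputs are the figures just quoted.

```python
def excess_visitors(total_millions: float, majority_share: float) -> float:
    """Excess of majority over minority visitors: total * (p - (1 - p))."""
    return total_millions * (2 * majority_share - 1)


def implied_majority_share(excess_millions: float, total_millions: float) -> float:
    """Invert the formula above: the share p that would produce a given excess."""
    return (1 + excess_millions / total_millions) / 2


chart_excess = 155.0  # million excess female visitors, per the chart
us_total = 145.0      # million total U.S. visitors, per Comscore (June 2013)

share = implied_majority_share(chart_excess, us_total)
print(f"implied female share: {share:.1%}")
print("possible" if share <= 1 else "impossible: share exceeds 100%")
```

Even a site with 100 percent female visitors could show an excess of at most 145 million against the U.S. total, so the chart's figure can only come from the worldwide traffic number.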
This example illustrates what I call "use at your own peril". It's like the Surgeon General's warning in restaurants in the U.S.: we warn you that drinking alcohol while pregnant could lead to birth defects, but you are free to do whatever you want with this information.
As of this writing, the original chart has thousands of Facebook likes, hundreds of shares on Linkedin and Pinterest, etc.
It appears that a lot of people are enjoying the chart more than Joe and I do.
Finally, here is a sketch of how I would plot this type of data. (U.S. traffic data from Comscore, various months of 2012, where I can find them. Comscore is a fee-based service so it is not easy to find data for the smaller sites unless you have a subscription.)
Robert Kosara takes us back to the 1940s, and an incredible "infographics" project by the Lawrence Livermore Laboratory. (link) Here is one of the designs:
When did information graphics turn into ‘infographics,’ and when did we lose the meticulous, well-researched, information-rich graphics for the sad waste of pixels that calls itself infographic today?
I think one of the key missing pieces is analytics. Most of today's infographics seemingly are a result of treating data as flowers to be arranged. There is little analytical thinking behind what the data mean. Incidentally, that is why the new NYU certificate is not called Certificate in Data Visualization--we wanted to emphasize the importance of analytics next to datavis.
Also, we have an elective designed for people interested in content marketing. The Livermore Lab project would fall into this category. So do annual reports for corporations, fundraising prospectuses for non-profit organizations, magazines whether commercial or membership, content for web marketing, etc.
***
The other problem is a kind of perversion of measurement. Because so much of this stuff is online, so many pieces are judged by click rates or bounce rates or time on page. The problem with click rates is well known. Headlines of so many online articles are written solely to create clicks. It's gotten to the point that we feel duped by the headlines.
The design may have originated in print, but in all likelihood, it is also uploaded to the Web; the interaction of readers with the online version is much easier to track than the effect of print, leading to the lazy generalization that the Web response would be "similar to" the print response. This is one of my pet peeves: bad data is worse than no data.
A reader sends me to Adam Obeng, who did the dirty work deconstructing a set of charts by the U.S. National Highway Traffic Safety Administration on his blog. Here's an example of these charts:
Aside from the sneaker chart, they concocted a pop stick, a pencil, a tower of Hanoi, etc. These objects are ones I think should be evaluated as art. Adam gamely tells us that the proportions are totally off, and they are both internally and externally inconsistent.
I'll add two small points to Adam's post.
First, these charts pass my self-sufficiency test, that is to say, they did not print the entire data set (just one number here) on the page. Alas, given the distortion identified by Adam, not printing the data means everyone is free to create their own data. Herein lies the problem: there is an argument for allowing a small degree of distortion in exchange for "beauty" but these charts without any data have gone too far.
Second, see Adam's last point (the footnote). The original data is something quite convoluted: “3 out of 4 kids are not as secure in the car as they should be because their car seats are not being used correctly.” (How would they know this, I wonder.) This is a statistic about kids while the picture shows a statistic about their parents (or drivers).
Felix linked to a set of charts about guns in the U.S. (and elsewhere). The original charts, by Liz Fosslien, are found here.
I like the clean style used by Fosslien. Some of the charts are thought-provoking. Many of them may raise more questions than they answer. Here are a few that caught my eye.
A simplistic interpretation would claim that banning handguns is futile, and may even have an adverse impact on murder rate. However, this chart does not reveal the direction of causality. Did some countries ban handguns because they are reacting to higher violence? If that is the case, this chart is confirming that the countries with handgun bans are a self-selected group.
The U.S. is an outlier, both in terms of firearm ownership and firearm homicides. This makes the analysis much harder because the U.S. is really in a class of its own. It's not at all clear whether there is a positive correlation in the cluster below, and even if there is, it is dubious whether we can extend a straight line up to the U.S. dot.
Fosslien is being cheeky to deny us the identity of the other outlier, the country with few firearms but even higher death rate from intentional homicide. These scatter plots, by the way, are great for showing bivariate distributions.
I'd still prefer a line chart for this type of data but this particular paired bar chart works for me as well. The contents of this chart are a shock to me.
A reader, Stephen M., who's a high school math and Information Technology teacher in Australia, assigned the following chart to his class as a Junk Charts style assignment. (link to original here)
We have seen racetrack charts before (e.g. here or here), and we have dual racetracks here.
Stephen's class identified the following problems with the chart:
- The group agreed this should be better called a data visualisation than an infographic
- The purpose of the 'infographic' seems to be more on the design/form, than the function of conveying an understanding of the data
- There seems to be a bit of an optical illusion with the lower upper circle for the US appearing larger than the upper lower one (we checked, there isn't)
- There are no clear labels to assist. It is an assumption that because in the heading and the figures, population is on top of donations, that the lines are the same. The class agreed that country labels would help to the left of each line start.
- No scale on the lines and where do you measure from/to (especially as the US line is a single line for a proportion of the way)
- It's too abstract and the spatial separation of the curves makes comparison difficult.
Wow, that's a great critique from the 16-year-olds. They are working on ways to re-make this graphic. One good idea is to collapse the two dimensions into one: per-capita donations.
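The per-capita idea amounts to one division per country, which then allows a single sorted list instead of two racetracks. Here is a minimal sketch; the country names and all figures are hypothetical placeholders, not values from the original chart.

```python
# Hypothetical placeholder figures (NOT from the original chart):
# population in millions, donations in millions of dollars.
countries = {
    "Country A": {"population_m": 300.0, "donations_m": 600.0},
    "Country B": {"population_m": 60.0, "donations_m": 180.0},
    "Country C": {"population_m": 20.0, "donations_m": 30.0},
}


def per_capita(donations_m: float, population_m: float) -> float:
    """Donations per person (millions of dollars / millions of people)."""
    return donations_m / population_m


# One number per country, sorted from most to least generous per person
ranked = sorted(
    ((name, per_capita(v["donations_m"], v["population_m"]))
     for name, v in countries.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, value in ranked:
    print(f"{name}: {value:.2f} dollars per person")
```

In this made-up example, the country with the largest total donations is not the one that gives the most per person, which is exactly the kind of comparison the dual racetracks make difficult.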
Another issue with this chart is that the countries are sorted in different ways from one chart to the next. It's really difficult to compare one country to another.
It is also instructive to discuss what the key message is in this data. Why those six countries? What kinds of donations are being counted? Does the counting methodology differ by country? How comparable is the data?
Finally, is this art or is this science?
P.S. [12/2/2012] Stephen noted that another deficiency identified by the students is the lack of sourcing. Indeed, where did the data come from? They think it's the CIA Factbook.
Note: This post is by Aleksey Nozdryn-Plotnicki, who blogs at ThinkDataVis.
On my way to Crete recently, I was flipping through the in-flight magazine when I stumbled upon this treat. This full-page piece was about Claire Cock-Starkey’s upcoming (at the time) book, Seeing the
The book sells itself as “Global Infographics” and the article says it is “swapping dry words for colourful illustrated visuals”. The baby and the iPhone are pure decoration, but there are also some information graphics here at the top and the bottom which bear a
Above we have what at first looks innovative, but is actually a disguised bar chart. That’s fine, but:
- Bars have been arched, challenging our ability to compare them
- Outer bars actually have further to go as the radius and therefore circumference increases. So while Japan has the lowest percentage, its bar appears to be equally as long as that of Norway, the largest. In fact, since the values are sorted, for the most part all bars are the same length and size.
- The legend is far larger than the chart itself, and is what really delivers the information at all. Using that space for a larger chart and labelling the bars directly (like in a usual bar chart) might be better.
- There is no axis with any ticks
- The chart has too many categorical colours, so knowing what any colour represents requires looking it up in the legend where the raw data is anyway.
- Why this circular shape? I suspect it was a clock-face for time, but the decoration, presumably informing our sense of “leisure activity” has removed the clock hands, so the metaphor is
- Why does the Norway bar go only 90 degrees around? This seems equivalent to not properly scaling the Y-axis on a bar chart and leaving copious empty space above. Maybe this is meant to indicate that even the most leisurely Norwegians only have time for gardening, being a kite, and drinking at a table.
- Consolation points, however, for taking the time to clearly state what leisure time was defined as in this
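The point about outer bars can be checked with the arc-length formula, length = radius × angle (in radians). The sketch below is only an illustration: the 90 degrees for the inner bar comes from the Norway observation above, while the radii and the 45 degrees for the outer bar are hypothetical.

```python
import math


def arc_length(radius: float, angle_deg: float) -> float:
    """Length of a circular arc: radius times the angle in radians."""
    return radius * math.radians(angle_deg)


# Hypothetical ring radii; the angle is what encodes the value.
inner = arc_length(radius=1.0, angle_deg=90)  # large value on an inner ring
outer = arc_length(radius=2.0, angle_deg=45)  # half the value on an outer ring

print(f"inner arc: {inner:.3f}, outer arc: {outer:.3f}")
print("same drawn length:", math.isclose(inner, outer))
```

A bar on a ring twice as far out needs only half the angle to reach the same drawn length, so when values are sorted from the inside out, the arcs can end up looking nearly identical.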
At first this looks more like a traditional bar chart, until you realise that:
- Larger data is at the top and smaller at the bottom, so the data is tied to the blue lines on the left, rather than the visually-weighty bars on the right. Or maybe the height of the pyramid is meant to be tied to age at marriage?
- Bars are artificially grouped and forced to be the same length, e.g. Sweden 34.3 and Germany 33.7. This leads to a “lie factor”.
- In any event the data is so loosely encoded that it can hardly be considered encoded at all. The lines and the data are both sorted.
- It has a non-zero baseline at roughly 20 or so, a “sin” in bar charts, though you could argue for a non-zero baseline of around 18 for marriage since you would never expect to see values
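For reference, Tufte's lie factor is the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is a relative change. The sketch below applies it to the Sweden and Germany values quoted above, under two assumed drawings: bars forced to equal length (the grouping here), and bars drawn faithfully but from the rough baseline of 20. The function names are mine.

```python
def relative_change(first: float, second: float) -> float:
    """Size of effect: change from first to second, relative to first."""
    return abs(second - first) / first


def lie_factor(drawn_first: float, drawn_second: float,
               data_first: float, data_second: float) -> float:
    """Tufte's lie factor: effect shown in graphic / effect in data."""
    return (relative_change(drawn_first, drawn_second)
            / relative_change(data_first, data_second))


germany, sweden = 33.7, 34.3  # values quoted in the post
baseline = 20.0               # approximate baseline noted in the post

# Case 1: both bars forced to the same drawn length (any common length works)
grouped = lie_factor(1.0, 1.0, germany, sweden)

# Case 2: bars drawn to scale, but starting from the non-zero baseline
truncated = lie_factor(germany - baseline, sweden - baseline, germany, sweden)

print(f"grouped bars: {grouped:.2f}")
print(f"truncated baseline: {truncated:.2f}")
```

A lie factor of 1 means the graphic is faithful. The grouped bars come out at 0 because the difference is erased entirely, while the truncated baseline alone would exaggerate the difference by a factor of roughly 2.5.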
Ultimately, what I think we have here belongs in a genre of its own, perhaps “popcorn infographics”. At the time of writing the one review on amazon.co.uk reads “Bought this for my 14 yr old - absolutely loves it and showed friends who were also suitably impressed. Thank you” which says a lot, and not all negative. Perhaps there is room for popcorn infographics in this world or perhaps it’s just junk.
Aleksey Nozdryn-Plotnicki is an analyst/consultant and data visualisation blogger at ThinkDataVis.com. He is @alekseynp on Twitter.
Reader Steve S. sent in this article that displays nominations for the "Information is Beautiful" award (link). I see "beauty" in many of these charts but no "information". Several of these charts have appeared on our blog before.
Let's use the Trifecta checkup on these charts. (More about the Trifecta checkup here.)
The topic of this chart is both tangible and interesting. As someone who loves books, I do want to know what genres of books typically win awards.
However, both the data collection and graphical design make no sense.
The data collection problem presents a huge challenge and it's easy to get wrong. The problem is how narrow a theme should be. If it's too narrow, you can imagine every book has its own set of themes. If it's too wide, each theme maps to lots of books. The challenge is how to select the themes such that they have similar "widths". For example, "death" is a very wide theme and lots of books contain it, as indicated by the black lines. "Nanny trust issues" is a very narrow theme, and only one of those books deals with this theme. When there is such a theme, is its lack of popularity due to its narrow definition or due to writers not being interested in it?
The caption of this chart said "Cover stars: Charting 50 years up until 2010, this graphic shows The Beatles to be the most covered act in living memory." If that is the message, a much simpler chart would work a lot better.
Since the height of the chart indicates the number of covers sold in that year, the real information being shown is the boom and bust cycles of the worldwide economy. So, a lot more records were sold in 2005, and then the market tanked in 2008, for example.
That's why the data analyst should think twice before plotting raw data. Most data like these should be adjusted. In this case, you could either compare artists against one another within each year (by using proportions) or apply a seasonal and trend adjustment. I also don't see the point of highlighting year-to-year fluctuations. Nor do I understand why only in certain years is the top-rated cover identified by name and laurel wreath.
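The proportions option is simple to carry out: divide each artist's count by that year's total. A minimal sketch follows; the artist names and counts are hypothetical placeholders, not the chart's data.

```python
# Hypothetical counts of covers per artist per year (NOT real data).
# The second year has half the volume but the same mix of artists.
covers = {
    2005: {"Artist A": 60, "Artist B": 30, "Artist C": 10},
    2008: {"Artist A": 30, "Artist B": 15, "Artist C": 5},
}


def yearly_shares(counts_by_year: dict) -> dict:
    """Convert raw counts to each artist's share of that year's total."""
    shares = {}
    for year, counts in counts_by_year.items():
        total = sum(counts.values())
        shares[year] = {artist: n / total for artist, n in counts.items()}
    return shares


shares = yearly_shares(covers)
for year, by_artist in shares.items():
    line = ", ".join(f"{artist} {share:.0%}" for artist, share in by_artist.items())
    print(year, line)
```

In this made-up example the raw counts halve between the two years, yet every artist's share is unchanged, so a chart of shares would show no change in relative standing where a chart of raw counts shows a crash.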
I talked about this stream graph of 311 calls back in 2010. See the post here.
I featured this set of infographics/pie charts back in 2011. See the post here.
This chart is a variant of the one from the New York Times that I discussed here. I like the proper orientation on the NYT's version. The color scheme here may be slightly more attractive.