## The best way to handle two dimensions may be to not use two dimensions

##### Mar 06, 2011

Guess what the designer at Nielsen wanted to tell you with this chart:

Reader Steven S. couldn't figure it out, and chances are neither can you.

• The smartphone (OS) market is dominated by three top players (Android, Apple and Blackberry) each having roughly 30% share, while others split the remaining 10%.
• The age-group mix for each competitor is similar (or is it?)

Maybe those are the messages; if so, there is no need to present a bivariate plot (the so-called "mosaic" plot, or in consulting circles, the Marimekko). Having two charts carrying one message each would accomplish the job cleanly.

***

Trying to do too much in one chart is a disease; witness the side effects.

The two rightmost columns contain rectangles that appear to be of different sizes, and yet the data labels claim each piece represents 1%, or in some cases "< 1%". The simultaneous manipulation of both the height and the width plays mind tricks.

Also, while one would ordinarily applaud dropping the decimals from a chart like this, doing so here creates an ugly problem: the five pieces labeled 1% (in the column shown on the left) have the same width but clearly different heights!

What about this section of the plot shown on the left? Does the smaller green box look like it's less than 1/3 the size of the longer green box? This chart is clearly not self-sufficient, and as such one might prefer a simple data table.

The downfall of the mosaic plot is that it gives the illusion of having two dimensions, but it is only an illusion: in fact, the chart is dominated by one dimension, as all proportions are relative to the grand total.

For instance, the chart says that 6% of all smartphone users are between the ages of 18 and 24 AND use an Android phone. It also tells us that 2% of all smartphone users are between 35 and 44 AND use a Palm phone. Those are not two numbers anyone would want to compare; hardly any practical question requires comparing them.

Sometimes, the best way to handle two dimensions is not to use two dimensions.

***

The original article notes that "Of the three most popular smartphone operating systems, Android seems to attract more young consumers." In the chart shown below, we assume that the business question is the relative popularity of phone operating systems across age groups.

The right metric for comparison is the market share of each OS within an age group.

For example, tracing the black line labeled "Android", this chart tells us that Android has 37% of the 18-24 market while it has about 20% of the 65 and up market.
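The arithmetic behind this metric is just a change of denominator: each cell is divided by its own age group's total rather than by the whole market. Here is a minimal Python sketch, assuming the reverse-engineered guesstimates posted in the comment thread (they are read off the chart, not Nielsen's actual figures). The names `JOINT`, `OS_NAMES` and `share_within_age` are made up for illustration.

```python
# Share of ALL smartphone users by (age group, OS).
# These are guesstimates read off the mosaic chart, not official figures.
OS_NAMES = ["Android", "Blackberry", "iPhone", "Palm", "Symbian", "Windows"]
JOINT = {
    "18-24": [0.06, 0.04, 0.04, 0.01, 0.002, 0.01],
    "25-34": [0.08, 0.07, 0.08, 0.02, 0.01, 0.015],
    "35-44": [0.06, 0.06, 0.06, 0.02, 0.008, 0.015],
    "45-54": [0.04, 0.05, 0.04, 0.02, 0.01, 0.01],
    "55-64": [0.03, 0.04, 0.04, 0.02, 0.006, 0.01],
    "65-up": [0.01, 0.01, 0.02, 0.01, 0.003, 0.001],
}


def share_within_age(joint, os_names):
    """Re-express each cell as a share of its own age group, not of the whole market."""
    result = {}
    for age, cells in joint.items():
        group_total = sum(cells)  # the new denominator: one age group only
        result[age] = {name: cell / group_total for name, cell in zip(os_names, cells)}
    return result


shares = share_within_age(JOINT, OS_NAMES)
for age, by_os in shares.items():
    print(age, f"Android {by_os['Android']:.0%}", f"iPhone {by_os['iPhone']:.0%}")
```

With these guesstimates, Android comes out at about 37% of the 18-24 group and a little under 20% of the 65-and-up group, in line with the line chart.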

Android has an overall market share of about 30%, and that average obscures a youth bias: its share declines roughly linearly with age.

On the other hand, the iPhone (green line) also has an average market share of about 30%, but its profile is pretty flat across all age groups except 65 and up, where it has considerable strength.

Further, the gap between Android and iPhone among older users actually opens up at 55 years and up. In the 55-64 age group, the iPhone holds a market share that is similar to its overall average while Android performs quite a bit worse than its average. We note that Palm OS has some strength in the older age groups as well, while the Blackberry also significantly underperforms in 65 and over.

Why aren't all these insights visible in the mosaic chart? It is all because the chosen denominator, the entire market (as opposed to each age group), makes a lot of segments very small, and the differences between small segments become invisible when they are placed beside much larger segments.

Now, the reconstituted chart gives no information about the relative sizes of the age groups. The market for the older groups is quite a bit smaller than that for the younger groups. This information should be provided in a separate chart, or as a little histogram tucked under the age-group axis.
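The age-group sizes are simply the column totals of the same table. As a rough sketch, again assuming the guesstimates from the comment thread (which do not sum to exactly 100%, so they are renormalized here), the little histogram could be mocked up in text like this. The helper name `age_group_sizes` is hypothetical.

```python
# Share of ALL smartphone users by age group, one value per OS
# (Android, Blackberry, iPhone, Palm, Symbian, Windows).
# Guesstimates read off the mosaic chart, not official figures.
JOINT = {
    "18-24": [0.06, 0.04, 0.04, 0.01, 0.002, 0.01],
    "25-34": [0.08, 0.07, 0.08, 0.02, 0.01, 0.015],
    "35-44": [0.06, 0.06, 0.06, 0.02, 0.008, 0.015],
    "45-54": [0.04, 0.05, 0.04, 0.02, 0.01, 0.01],
    "55-64": [0.03, 0.04, 0.04, 0.02, 0.006, 0.01],
    "65-up": [0.01, 0.01, 0.02, 0.01, 0.003, 0.001],
}


def age_group_sizes(joint):
    """Sum each age group's cells and renormalize so the sizes add up to 1."""
    raw = {age: sum(cells) for age, cells in joint.items()}
    grand_total = sum(raw.values())  # rounding in the guesstimates: not exactly 1
    return {age: size / grand_total for age, size in raw.items()}


sizes = age_group_sizes(JOINT)
for age, size in sizes.items():
    bar = "#" * round(size * 100)  # one '#' per percentage point
    print(f"{age:>5} {bar} {size:.0%}")
```

On these numbers the 25-34 group is the largest and the 65-and-up group is by far the smallest, which is the context the line chart alone cannot show.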


Excellent points. Breaking the original mosaic plot into a bar chart AND a line plot makes the most sense. You can do something similar to improve a Bloomberg chart that shows participation in social media for various age groups:
http://blogs.sas.com/iml/index.php?/archives/55-How-Does-Participation-in-Social-Media-Vary-with-Age.html

Great post!

Can you please provide the data you reverse engineered from the graphics. I am curious to see what happens when you rotate the mosaic or sort the categories by size and/or proportion.

Martin: some of the data are just guesstimates.

AgeGroup,PhoneOS,PercentOfTotal
18-24,Android,0.06
18-24,Blackberry,0.04
18-24,iPhone,0.04
18-24,Palm,0.01
18-24,Symbian,0.002
18-24,Windows,0.01
25-34,Android,0.08
25-34,Blackberry,0.07
25-34,iPhone,0.08
25-34,Palm,0.02
25-34,Symbian,0.01
25-34,Windows,0.015
35-44,Android,0.06
35-44,Blackberry,0.06
35-44,iPhone,0.06
35-44,Palm,0.02
35-44,Symbian,0.008
35-44,Windows,0.015
45-54,Android,0.04
45-54,Blackberry,0.05
45-54,iPhone,0.04
45-54,Palm,0.02
45-54,Symbian,0.01
45-54,Windows,0.01
55-64,Android,0.03
55-64,Blackberry,0.04
55-64,iPhone,0.04
55-64,Palm,0.02
55-64,Symbian,0.006
55-64,Windows,0.01
65-up,Android,0.01
65-up,Blackberry,0.01
65-up,iPhone,0.02
65-up,Palm,0.01
65-up,Symbian,0.003
65-up,Windows,0.001

For your consideration, I just spotted a chart here: http://online.wsj.com/article/SB10001424052748703529004576160764037920274.html?mod=WSJ_Tech_LEADTop

Percentages add up to 200%, as a tiny footnote explains. OK, maybe there are technical issues that justify that, but for a general audience, couldn't they have used the usual concept of per*cent*ages out of *one hundred*? When I see 85% for US\$, "an overwhelming majority" comes to my mind, not "barely half".

Thanks Kaiser,

your guesstimates work fine for this purpose. Take a look at the rotated mosaic plot. I think it can make the same point as your suggested improvement.

You are perfectly right!
