Guess what the designer at Nielsen wanted to tell you with this chart:
- The smartphone (OS) market is dominated by three top players (Android, Apple and Blackberry) each having roughly 30% share, while others split the remaining 10%.
- The age-group mix for each competitor is similar (or is it?)
Maybe those are the messages; if so, there is no need to present a bivariate plot (the so-called "mosaic" plot, or in consulting circles, the Marimekko). Having two charts carrying one message each would accomplish the job cleanly.
The two columns, counting from the right, contain rectangles that appear to be of different sizes, and yet the data labels claim each piece represents 1%, and in some cases "< 1%". The simultaneous manipulation of both the height and the width plays mind tricks.
Also, while one would ordinarily applaud the dropping of decimals from a chart like this, doing so actually creates the ugly problem that the five pieces of 1% (on the left column shown here) have the same width but clearly varying heights!
What about this section of the plot shown on the left? Does the smaller green box look like it's less than 1/3 the size of the longer green box? This chart is clearly not self-sufficient, and as such one might prefer a simple data table.
The downfall of the mosaic plot is that it gives the illusion of having two dimensions but only an illusion: in fact, the chart is dominated by one dimension, as all proportions are relative to the grand total.
For instance, the chart says that 6% of all smartphone users are between the ages of 18 and 24 AND use an Android phone. It also tells us that 2% of all smartphone users are between 35 and 44 AND use a Palm phone. Those are not two numbers anyone would desire to compare. There are hardly any practical questions that require comparing them.
Sometimes, the best way to handle two dimensions is not to use two dimensions.
The original article notes that "Of the three most popular smartphone operating systems, Android seems to attract more young consumers." In the chart shown below, we assume that the business question is the relative popularity of phone operating systems across age groups.
The right metric for comparison is the market share of each OS within an age group.
For example, tracing the black line labeled "Android", this chart tells us that Android has 37% of the 18-24 market while it has about 20% of the 65 and up market.
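To make the change of denominator concrete, here is a minimal sketch in Python. The function name `within_group_share` and all the numbers are invented for illustration; they are not the Nielsen data, and only two age groups are shown. Each cell is divided by its age-group total rather than by the grand total.

```python
def within_group_share(joint):
    """Convert shares of the grand total into shares within each age group.

    joint[os_name][age_group] is the percentage of ALL smartphone users
    who are in that age group AND use that OS (the mosaic-plot numbers).
    """
    # Size of each age group = sum of its cells across all operating systems.
    age_totals = {}
    for by_age in joint.values():
        for age, pct in by_age.items():
            age_totals[age] = age_totals.get(age, 0.0) + pct

    # Divide each cell by its age-group total instead of the grand total.
    return {
        os_name: {age: pct / age_totals[age] for age, pct in by_age.items()}
        for os_name, by_age in joint.items()
    }


# Hypothetical numbers for illustration only; NOT the Nielsen data.
joint = {
    "Android": {"18-24": 6.0, "65+": 1.0},
    "iPhone": {"18-24": 5.0, "65+": 2.0},
    "Other": {"18-24": 5.0, "65+": 2.0},
}

shares = within_group_share(joint)
for os_name, by_age in shares.items():
    print(os_name, {age: f"{s:.0%}" for age, s in by_age.items()})
```

With these made-up inputs, a cell that is only 1% of the grand total turns into a sizable share of its own (small) age group, which is exactly the kind of difference the line chart can show.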
Android has an overall market share of about 30%, and that average obscures a youth bias: its share falls off roughly linearly with age.
On the other hand, the iPhone (green line) also has an average market share of about 30%, but its profile is pretty flat across all age groups except 65 and up, where it has considerable strength.
Further, the gap between Android and iPhone among older users actually opens up at 55 years and up. In the 55-64 age group, the iPhone holds a market share that is similar to its overall average while Android performs quite a bit worse than its average. We note that Palm OS has some strength in the older age groups as well, while Blackberry also significantly underperforms in 65 and over.
Why aren't all these insights visible in the mosaic chart? It is all because the chosen denominator, the entire market (as opposed to each age group), makes a lot of segments very small, and then the differences between small segments become invisible when placed beside much larger segments.
Now, the reconstituted chart gives no information about the relative sizes of the age groups. The market size for the older groups is quite a bit smaller than the younger groups. This information should be provided in a separate chart, or as a little histogram tucked under the age-group axis.
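The age-group sizes that the line chart drops are just the column totals of the mosaic numbers, so the little histogram is cheap to produce. Below is a rough sketch, again with invented numbers rather than the Nielsen data; `age_group_sizes` and `text_histogram` are hypothetical helper names, and the text bars merely stand in for a real chart.

```python
def age_group_sizes(joint):
    """Sum the grand-total shares over operating systems, per age group."""
    sizes = {}
    for by_age in joint.values():
        for age, pct in by_age.items():
            sizes[age] = sizes.get(age, 0.0) + pct
    return sizes


def text_histogram(sizes):
    """Render one bar per age group, one '#' per percentage point."""
    return [f"{age:>6} {'#' * round(pct)} {pct:.0f}%" for age, pct in sizes.items()]


# Hypothetical numbers for illustration only; NOT the Nielsen data.
joint = {
    "Android": {"18-24": 6.0, "65+": 1.0},
    "iPhone": {"18-24": 5.0, "65+": 2.0},
    "Other": {"18-24": 5.0, "65+": 2.0},
}

sizes = age_group_sizes(joint)
lines = text_histogram(sizes)
print("\n".join(lines))
```

Placed under the age-group axis, bars like these remind the reader that a large share in a small age group can still be a small slice of the whole market.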