Nielsen's cross-platform crossing diagram crosses up readers

My friend Augustine F., who's a data-savvy guy, couldn't figure out what's going on with this chart in Nielsen's cross-platform report.


It's a case of a Bumps chart done poorly.

The reader must first read the opening pages of the report to get their bearings. The two charts are supposed to investigate the correlation between streaming video and regular TV. What causes the confusion is that the populations being analyzed differ between the two charts.

In the left chart, they exclude anyone who does not watch streaming video (35% of the sample), and then divide those who watch streaming video into five equal-sized segments based on how much they watch. Then, they look at how much regular TV each segment watches on average.

In the right chart, they exclude anyone who does not watch regular TV (just 0.5% of the sample), and then divide those who watch regular TV into five equal-sized segments based on how much they watch. Then, they look at how much online streaming video each segment watches on average.
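
The recipe behind both charts can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `quintile_averages`, the record layout, and the toy numbers are hypothetical, and are not Nielsen's data or method.

```python
from statistics import mean

def quintile_averages(people, split_on, report):
    """Drop non-users of `split_on`, cut the rest into five equal-sized
    groups by `split_on` (heaviest first), and average `report` per group."""
    users = [p for p in people if p[split_on] > 0]        # the exclusion step
    users.sort(key=lambda p: p[split_on], reverse=True)   # heaviest users first
    size = len(users) // 5   # sketch assumes the count is divisible by 5
    groups = [users[i * size:(i + 1) * size] for i in range(5)]
    return [mean(p[report] for p in g) for g in groups]

# Toy daily minutes (hypothetical, NOT from the report):
# five streamers with similar TV habits, five non-streamers at the extremes.
people = [
    {"stream": 50, "tv": 210}, {"stream": 40, "tv": 220},
    {"stream": 30, "tv": 230}, {"stream": 20, "tv": 240},
    {"stream": 10, "tv": 250},
    {"stream": 0, "tv": 600}, {"stream": 0, "tv": 500},
    {"stream": 0, "tv": 400}, {"stream": 0, "tv": 30},
    {"stream": 0, "tv": 20},
]

left_tv = quintile_averages(people, split_on="stream", report="tv")
right_stream = quintile_averages(people, split_on="tv", report="stream")
right_tv = quintile_averages(people, split_on="tv", report="tv")

print(left_tv)       # TV averages when grouping streamers only
print(right_tv)      # TV averages when grouping all TV watchers
print(right_stream)  # streaming averages for the TV-based groups
```

In this toy sample, the TV averages on the "left" side stay within a narrow band, while on the "right" side they spread widely, simply because the two calls keep different subsets of the same people.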


What crosses us up is the relative scales. The scale for regular TV viewing is tightly clustered between 212 and 247 daily minutes on the left chart but has a wide range between 24 and 522 on the right chart. The impression given by the designer is that the same population (18-34 year olds) is divided into five groups (quintiles) for each chart, albeit using different criteria. It just doesn't make sense that the group averages do not match.

The reason for this mismatch is the hugely divergent rates of exclusion as described above. What the chart seems to be saying is that the 65% who use streaming video have very similar TV viewing behavior (about 220 daily minutes). In other words, we surmise that most of those people on the left chart map to groups 2 and 3 on the right chart.

Who are the people in groups 1, 4 and 5 on the right chart? It appears that they are the 35% who don't watch streaming video. Thus, the real insight of this chart is that there are two types of people who don't watch streaming video: those who watch very little regular TV at all, and those who watch twice the average amount of regular TV.


Here's another puzzle: Nielsen claims that high streaming = low TV and low streaming = high TV. Is it really true that high streaming = low TV? Take the segment of highest streaming (#1 on the left chart). This group, which is 13% of the survey population, accounts for 83% of the streaming minutes -- almost 71,000 out of 86,000 minutes. Now look at the right chart. It turns out that the streaming minutes are quite evenly distributed among those TV-based quintiles, ranging from 15,000 minutes to 23,000 minutes each.

So, it is impossible to fit all of the top streaming quintile into any one TV quintile - they have too many streaming minutes. In fact, the top streaming quintile must be quite spread out among the TV quintiles since each of the TV quintiles is 1.5 times the size of a streaming quintile!
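
The arithmetic behind this argument can be checked in a few lines of Python. The figures are the ones quoted above, rounded as in the text; the variable names are only illustrative.

```python
# Shares of the sample that survive each chart's exclusion rule (from the text).
streaming_share = 1 - 0.35    # 35% do not watch streaming video
tv_share = 1 - 0.005          # 0.5% do not watch regular TV

streaming_quintile = streaming_share / 5          # about 13% of the sample
tv_quintile = tv_share / 5                        # about 19.9% of the sample
size_ratio = tv_quintile / streaming_quintile     # about 1.5

# Streaming minutes: top streaming quintile vs. all streaming quintiles.
top_minutes, total_minutes = 71_000, 86_000       # approximate, as quoted
top_share = top_minutes / total_minutes           # about 0.83

# The largest TV-based quintile holds about 23,000 streaming minutes,
# so the top streaming quintile cannot sit inside any single TV quintile.
max_tv_quintile_minutes = 23_000
fits_in_one_tv_quintile = top_minutes <= max_tv_quintile_minutes

print(f"{streaming_quintile:.1%} {tv_quintile:.1%} {size_ratio:.2f} {top_share:.0%}")
print(fits_in_one_tv_quintile)
```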

So, we must conclude that customers who stream a lot include both fervent TV fans and those who watch little TV.


In a return-on-effort analysis, this is a high-effort, low-reward chart.


Peek into beauty

This graphic feature is the best from the NYT team yet. I particularly love the two columns on the right, which allow us to see regional differences. For example, the movie "New in Town" was much more popular in Minneapolis than in any of the other metropolitan areas, and was particularly unwatched in New York. Also, note the choice of sorting allowed on the top right.

Click here and enjoy!


Reference: "A Peek into Netflix Queues", New York Times, Jan 10 2009.


Here are some of my favorite links from other places:

A spatial journey illustrating a very long scale, created by the Genetic Science Learning Center (here)

Long scales are very difficult to deal with in charts; I have never been satisfied with log scales since they address the designer's challenge of trying to fit everything onto one page, but do not deal with the reader's need to compare the elements accurately

Not sure how this helps but perhaps some of you will figure it out

Tommi left a comment about this conceptual chart by xkcd, which has been making the rounds.  Fits into our Light Entertainment category.

Says there is no optimal chart type.  A type that works very well for one data set may get hopelessly cluttered for another, similar data set.

From fellow bloggers (especially Jorge), a whole series of views of the U.S. unemployment figures by state over time.  Alternatives that are much more interesting to look at than the typical line chart. Jorge even found something in Excel that looks good.

Playful and exploratory

I share reader Bernard L.'s enthusiasm for this very imaginative chart, courtesy of the graphics people at NYT.  The chart captures the ebb and flow of weekly movie receipts over the last two decades.
The details that particularly interest me include:

  • The addition of area colors (on top of lines) serves to highlight box office successes; this really helps readers sort out the massive amount of data
  • Nicely spaced text (and dots) does not interfere with our reading of the chart
  • The hiding of text for less important films, plus taking advantage of interactivity to show their titles if the reader mouses over the respective areas

All of the above indicate a keen sense of foreground versus background.  Besides, the authors had the good sense to speak of inflation-adjusted box office sales; I'm tired of the movie industry proclaiming higher sales each year when ticket prices are rising, and the population is growing.

This is another chart where more data do not easily translate into better communication (see my guest post at Flowing Data).  While I like the playful nature of the interactive chart, it is left to the reader to discover the information buried in the data, such as the assertion in the header that Oscar-winning films typically take time to attain box-office success while many blockbusters do not Oscars make.

In this presentation, it is challenging to compare the total receipts of one film versus another (this requires comparing oddly shaped, partially obscured areas).  It is also hard to compare across years since the data are spread out over a lot of space.

There may really be two types of graphics: the one like the example here which is a dictionary and designed for exploration; and the other kind where the designer has selected a subset of the data to make a specific point.

Reference: "The ebb and flow of movies", New York Times, Feb 23 2008.

Oscar diseconomy

Business Week dissected the beneficiaries of the Oscar show as shown on the right.  Although this doesn't work well as a data graphic, if thought of as a variant on the data table, it is more engaging for readers.

Let's have some fun with the Oscar statue.  First, putting a bar chart next to the statue confirms that the height of the segments (rather than the area) is in proportion to the dollar values (below left).

Tufte, Chambers and others have shown that our eyes react to the areas, not heights.  So next, I estimated the areas but stretched them out into segments of equal width.  Squeezing the entire column back down to the height of the statue, the following chart (below right) puts perceived proportions next to the true proportions, displaying visually the extent of distortion. 
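
The rescaling described above can be sketched as follows. The segment areas and dollar values are made-up placeholders (the estimates from the original graphic are not reproduced here), and `rescale_to_height` is a hypothetical helper name.

```python
def rescale_to_height(values, total_height):
    """Turn raw values into stacked segment heights that sum to total_height."""
    total = sum(values)
    return [total_height * v / total for v in values]

# Placeholder inputs, NOT the Business Week figures.
dollar_values = [10, 30, 60]   # what each segment is meant to represent
segment_areas = [5, 20, 75]    # estimated area of each segment of the statue

statue_height = 100

# Heights proportional to the dollar values: the "true" column.
true_heights = rescale_to_height(dollar_values, statue_height)

# Areas stretched to equal width, then squeezed to the statue's height:
# with a common width, each height is proportional to its area.
perceived_heights = rescale_to_height(segment_areas, statue_height)

# Positive means the segment looks bigger than its dollar value warrants.
distortion = [p - t for p, t in zip(perceived_heights, true_heights)]
print(distortion)
```

Placing `true_heights` and `perceived_heights` side by side as two stacked columns gives the kind of comparison described in the text.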


Reference: "News you need to know", Business Week, Jan 28 2008.

One for the cutting room floor

This chart comparing U.S. and China garment markets really calls for mixed metaphors: it should've been left on the cutting room floor.
The two intended messages are simple: the U.S. market is much larger but the China market is growing much faster.  But the chart manages to confuse us all the same.

First, the China market is tracked for 14.5 years versus 3.5 for the U.S. market, without explanation.  By stacking these bars together, the chart creates a false impression of exponential growth.

Then, the data from 2002-5 are enclosed in gray boxes of arbitrary heights, interfering with our ability to read the trends.  While I don't like gridlines on bar charts in general, their appearance in these redundant gray boxes really beggars explanation.

Even the vertical scale needs re-editing.  Why can't they halve the line segments and place the numbers next to the lines?  The omission of the zero-dollar line may mislead some into thinking that the $10 billion line represents zero.

Last, but not least, the black, half-baked bars (representing the first half of 2005) impair our comprehension.  Their presence adds nothing to the graph at all.  Indeed, if the reader is to pick up the fast growth of the Chinese market, seeing the plunging and darkly accented last bar surely doesn't help.  Here, the chart designer has two choices: draw the projected full-year 2005 data or omit 2005 altogether.

I suspect that the bar chart format was selected partly in order to accommodate these half-baked bars.  Otherwise, a line chart would work nicely in this context.  As an alternative, the following conceptual graph (since I don't have data) brings out the two messages much more clearly.

The gap between the two sets of points illustrates the relative sizes of the two markets, while the steeper line for China shows its much faster growth.


Reference: "Chinese Apparel Makers Seek the Creative Work", New York Times, Sept 1 2005.