A reader submits a Type DV analysis

Darin Myers at PGi was kind enough to send over an analysis of a chart using the Trifecta Checkup framework. I'm reproducing the critique in full, with a comment at the end.

***

Kpcbtrends96

At first glance this looks like a valid question, with good data, presented poorly (Type V). Checking the fine print (glad it’s included), the data falls apart.

QUESTION

It’s a good question…What device are we using the most? With so much digital entertainment being published every day, it pays to know what your audience is using to access your content. The problem is this data doesn’t really answer that question conclusively.

DATA

This was based on survey data asking respondents, “Roughly how long did you spend yesterday…watching television (not online) / using the internet on a laptop or PC / on a smartphone / on a tablet?” Survey respondents were limited to those who owned or had access to a TV and a smartphone and/or tablet.

  • What about feature phones?
  • Did they ask everyone on the same day, random days, or are some days overrepresented here?
  • This is self-reported, not tracked…who accurately remembers their screen time on each device a day later? I imagine the vast majority of answers were round numbers (30 minutes, 45 minutes or 2 hours). The chart shows to-the-minute precision that the respondents did not really provide.

In fact the Council for Research Excellence found that self-reported screen time does not correlate with actual screen time. “Some media tend to be over-reported whereas others tend to be under-reported – sometimes to an alarming extent.” -Mike Bloxham, director of insight and research for Ball State

VISUAL

The visual has the usual problems with stacked bar charts where it is easy to see the first bar and the total, but not to judge the other values. This may not be an issue based on the question, but the presentation is focusing on an individual piece of tech (smartphones), so the design should focus on smartphones. At the very least, smartphones should be the first column in the chart and it should be sorted by smartphone usage.

My implementation is simply to compare the smartphone usage to the usage of the next highest device. Overall 53% of the time people are using a smartphone compared to something else. I went back and forth on whether I should keep the Tablet category in the Key though it was not the first or second used device. In the end, I decided to keep it to parallel the source visual.

Myers_redokpcbtrend96a

Despite the data problems, I was really interested in seeing the breakdowns in each country by device, so I built the chart below with rank added (in bold). I also built some simple interaction to sort by column when you click the header [Ed: I did not attach the interactive Excel sheet that came with the submission]. As a final touch, I displayed the color corresponding to the highest usage as a box to the left of the country name. It’s easy to see that the vast majority of countries use smartphones the most.

Myers_redokpcbtrend96b

***

Hope you enjoyed Darin's analysis and revamp of the chart. The diagnosis is spot on. I like the second revision of the chart, especially for analysts who really want to know the exact numbers. The first redo has the benefit of greater simplicity, but it can be a tough sell to an audience, especially when using color to indicate the second most popular device while disassociating the color and the length of the bar.

The biggest problem in the original treatment is the misalignment of the data with the question being asked. In addition to the points made by Darin, the glaring issue relates to the respondent population. The analysis only includes people who have at least a smartphone or a tablet. But many people in less developed countries do not have either device. In those countries, it is likely that TV screen time has been strongly underestimated. People who watch TV but do not own a smartphone or tablet are simply dropped from consideration.

For this same reason, the other footnoted comment claiming that the sampling frame accounts for ~70 percent of the global population is an irrelevance.
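To make the point about dropped respondents concrete, here is a minimal sketch with invented numbers. The group shares and viewing minutes below are assumptions for illustration only; none of them come from the survey. It shows how restricting the sample to device owners can pull the reported TV time below the population average.

```python
# Hypothetical country: every number here is made up for illustration.
# Each group maps to (share of TV viewers, average daily TV minutes).
groups = {
    "owns smartphone or tablet": (0.40, 90.0),
    "owns neither device": (0.60, 150.0),
}

def weighted_mean(rows):
    """Average TV minutes, weighting each group by its share."""
    rows = list(rows)
    total_share = sum(share for share, _ in rows)
    return sum(share * minutes for share, minutes in rows) / total_share

# Average over everyone who watches TV.
population_avg = weighted_mean(groups.values())

# Average over the people the survey actually kept (device owners only).
surveyed_avg = weighted_mean([groups["owns smartphone or tablet"]])

print(f"all TV viewers: {population_avg:.0f} min")
print(f"survey sample:  {surveyed_avg:.0f} min")
```

With these assumed inputs the survey sample reports 90 minutes while the full population averages 126. The gap widens as the share of non-owners grows, which is why the problem is worst in countries where the devices are least common.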


Two charts that fail self-sufficiency

My Twitter followers have been sending in several howlers.

Twitter (link) made a bunch of bold claims about its own influence by using the number of tweets about the Oscars as fodder. It also adopts the euphemism common to the digital marketing universe, the so-called "view", which, to its credit, it defines as "how many times tweets are displayed to users". Yes, you read that right, displaying is the same as viewing in this world - and Twitter is just a follower, not a trend setter, here.

For @dtellom, it is this bubble chart about the Ellen tweet that displeased him:

Twitter_ellenimpressions_0

In the meantime, @wilte found this unfortunate donut chart, created by PWC in the Netherlands.

PWCG_donut

Both designers basically appropriated a graphical form and deprived it of data. In one, the designer threw the concept of scale to the wind. In the other, the designer dumped the law of total probability. In either case, the fundamental rationale for the particular graphical form is sacrificed.

Both are examples that fail our self-sufficiency test. This test says if a visual display cannot be understood unless the entire data set is printed on the chart, then why create a visual display? In both charts, if you block out the numbers, you are left with nothing!

***

The PWC chart was submitted by @graphomate, who also submitted the following KPMG chart:

KPMG_donut

The complaint was the total adding up to 101%. I'm not really bothered by this as it is a rounding issue. That said, I like to "hide" such rounding issues. I have never understood why it is necessary to display the imperfection. Flip a coin and remove the decimals from one of the categories!
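For readers who prefer something more systematic than a coin flip, one common alternative is the largest remainder method. It is a swapped-in technique, not what the chart's designer did. The sketch below assumes the unrounded shares already sum to 100, and the three input values are hypothetical.

```python
import math

def round_to_100(shares):
    """Round percentages to whole numbers that sum to exactly 100.

    Largest remainder method: round every share down, then give the
    leftover points to the shares with the largest fractional parts.
    Assumes the unrounded shares sum to 100.
    """
    floors = [math.floor(s) for s in shares]
    leftover = 100 - sum(floors)
    # Indices ordered from largest fractional part to smallest.
    order = sorted(range(len(shares)),
                   key=lambda i: shares[i] - floors[i],
                   reverse=True)
    for i in order[:leftover]:
        floors[i] += 1
    return floors

raw = [40.6, 32.7, 26.7]            # hypothetical unrounded shares
naive = [round(x) for x in raw]     # each share rounded on its own
adjusted = round_to_100(raw)

print(sum(naive))     # 101, the "imperfection" on display
print(sum(adjusted))  # 100
```

Every adjusted value stays within one point of its unrounded share, so the labels remain honest while the total no longer distracts the reader.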


The incredibly expanding male

It's a mystery to me how there are always people who ignore certain rudimentary rules of graphing data. I'm talking about such clear guidelines as:

  • Bar charts encode data in the heights of the bars -- therefore:
  • You should start each bar at height zero, and
  • You should not vary the width of the bars (unless you are introducing another dimension), and
  • You should space the bars unevenly if your measurement times are unevenly spaced.

I mean, how is it that in the year 2013, the BBC shows viewers this? (tip from UK reader Clarke C.)

Bbc_humangrowth_sm

The chart is absurd on its face. Men did not double in height between 1871 and 1971. This chart was broadcast on the show "Breakfast", which apparently is the BBC UK version of Good Morning America.

I'd just use a line chart. The figurine construct is cute but too much trouble because you have to grow the width while growing the height. If you encode data in the area, then the height is no longer proportional to the real height.
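The trade-off can be worked out in a few lines. This is a minimal sketch, assuming the figurine keeps its proportions so that width grows with height; the two heights in the example are placeholders, not values read off the BBC chart.

```python
import math

def figurine_scaling(value, baseline):
    """Compare two ways of scaling a figurine that keeps its proportions.

    Returns the drawn height and drawn area, each relative to the
    baseline figure, under each encoding.
    """
    ratio = value / baseline
    return {
        # Height carries the data: area balloons by the square of the ratio.
        "height_encoded": {"height": ratio, "area": ratio ** 2},
        # Area carries the data: height grows only by the square root.
        "area_encoded": {"height": math.sqrt(ratio), "area": ratio},
    }

# Hypothetical average heights in cm, for illustration only.
result = figurine_scaling(value=178.0, baseline=170.0)
for encoding, dims in result.items():
    print(encoding, {k: round(v, 3) for k, v in dims.items()})
```

Under either encoding one of the two visual cues disagrees with the data, which is the argument for a plain line chart.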

Years ago, we featured something similar: how penguins evolved into humans (link). Curiously, also a gift from British media.


Leave good alone

In Cousin misfit, we looked at a problematic area chart in which the areas on the chart contain no useful information. The lines in a line chart should carry some meaning, and so too should areas in an area chart.

Wsj_samsung

The Wall Street Journal recently printed something that looked like a cross between a column chart, an area chart, and a flow chart.  Whatever it is, the areas of the pieces do not match the data.

The data describes how the TV market is split between the top 5 brands (comprising over 50% of the total unit sales) and all other brands -- basically the six numbers printed on the chart.

The graphical construct can be broken up into three parts: a stacked column (on the left), a stacked column with gaps (on the right), and some connecting areas (which are parallelograms).

The last two parts are unnecessary, and in particular, the parallelograms distort the total areas.

It can be baffling to the reader why the left column is shorter than the right column when both show the identical data.

At first, I thought this was some kind of flow chart illustrating the change in market share over time, but that's not the case.

What's wrong with the standard stacked column?


Reference: "Samsung Edges out TV Rivals", Wall Street Journal, Feb 17 2010.