Bloomberg made me digest these graphics slowly

Ask the experts to name the success metric of good data visualization, and you will receive a dozen answers. The field doesn't have an all-encompassing metric. A useful reference is Andrew Gelman and Antony Unwin (2012), in which they discussed the tradeoff between beautiful and informative, which derives from the familiar tension between art and science.

For a while now, I've been intrigued by metrics that measure "effort". Some years ago, I described the concept of a "return on effort" in this post. Such a metric can be constructed like the dominant financial metric of return on investment. The investment here is an investment of time, of attention. I strongly believe that if the consumer judges a data visualization to be compelling, engaging or well constructed, s/he will expend energy to devour it.

Imagine grub you discard after the first bite, compared to the delicious food experienced slowly, savoring every last bit.

I'm writing this post while enjoying the September issue of Bloomberg Businessweek, which focuses on the upcoming U.S. Presidential election. There are various graphics infused into the pages of the magazine. Many of these graphics operate at a level of complexity above what typically shows up in magazines, and yet I spent energy learning to understand them. This response, I believe, is what visual designers should aim for.


Today, I discuss one example of these graphics, shown on the right. You might be shocked by the throwback style of these graphics. They look like they arrived from decades ago!

Grayscale, simple forms, typewriter font, all caps. Have I gone crazy?

The article argues that a town like Ambridge in Beaver County, Pennsylvania may be pivotal in the November election. The set of graphics provides relevant data to understand this argument.

It's evidence that data visualization does not need whiz-bang modern wizardry to excel.

Let me focus on the boxy charts from the top of the column:


These charts solve a headache with voting margin data in the U.S. We have two dominant political parties, so in any given election, the vote-share data split into three buckets: Democratic, Republican, and a catch-all category that includes third parties, write-ins, and none of the above. The third category rarely exceeds 5 percent. A generic pie chart representation looks like this:


Stacked bars have this look:


Using my Trifecta Checkup framework (link), the top corner is about articulating the question. The primary issue here is the voting margin between the winner and the runner-up, who is the loser in what is typically a two-horse race. There are two sub-questions: the vote-share difference between the top two finishers, and the share of the vote effectively removed from the pot by the remaining candidates.

Now, take another look at the unusual chart form used by Bloomberg:


The catch-all vote share sits at the bottom while the two major parties split up the top section. This design demonstrates a keen understanding of the context. Consider the typical outcome, in which the top two finishers are from the two major parties. When answering the first sub-question, we can choose the raw vote shares, or the normalized vote shares. Normalizing shifts the base from all candidates to the top two candidates.

The Bloomberg chart addresses both scales. The normalized vote shares can be read directly by focusing only on the top section. In an even two-horse race, the top section is split in half - this holds true regardless of the size of the bottom section.
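To make the two scales concrete, here's a minimal sketch in Python. The vote shares are hypothetical numbers of my own, chosen only to illustrate the arithmetic:

```python
# Hypothetical vote shares: Democratic, Republican, and the catch-all
# category (third parties, write-ins, none of the above).
shares = {"DEM": 0.48, "REP": 0.47, "OTHER": 0.05}

# Raw vote share: the base is all candidates.
dem_raw = shares["DEM"]

# Normalized vote share: the base is the top two finishers only --
# this is what reading just the top section of the chart gives you.
top_two = shares["DEM"] + shares["REP"]
dem_normalized = shares["DEM"] / top_two

print(dem_raw, round(dem_normalized, 3))
```

An even two-horse race (say 47.5 / 47.5 / 5) normalizes to exactly 50/50, no matter how large the catch-all slice is.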

This is a simple chart that packs a punch.


Election visual 3: a strange, mash-up visualization

Continuing our review of FiveThirtyEight's election forecasting model visualization (link), I now look at their headline data visualization. (The previous posts in this series are here, and here.)


It's a set of 22 maps, each showing one election scenario, with one candidate winning. What chart form is this?

Small multiples may come to mind. A small-multiples chart is a grid in which every component graphic has the same form - same chart type, same color scheme, same scale, etc. The only variation from graphic to graphic is the data. The data are typically varied along a dimension of interest, for example, age groups, geographic regions, years. The following small-multiples chart, which I praised in the past (link), shows liquor consumption across the world.


Each component graphic changes according to the data specific to a country. When we scan across the grid, we draw conclusions about country-to-country variations. By convention, there are as many graphics as there are countries in the dataset. Sometimes, the designer includes only countries that are directly relevant to the chart's topic.


What is the variable FiveThirtyEight chose to vary from map to map? It's the scenario used in the election forecasting model.

This choice is unconventional. The 22 scenarios are a subset of the 40,000 scenarios from the simulation - we are left wondering how those 22 were chosen.

Returning to our question: what chart form is this?

Perhaps you're reminded of the dot plot from the previous post. On that dot plot, the designer summarized the results of 40,000 scenarios using 100 dots. Since Biden is the winner in 75 percent of all scenarios, the dot plot shows 75 blue dots (and 25 red).

The map is the new dot. The 75 blue dots become 16 blue maps (rounded down) while the 25 red dots become 6 red maps.
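The dots-to-maps conversion is simple arithmetic; here is a sketch of my own (not FiveThirtyEight's code):

```python
import math

TOTAL_MAPS = 22
biden_win_rate = 0.75  # Biden wins in 75 percent of simulated scenarios

blue_maps = math.floor(biden_win_rate * TOTAL_MAPS)  # 16.5, rounded down
red_maps = TOTAL_MAPS - blue_maps

print(blue_maps, red_maps)  # 16 6
```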

Is it a pictogram of maps? If we ignore the details on the maps, and focus on the counts of colors, then yes. It's just a bit challenging because of the hole in the middle, and the atypical number of maps.

As with the dot plot, the map details are a nice touch. They connect readers with the simulation model, which can feel very abstract.

Oddly, for someone familiar with probabilities, this presentation is quite confusing.

With 40,000 scenarios reduced to 22 maps, each map should represent about 1,818 scenarios. On the dot plot, each dot should represent 400 scenarios. This follows the rule for creating pictograms: each object in a pictogram - dot, map, figurine, etc. - should encode an equal amount of the data. For the 538 visualization, is it true that each of the six red maps represents 1,818 scenarios? This may be the case, but it's not likely.
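Under the equal-weight rule, the scenarios-per-object arithmetic looks like this (a sketch, assuming simple even division):

```python
TOTAL_SCENARIOS = 40_000

# Each object in a pictogram should encode an equal share of the data.
scenarios_per_map = TOTAL_SCENARIOS // 22    # the grid of 22 maps
scenarios_per_dot = TOTAL_SCENARIOS // 100   # the earlier 100-dot plot

print(scenarios_per_map, scenarios_per_dot)  # 1818 400
```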

Recall the dot plot where the most extreme red dot shows a scenario in which Trump wins 376 out of 538 electoral votes (margin = 214). Each dot should represent 400 scenarios. The visualization implies that there are 400 scenarios similar to the one on display. For the grid of maps, the following red map from the top left corner should, in theory, represent 1,818 similar scenarios. Could be, but I'm not sure.


Mathematically, each of the depicted scenarios, including the blowout win above, occurs with 1/40,000 chance in the simulation. However, one expects few scenarios that look like the extreme scenario, and ample scenarios that look like the median scenario.

So, the right way to read the 538 chart is to ignore the map details when reading the embedded pictogram, and then look at the small multiples of detailed maps bearing in mind that extreme scenarios are unique while median scenarios have many lookalikes.

(Come to think about it, the analogous situation in the liquor consumption chart is the relative population size of different countries. When comparing country to country, we tend to forget that the data apply to large numbers of people in populous countries, and small numbers in tiny countries.)


There's a small improvement that can be made to the detailed maps. As I compare one map to the next, I'm trying to pick out which states have flipped to change the vote margin. Conceptually, the number of states painted red should decrease as the winning margin decreases, and the states that shift colors should be the toss-up states.

So I'd draw the solid Republican (Democratic) states with a lighter shade, forming an easily identifiable bloc on all maps, while the toss-up states are shown with a heavier shade.


Here, I just added a darker shade to the states that disappear from the first red map to the second.

Super-informative ping-pong graphic

Via Twitter, Mike W. asked me to comment on this WSJ article about ping pong tables. According to the article, ping pong table sales track venture-capital deal flow:


This chart is super-informative. I learned a lot from it, including:

  • Very few VC-funded startups play ping pong, since the highlighted reference lines show 1000 deals and only 150 tables (!)
  • The one San Jose store interviewed for the article is the epicenter of ping-pong table sales, therefore they can use it as a proxy for all stores and all parts of the country
  • The San Jose store only does business with VC startups, which is why they attribute all ping-pong tables sold to these companies
  • Startups purchase ping-pong tables in the same quarter as their VC deals, which is why they focus only on within-quarter comparisons
  • Silicon Valley startups only source their office equipment from Silicon Valley retailers
  • VC deal flow has no seasonality
  • Ping-pong table sales have no seasonality either
  • It is possible to predict the past (VC deals made) by gathering data about the future (ping-pong tables sold)

Further, the chart proves that one can draw conclusions from a single observation. Here is what the same chart looks like after taking out the 2016 Q1 data point:


This revised chart is also quite informative. I learned:

  • At the same level of ping-pong-table sales (roughly 150 tables), the number of VC deals ranged from 920 to 1020, about one-third of the vertical range shown in the original chart
  • At the same level of VC deals (roughly 1000 deals), the number of ping-pong tables sold ranged from 150 to 230, about half of the horizontal range of the original chart
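The single-observation point can also be made numerically. With made-up quarterly data (mine, not the WSJ's), one extreme quarter flips the sign of the correlation:

```python
# Hypothetical quarterly data: ping-pong tables sold vs. VC deals.
# The final pair plays the role of the one extreme observation.
tables = [150, 160, 155, 230, 170, 57]
deals = [1020, 980, 1000, 920, 1010, 780]

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r_with_outlier = pearson(tables, deals)               # strongly positive
r_without_outlier = pearson(tables[:-1], deals[:-1])  # negative!
```

Drop the one extreme point and the apparent relationship between tables and deals does not just weaken, it reverses.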

The many quotes in the WSJ article also tell us that people in Silicon Valley are no more data-driven than people in other parts of the country.

Reading between the gridlines

Reader Jamie H. pointed me to the following chart in the Guardian (link), which originated from Spotify.


This chart is likely inspired by the Arctic ice cover chart discussed here last year (link):


Spotify calls its chart "the Coolness Spiral of Death" while the other one is called "Arctic Death Spiral".

The spiral chart has many problems, some of which I discussed in the post from last year. Just take a look at the headline, and then the black dotted spiral. Does the shape evoke the idea of rapid evolution, followed by maturation? Or try to figure out the amount of evolution between ages 18 and 30.


Instead of the V corner of the Trifecta, I'd like to focus on the D corner today. When I look at charts, I'm always imagining the data behind the chart. Here are some questions to ponder:

  • Given that Spotify was founded in 2006 (not quite 10 years ago), how are they able to discern someone's music taste from 14 through 48?
  • The answer to the above question is they don't have a longitudinal view of anyone's music taste. They are comparing today's 14-year-old kid with today's 48-year-old adult. Under what assumptions would such an analysis yield the same outcome as a proper analysis that tracks the same people over time?
  • If the phenomenon under study follows a predictable trend, there will be little difference between the two ways of looking at the data. For example, teeth in the average baby follow a certain sequence of emergence, first incisors at six months, and first molars at 14 months (according to Wikipedia). Observing John's teething at six months and David's at 14 months won't yield much difference from looking at John at six then 14 months. Does music taste evolve like human growth?
  • Unfortunately, no. Imagine that a new genre of music suddenly erupts and becomes popular among every generation of listeners. This causes the Spotify curve to shift towards the origin at all ages. However, if you take someone who is currently 30 years old, the emergence of the new genre should affect his profile at age 30 but not anytime before. In fact, the new music creates a sharp shift at a different location of each person's taste profile, depending on one's age!
  • Let's re-interpret the chart, and accept that each spoke in the wheel concerns a different cohort of people. So we are looking at generational differences. Is the Spotify audience representative of music listeners? Particularly, is each Spotify cohort representative of all listeners of that age?
  • I find it unlikely since Spotify has that "cool" factor. It is probably more representative for younger age groups. Among older customers, there should be some bias. How does this affect the interpretation of the taste profile?
  • If we find that one cohort differs from another cohort, it is important to establish that the gap is a generational difference and not due to the older age group being biased (self-selected) in some way.



World Bank fails to lead the way in #dataviz

Matthew Yglesias, writing for Vox, cited the following chart from a World Bank project:


His comment was: "We can see that while China has overtaken Germany and Japan to become the world's second-largest economy (i.e., total area of the rectangle) its citizens are nowhere near being as rich as those of those countries or even Mexico."

Yes, the chart encodes the size of the economy in a rectangular area, with one side being the per-capita GDP and the other being the population. I am not sure about the "we can see". I am not confident that the short and wide rectangle for China is larger than the thin and tall ones for Japan and for Germany. Perhaps Matthew is relying on knowledge in his head, rather than knowledge on the chart, to come to this conclusion.

This is the trouble with rectangular area charts: they have a nerdy appeal since side x side = area but as a communications device, they fail.
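The encoding itself is easy to state in code. With rough, illustrative figures of my own (approximately the right order of magnitude, not the World Bank's data):

```python
# (GDP per capita in USD, population in millions) -- approximate,
# illustrative figures only.
countries = {
    "China":   (7_000, 1_360),
    "Japan":   (38_000, 127),
    "Germany": (45_000, 81),
}

# Rectangle area = height x width = per-capita GDP x population
# = total GDP (here in millions of USD).
total_gdp = {name: pc * pop for name, (pc, pop) in countries.items()}

# China's short, wide rectangle covers roughly twice the area of
# Japan's tall, thin one -- a fact the eye cannot verify from the chart.
```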

Here are some problems with the chart:

  • it's difficult to compare rectangular areas
  • the columns can only be sorted in one way (I'd have chosen to order them by population)
  • labeling is inelegant
  • colors are necessitated by the chart type not the data
  • the cumulative horizontal axis makes no sense unless the vertical axis is cumulative GDP (or cumulative GDP per capita)

Matthew should also have mentioned PPP (Purchasing Power Parity). If GDP is used as a measure of "wellbeing", then costs of living should be taken into account in addition to incomes. The cost of living in China is much lower than in Japan or Germany and using the prevailing exchange rates disguises this point.

In the Trifecta Checkup, this is a Type QDV.


Try your hand at fixing this one. There are no easy solutions. Does interactivity help? How about multiple charts? You will learn why I classify it as QDV instead of just DV.


[Update, 8/18/2014:] Xan Gregg created a scatter plot version of the chart. He also added, "There is still the issue of what the question is, but I'm assuming it's along the lines of 'How do economies compare regarding GDP, population, and GDP/capita?' I'm using the PPP-based GDP, but I didn't read the report carefully enough to figure out if another measure was better."



A reader submits a Type DV analysis

Darin Myers at PGi was kind enough to send over an analysis of a chart using the Trifecta Checkup framework. I'm reproducing the critique in full, with a comment at the end.



At first glance this looks like a valid question, with good data, presented poorly (Type V). Checking the fine print (glad it’s included), the data falls apart.


It’s a good question…What device are we using the most? With so much digital entertainment being published every day, it pays to know what your audience is using to access your content. The problem is this data doesn’t really answer that question conclusively.


This was based on survey data asking respondents "Roughly how long did you spend yesterday…watching television (not online) / using the internet on a laptop or PC / on a smartphone / on a tablet?" Survey respondents were limited to those who owned or had access to a TV and a smartphone and/or tablet.

  • What about feature phones?
  • Did they ask everyone on the same day, random days, or are some days over-represented here?
  • This is self-reported, not tracked…who accurately remembers their average screen time on each device a day later? I imagine the vast majority of answers were round numbers (30, 45 minutes or 2 hours). This data shows accuracy to the minute that is not really provided by the users.

In fact the Council for Research Excellence found that self-reported screen time does not correlate with actual screen time. “Some media tend to be over-reported whereas others tend to be under-reported – sometimes to an alarming extent.” -Mike Bloxham, director of insight and research for Ball State


The visual has the usual problems with stacked bar charts where it is easy to see the first bar and the total, but not to judge the other values. This may not be an issue based on the question, but the presentation is focusing on an individual piece of tech (smartphones), so the design should focus on smartphones. At the very least, smartphones should be the first column in the chart and it should be sorted by smartphone usage.

My implementation is simply to compare the smartphone usage to the usage of the next highest device. Overall 53% of the time people are using a smartphone compared to something else. I went back and forth on whether I should keep the Tablet category in the Key though it was not the first or second used device. In the end, I decided to keep it to parallel the source visual.


Despite the data problems, I was really interested in seeing the breakdowns in each country by device, so I built the chart below with rank added (in bold). I also built some simple interaction to sort by column when you click the header [Ed: I did not attach the interactive excel sheet that came with the submission]. As a final touch, I displayed the color corresponding to the highest usage as a box to the left of the country name. It’s easy to see that the vast majority of countries use smartphones the most.



Hope you enjoyed Darin's analysis and revamp of the chart. The diagnosis is spot on. I like the second revision of the chart, especially for analysts who really want to know the exact numbers. The first redo has the benefit of greater simplicity--it can be a tough sell to an audience, especially when using color to indicate the second most popular device while disassociating the color and the length of the bar.

The biggest problem in the original treatment is the misalignment of the data with the question being asked. In addition to the points made by Darin, the glaring issue relates to the respondent population. The analysis only includes people who have at least a smartphone or a tablet. But many people in less developed countries do not have either device. In those countries, it is likely that TV screen time has been strongly underestimated. People who watch TV but do not own a smartphone or tablet are simply dropped from consideration.
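A toy calculation shows how dropping TV-only households biases the estimate. The numbers here are hypothetical, chosen only to illustrate the direction of the bias:

```python
# Suppose 60% of a country's population owns a smartphone or tablet
# and watches 120 minutes of TV a day, while the excluded 40% -- who
# own neither device -- watch 200 minutes.
owner_share, owner_tv = 0.60, 120
excluded_share, excluded_tv = 0.40, 200

surveyed_tv = owner_tv  # the survey only ever sees device owners
true_tv = owner_share * owner_tv + excluded_share * excluded_tv

print(surveyed_tv, true_tv)  # the survey understates TV time
```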

For this same reason, the other footnoted comment, claiming that the sampling frame accounts for ~70 percent of the global population, is irrelevant.

Small multiples with simple axes

Jens M., a long-time reader, submits a good graphic! This small-multiples chart (via Quartz) compares the consumption of liquor from selected countries around the world, showing both the level of consumption and the change over time.


What they did right:

  • Did not put the data on a map
  • Ordered the countries by the most recent data point rather than alphabetically
  • Scale labels are found only on the outer edge of the chart area, rather than one set per panel
  • Only used three labels for the 11 years on the plot
  • Did not overdo the vertical scale either

The nicest feature was the XL scale applied only to South Korea. This destroys the small-multiples principle but draws attention to the top left corner, where the designer wants our eyes to go. I would have used smaller fonts throughout.

Having done so much work to simplify the data and expose the patterns, it's time to look at whether we can add some complexity without going overboard. I'd suggest using a different color to draw attention to curves that are strangely shaped -- the Ukraine comes to mind, so does Brazil.

I'd also consider adding the top liquor in each country... the writeup made a big deal out of the fact that most of the drinking in South Korea is of Soju.


One way to appreciate the greatness of the chart is to look at alternatives.

Here, the Economist tries the lazy approach of using a map: (link)


For one thing, they have to give up the time dimension.

A variation is a cartogram in which the physical size and shape of countries are mapped to the underlying data. Here's one on Worldmapper (link):


One problem with this transformation is what to do with missing data.

Wikipedia has a better map with variations of one color (link):


The Atlantic realizes that populations are not evenly distributed on the map, so instead of coloring countries, they put bubbles on top of the map (link):


Unfortunately, they scaled the bubbles to the total consumption rather than the per-capita consumption. You guessed it: China gets the biggest bubble, much larger than anywhere else, but from a per-capita standpoint, China is behind many other countries depicted on the map.


PS. A note on submissions. I welcome submissions, especially if you have a good chart to offer. Please ping me if I don't reply within a few weeks. I may have just missed your email. Also, realize that submissions take even more time to research, since they are likely in areas I have little knowledge about - often that's precisely why you sent them to me. Sometimes I give up because it's taking too much time. If you ping me again, I'll let you know whether I'm working on it.

The above does not apply to emails from people who are building traffic for their infographics.


PPS. Andrew Gelman chimes in with his take on small multiples.

Beyond the obvious

Flowing Data has been doing some fine work on the baby names data. The Name Voyager is a successful project by Martin Wattenberg that has received praise from many corners. It's one of those projects that has taken on a commercial life, as you can see from the link.

Here is a typical area chart presentation of the baby names data:


The typical insight one takes from this chart is that the name "Michael" (as a boy's name) reached a peak in the 1970s and has not been as popular lately. The data is organized as a series of trend lines, one for each name and each gender.

Speaking of area charts, I have never understood their appeal. If I were to click on Michael in the above chart, the design responds by restricting itself to all names starting with "Michael", meaning it includes Michael given to a girl, and Michaela, for example. See below.


What is curious is that the peak has a red lining. At first thought, one expects to find hiding behind the blue Michael a girl's name that is almost as popular. But this is a stacked area chart so in fact, the girl's name (Michael given to a girl, if you mouse over it) is much less popular than the boy Michael (20,000 to 500 roughly).


Nathan decides to dig a layer deeper. Is there more information beyond the popularity of baby names over time?

In this post, Nathan zeroes in on the subset of names that are "unisex," that is to say, have been used to name both boys and girls. He selects the top 35 names based on a mean-square-error criterion and exposes the gender bias for each name. The metric being plotted is no longer pure popularity but gender popularity. The larger the red area, the greater the proportion of girls being given that name.
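The plotted metric can be sketched like this, with made-up counts chosen only to show the calculation:

```python
# Hypothetical counts of babies given one name in one year.
girls, boys = 450, 12_050

# The gender-bias metric: share of girls among all babies with the
# name. This is what the relative size of the red area encodes --
# popularity itself is no longer on display.
girl_share = girls / (girls + boys)

print(round(girl_share, 3))
```

Note that a 50/50 split of a rare name and a 50/50 split of a common name look identical on this metric, which is why the original popularity still matters (see the tip below).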

You can readily see some interesting trends. Kim (#34) has become almost predominantly female since the 1960s. On the other hand, Robbie (#18) used to be predominantly female but is now mostly a boy's name.


 One useful tip when performing this analysis is to pay attention to the popularity of each name (the original metric) even though you've decided to switch to the new metric of gender bias. This is because the relative proportions are unstable and difficult to interpret for less popular names. For example, the Name Voyager shows no values for Gale (#29) after the 1970s, which probably explains the massive gyrations in the 1990s and beyond.

Cat and dog food, for thought

My friend Rhonda (@RKDrake) sends me to this pair of charts (in BusinessWeek). They are fun to look at, and to ponder.

Here's the first chart:

 Should the countries be colored according to the distance from the Equator?

Is this implying that cats and dogs have different preferential habitats?

Is there a lurking variable that is correlated with distance from equator?

What is the relationship between cat and dog owners?

Is there any significance to countries sitting on that diagonal, whereby the proportion of households owning dogs is the same as that owning cats?

In particular, what proportion of these households have both dogs and cats?

If 20% of households have cats, and 20% of households have dogs, how many of these households are the same ones?

How are the countries selected?

Where does the data come from?

The data provider is named but is the data coming from surveys? Are those randomized surveys?

Are the criteria used to collect data the same across all these countries?

The other chart is about cat and dog food. Again, nice aesthetics, clean execution. Lots of questions but worth looking at. Enjoy.


Breaking every limb is very painful

This Financial Times chart is a big failure:


Look at the axis. Usually a break in the axis is reserved for outliers. If there is one bar in a bar chart that extends way beyond the rest of the data, then you would sever that bar to let readers know that the scale is broken. Here, the designer broke every bar in the entire chart. It's as if the designer knows we'll complain about not starting the chart at zero -- so the bars all start at zero except they jump from zero to 70 right away.
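The distortion can be quantified. Using two hypothetical values on the FT chart's percentage scale:

```python
# Two hypothetical bar values, and the chart's effective axis start.
a, b = 94.8, 80.0
axis_start = 70.0

true_ratio = a / b                                 # what the data says
drawn_ratio = (a - axis_start) / (b - axis_start)  # what the bars show

# The drawn bars exaggerate the difference: a ratio of about 1.19
# in the data is displayed as a ratio of about 2.48.
print(round(true_ratio, 2), round(drawn_ratio, 2))
```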


The biggest issue with this chart is not its graphical element. It's the other two corners of the Trifecta Checkup: what is the question being asked? And what data should be used to address that question?

The accompanying article complains about the dearth of H-1B visas for technical talent at businesses. But it never references the data being plotted.

It's hard for me to even understand what the chart is saying. I think it is saying that in Bloomington-Normal, IL, 94.8 percent of its H-1B visa requests are science related. There is no way to interpret this number without knowing the percentage for the entire country. It is most likely true that H-1B visas are primarily used to recruit technical talent from overseas, and the proportion of such requests that are STEM related is high everywhere. In this sense, it's not clear that the proportion of H-1B requests is a useful indicator of the dearth of technical talent.

Secondly, it is highly unlikely that the decimal point is meaningful. Given the highly variable total number of requests across different locations, the decimal point would represent widely varying numbers of requests.

I'd prefer to look at the absolute number of requests for this type of analysis, given that Silicon Valley has orders of magnitude more technical jobs than most of the other listed locations. Requests aren't even a good indicator of labor shortage. Typically, H-1B visas run up against the quota sometime during the year, and companies stop requesting new visas since there is no chance of getting approved. This is a form of survivorship bias. Wouldn't it be easier to collect data on the number of vacant technical jobs in each location?