Depicting imbalance, straying from the standard chart

My friend Tonny M. sent me a tip about two pretty nice charts depicting the state of U.S. healthcare spending (link).

The first shows U.S. as an outlier:


This chart is a replica, with some added details, of the Lane Kenworthy chart that I have praised here before. It remains one of the most impactful charts I have seen. The added time-series detail allows us to see a divergence starting around 1980.


The second chart shows the inequity of healthcare spending among Americans. The top 10% of spenders consume about 6.5 times as much as the average, while the bottom 16% do not spend anything at all.


This chart form is standard for depicting imbalance in scientific publications. But the general public finds this chart difficult to interpret, mostly because both axes operate on a cumulative scale. Further, encoding inequity in the bend of the curve is not particularly intuitive.

So I tried out some other possibilities. Both alternatives are based on incremental, not cumulative, metrics. I take the spend of the individual ten groups (deciles) and work with those dollars. Also, I provide a reference point, which is the level of spend of each decile if the spend were to be distributed evenly among all ten groups.
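The decile computation described above can be sketched in a few lines. The spending figures below are invented for illustration, not the actual healthcare data:

```python
# Sketch of the decile computation described above, using made-up
# spending figures for illustration (not the actual healthcare data).
decile_spend = [0, 0, 5, 10, 20, 35, 60, 100, 170, 600]  # dollars per decile

total = sum(decile_spend)
even_share = total / len(decile_spend)  # reference: even split across deciles

# "excess" (positive) or "deficient" (negative) spend per decile
excess = [s - even_share for s in decile_spend]

for i, e in enumerate(excess, start=1):
    print(f"Decile {i}: {e:+.1f} vs even share of {even_share:.1f}")
```

By construction, the excesses and deficits cancel out across the ten groups, which is what makes the even-split reference line a natural baseline for both alternative charts.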

The first alternative depicts the "excess" or "deficient" spend as column segments. Redo_healthcarespend1

The second alternative shows the level of excess or deficient spending as slopes of lines. I am aiming for a bit more drama here.


Now, the interpretation of this chart is not simple. Since illness is not evenly spread out within the population, this distribution might just be the normal state of affairs. Nevertheless, this pattern can also result from the top spenders purchasing very expensive experimental treatments with little chance of success, for example.


Brexit, Bremain, the world did not end so dataviz people can throw shade and color

Catching a dose of Alberto Cairo the other day. He has a good post about various Brexit/Bremain maps.

The story started with an editor of The Spectator, who went on Twitter to claim that the map on the right is better than someone else's map on the left:


There are two levels at which we should discuss these maps: the scaling of the data, and the mapping of colors.

The raw data are percentages based on counts of voters, so the scale is continuous. In general, we discretize continuous data in order to improve comprehension. Discretizing means we lose granularity; this is often a good thing. The binary map on the left takes discretization to its logical extreme: every district is classified as either Brexit (> 50% in favor) or Bremain (> 50% opposed). The map on the right uses six groups in total (three subgroups of Brexit and three subgroups of Bremain).
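The two discretization schemes can be sketched as simple binning functions. The cut points for the six-group version below are my assumptions for illustration, not The Spectator's actual breakpoints:

```python
# Sketch: discretizing a continuous vote share into the two schemes
# discussed above. The 5- and 15-point margins for the six-group map
# are illustrative assumptions, not The Spectator's actual cut points.
def binary_bin(pct_leave: float) -> str:
    """The binary map: every district is Brexit or Bremain."""
    return "Brexit" if pct_leave > 50 else "Bremain"

def six_group_bin(pct_leave: float) -> str:
    """Three shades per side, based on the margin over 50%."""
    side = "Brexit" if pct_leave > 50 else "Bremain"
    margin = abs(pct_leave - 50)
    if margin <= 5:
        shade = "lean"
    elif margin <= 15:
        shade = "solid"
    else:
        shade = "strong"
    return f"{shade} {side}"

print(binary_bin(50.7))     # Brexit
print(six_group_bin(50.7))  # lean Brexit
print(six_group_bin(72.0))  # strong Brexit
```

The binary map throws away the margin entirely: a 50.7% district and a 72% district get the same color.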

Then we deal with the mapping of numbers to colors. The difference between these two maps is the use of hues versus shades. The binary map uses two hues, which is probably most people's choice since we are representing two poles. The map on the right uses multiple shades of one hue. Alternatively, Alberto favors a "diverging" color scheme, which uses three shades of each of two hues.

The editor of The Spectator claims that his map is more "true to the data." In my view, his statement applies in these two senses: the higher granularity in the scaling, and also, the fact that there is only one data series ("share of vote for Brexit") and therefore only one color.

The second point relates to polarity of the scale. I wrote about this issue before - related to a satisfaction survey designed (not too well) by SurveyMonkey, one of the major online survey software services. In that case, I suggested that they use a bipolar instead of unipolar scale. I'd rather describe my mood as somewhat dissatisfied instead of a little bit satisfied.

I agree with Alberto here in favor of bipolarity. It's quite natural to underline the Brexit/Bremain divide.


Given what I just said, why complain about the binary map?

We agree with the editor that higher granularity improves comprehension. We just don't agree on how to add granularity. Alberto tells his readers he likes the New York Times version:


This is substantively the same map as The Spectator's, except it uses eight groups instead of six, and two hues instead of one.

Curiously enough, I gave basically the same advice to the Times regarding their maps showing U.S. Presidential primary results. I noted that their use of two hues with no shades in the Democratic race obscures the fact that none of the Democratic primaries was a winner-take-all contest. Adding shading based on delegate votes would make the map more "truthful."

That said, I don't believe that the two improvements by the Times are sufficient. Notice that the Brexit referendum is one-person, one-vote. Thus, all of the maps above have a built-in distortion as the sizes of the regions are based on (distorted) map areas, rather than populations. For instance, the area around London is heavily Bremain but appears very small on this map.

The Guardian has a cartogram (again, courtesy of Alberto's post) which addresses this problem. Note that there is a price to pay: the shape of Great Britain is barely recognizable. But the outsized influence of London is properly acknowledged.


This one has two hues and four shades. For me, it is the most "truthful" because the sizes of the colored regions are properly mapped to the vote proportions.

Showing three dimensions using a ternary plot

Long-time reader Daniel L. isn't a fan of this chart, especially when it is made to spin, as you can see at this link:


Like other 3D charts, this one is hard to read. The vertical lines are both good and bad: They make the one dimension very easy to read but their very existence makes one realize the challenges of reading the other dimensions without guidelines.

This dataset allows me to show a ternary plot. The ternary plot is an ingenious way of putting three dimensions onto a flat surface. I have found few good uses of this chart type, though.


Let's get to the core of the issue: the analyst started with 25 skills that are frequently required by data science and analytics jobs, and his goal is to classify these skills into three groups. The underlying method used to create these groups is factor analysis.

Each dot above is a skill. The HQ of each grouping of skills (known as a factor) is a corner of the plot. The closer the dot is to the corner, the more relevant that skill is to the skill group.
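The projection behind a ternary plot is simple: three loadings that sum to one map to a point inside a triangle. A minimal sketch, with a corner assignment that is my assumption for illustration:

```python
import math

# Sketch: projecting three factor loadings (summing to 1) onto a flat
# ternary plot. Corner assignment here is an illustrative assumption:
# Math/Stats at (0, 0), Technology/Programming at (1, 0), and
# Business at (0.5, sqrt(3)/2).
def ternary_xy(math_stats: float, tech: float, business: float):
    total = math_stats + tech + business
    a, b, c = (v / total for v in (math_stats, tech, business))
    x = b + c / 2              # pulled toward the Tech and Business corners
    y = c * math.sqrt(3) / 2   # height is driven by the Business loading
    return x, y

# A skill loading equally on all three factors lands at the centroid.
print(ternary_xy(1/3, 1/3, 1/3))
# A skill loading entirely on Math/Stats sits at that corner.
print(ternary_xy(1, 0, 0))  # (0.0, 0.0)
```

This is why a dot near a corner reads as "belonging" to that skill group: its loading on that factor dominates the other two.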

In the above chart, I highlighted four skills that are not clearly in one or another skill group. For example, Communication straddles the Math/Stats and Business dimensions but scores lowly on the Technology/Programming dimension.


The ternary plot has a few problems. Like any scatter plot, once you have 10 or more dots, it is hard to fit all the data labels. Further, the axis labels must be carefully done to help readers understand the plot. 

Before long, the chart looks very cluttered. There just isn't enough room to get all your words in. Here is another version of the same chart -- with a different set of annotations.


Instead of drawing attention to those skills that have no clear home, this version of the chart focuses on the dots close to each corner.

I classified two of the skills differently from the original. For example, the Machine Learning skill is part of Math/Stats on my charts but part of Technology/Programming on the original.

The ternary plot is interesting and unusual but is only useful in selected problems.

Egregious chart brings back bad memories

My friend Alberto Cairo said it best: if you see bullshit, say "bullshit!"

He was very incensed by this egregious "infographic": (link to his post)


Emily Schuch provided a re-visualization:


The new version provides a much richer story of how Planned Parenthood has shifted priorities over the last few years.

It also exposes how the AUL (Americans United for Life) organization distorted the story.

The designer extracted only two of the lines, so readers do not see that the category of services that really replaced the loss of cancer screening was STI/STD testing and treatment. This is a bit ironic given the other story that has circulated this week - the big jump in STDs among Americans (link).

Then, the designer placed the two lines on dual axes, which is a dead giveaway that something awful lies beneath.

Further, this designer dumped the data from intervening years, and drew a straight line from the first to the last year. The straight arrow misleads by pretending that there has been a linear trend, and that it would go on forever.
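The dual-axes trick is worth spelling out numerically: once each series gets its own hidden scale, the designer can place one line anywhere relative to the other. A minimal sketch with invented numbers (not the actual Planned Parenthood data):

```python
# Sketch: why hidden dual axes can tell any story. Rescaling one series
# onto an arbitrarily chosen axis range lets the designer draw its line
# with whatever slope they like relative to the other series.
# All numbers below are invented for illustration.
def rescale(series, new_min, new_max):
    """Linearly map a series onto a chosen axis range."""
    lo, hi = min(series), max(series)
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in series]

services_a = [2000, 1500]  # first and last year of one service (people served)
services_b = [900, 1100]   # a second service, same units

# On a shared axis, series B barely moves compared to series A.
# On its own hidden axis, B can be made to swing across the whole chart:
b_on_secret_axis = rescale(services_b, 100, 2400)
print(b_on_secret_axis)  # [100.0, 2400.0] -- any slope the designer likes
```

Since both series here share the same units, there is no defensible reason to scale them separately in the first place.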

But the masterstroke is in the treatment of the axes. Let's look at the axes, one at a time:

The horizontal axis: Let me recap. The designer dumped all but the starting and ending years, and drew a straight line between the endpoints. While the data are no longer there, the axis labels are retained. So, our attention is drawn to an area of the chart that is devoid of data.

The vertical axes: Let me recap. The designer has two series of data with the same units (number of people served) and decided to plot each series on a different scale with dual axes. But readers are not supposed to notice the scales, so they do not show up on the chart.

To summarize, where there are no data, we have a set of functionless labels; where labels are needed to differentiate the scales, we have no axes.


This is a tried-and-true tactic employed by propagandists. The egregious chart brings back some bad memories.

Here is a long-ago post on dual axes.

Here is Thomas Friedman's use of the same trick.

More chart drama, and data aggregation

Robert Kosara posted a response to my previous post.

He raises an important issue in data visualization - the need to aggregate data, and not plot raw data. I have no objection to that point.

What was shown in my original post are two extremes. The bubble chart is high drama at the expense of data integrity. Readers cannot learn any of the following from that chart:

  • the shape of the growth and subsequent decline of the flu epidemic
  • the beginning and ending date of the epidemic
  • the peak of the epidemic*

* The peak can be inferred from the data label, although there appears to be at least one other circle of approximately equal size, which isn't labeled.

The column chart is low drama but high data integrity. To retain some dramatic element, I encoded the data redundantly in the color scale. I also emulated the original chart in labeling specific spikes.

The designer then simply has to choose a position along these two extremes. This will involve some smoothing or aggregation of the data. Robert showed a column chart that has weekly aggregates, and in his view, his version is closer to the bubble chart.
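The weekly aggregation Robert performed is a one-liner in pandas. The daily counts below are simulated, not the actual flu data:

```python
import numpy as np
import pandas as pd

# Sketch: aggregating a daily series into weekly totals, as in Robert's
# version. The daily counts are simulated, not the actual flu data.
rng = np.random.default_rng(0)
days = pd.date_range("1918-09-01", periods=120, freq="D")
daily = pd.Series(rng.poisson(50, size=len(days)), index=days)

weekly = daily.resample("W").sum()  # one column per week instead of per day

print(len(daily), "daily values ->", len(weekly), "weekly totals")
assert weekly.sum() == daily.sum()  # aggregation preserves the grand total
```

The grand total is preserved exactly; what the reader loses is day-level spikes, which is precisely the trade between drama and data integrity discussed here.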

Robert's version indeed strikes a balance between drama and data integrity, and I am in favor of it. Here it is (I am responsible for the added color).



Where I depart from Robert is how one reads a column chart such as the one I posted:


Robert thinks that readers will perceive each individual line separately, and in so doing, "details hide the story". When I look at a chart like this, I am drawn to the envelope of the columns. The lighter colors are chosen for the smaller spikes to push them into the background. What might be the problem are those data labels identifying specific spikes; they are a holdover from the original chart--I actually don't know why those specific dates are labeled.


In summary, the key takeaway is, as Robert puts it:

the point of this [dataset] is really not about individual days, it’s about the grand totals and the speed with which the outbreak happened.

We both agree that the weekly version is the best among these. I don't see how the reader can figure out grand totals and speed with which the outbreak happened by staring at those dramatic but overlapping bubbles.

A not-so-satisfying rose

At the conference in Bavaria, Jay Emerson asked participants to provide comments on the data visualization of the 2014 Environmental Performance Index (link). We looked at the country profiles in particular. Here is one for Singapore:


The main object of interest here is the "rose chart." To understand it, we need to know the methodology behind the index. The index is a weighted average of nine sub-indices, as shown in the table at the bottom. In many cases, the sub-index is itself an average of sub-sub-indices. These lower-level indices measure the distance between a country's performance and some target performance, typically set at the international level. But those distances are converted to a scale between 0 and 100, so a country scoring zero did the worst at meeting the target while a country scoring 100 did the best.

In the rose chart, the circle is divided evenly into nine sectors, each representing a sub-index. The data are encoded in the radius of the sectors. Colors map to the sub-index, and the legend is provided in two ways: a hover-over on the Web, and the table below.

Here is the equation that connects the data (EPI) to the area of the sectors:


There are a number of issues with this representation. First, because of the squaring of the EPI, the area is distorted. If one country is twice the EPI of another, the area is four times as large. Another way to see this is to notice that as the EPI increases, the curved edge of the sector moves outwards, tracing a larger circumference.
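The quadratic distortion is easy to verify numerically. A minimal sketch, assuming the sector radius is drawn proportional to the EPI score (with the one-ninth factor as each sector's share of the circle):

```python
import math

# Sketch of the distortion described above: if the sector radius is
# proportional to the EPI score, sector area grows with the SQUARE of
# the score. The one-ninth factor is each sector's share of the circle.
def sector_area(epi: float) -> float:
    radius = epi  # assume radius drawn proportional to EPI
    return (1 / 9) * math.pi * radius ** 2

# Doubling the EPI quadruples the sector area:
print(sector_area(80) / sector_area(40))  # 4.0
```

Since readers judge roses by area, not radius, a country with twice the score looks four times as good.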

Another issue is the one-ninth factor, which implies that each of the nine sub-indices is equally important. The diagram below shows that interpretation to be incorrect. (The nine sub-indices are shown in the second layer from the outside in.)


A third issue is illustrated in the Singapore rose. Notice from the table below that Singapore scored zero on Fisheries. But in the rose, Fisheries has a non-zero area. Think of this practice as coring an apple. The middle circle of radius k should be ignored. If the sector that has the color of Fisheries has zero area, then the entire red circle shown below should have zero area.


With these three adjustments, the encoding formula becomes rather more complicated:


where x depends on the weight of the sub-index, and k is the radius of the sector that represents value zero.
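The formula image did not survive here. As a hedged reconstruction (my sketch of what such an encoding would have to look like, not necessarily the original's exact form), incorporating all three fixes gives something of the form

$$\text{Area} = x \,\pi \left( r^2 - k^2 \right), \qquad r = \sqrt{k^2 + c \cdot \mathrm{EPI}}$$

where $x$ is the sub-index weight, $k$ is the radius representing a score of zero, and $c$ is a scaling constant. Substituting $r$ gives $\text{Area} = x \,\pi\, c \cdot \mathrm{EPI}$: the cored sector's area is directly proportional to the EPI score and vanishes when the score is zero.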

The rose/radar/spider type charts are more useful when placed side by side to compare countries. But even then, this chart form doesn't work well for this dataset. This is because the spacing of countries within each sub-index is not uniform.

 The site has a visualization of the distribution of sub-index scores by issue:


We can see that in the case of water resources, most countries are not doing very well at all. In terms of air quality, most countries except for those in the right tail have performed quite well. It is hard to interpret the indices unless one has an idea of the full distribution.


Finally, one thing the EPI people did makes me happy. They have created PDFs and images of their data visualizations, so it is quite easy to save and keep some of this work. All too often, browser-based technologies create visualizations that can't be saved.

Mosquito, shoebox, and an ingenious apartment design

First, I saw Alberto tweet his design for the Wall Street Journal (below is the English version):


The yellow space is the size of the smallest "livable" apartment in Hong Kong, known as the "mosquito" apartment. Livability is defined by the real estate developers.

If you've lived in a tropical area like Hong Kong, you'll understand the obsession with mosquitoes. The itching for days! The sneaky little things that suck your blood!

In Manhattan, it seems we prefer the term "shoebox apartment." By comparison, it's not that scary a name. The shoebox is larger in size too.

The graphic is fantastic as it offers comparisons to everyday spaces, like a NYC parking space and a basketball court, for which many Americans have some sense of the proportions.


This chart leads me down an unexpected path. I found a set of very powerful photos, commissioned by a humanitarian association in Hong Kong. Overwhelming. Here's one:


Yes, that is the entire living space for this family. All of forty square feet.

This article describes the project and links to a number of other equally astounding photos.

These photos are unfair competition for any graphic designer.


Finally, I came across an inspiring, ingenious design. Gary Chang, an architect in Hong Kong, created his own apartment (344 square feet, almost nine times larger than the space in the photo, and twice as large as the mosquito apartment) with this amazing, space-saving design.

Through a series of movable walls and beds, his apartment can be configured in 24 different ways. This is a small multiples layout!


Here is an article about his achievement, together with a video tour of his home. Not to be missed. It defines making something out of nothing.

Here is a little graphic describing certain transformations:


Here is a different video on Vimeo. And another.

Shaking up expectations for pension benefits

Ted Ballachine wrote me about his website Pension360, pointing me to a recent attempt at visualizing pension benefits in various retirement systems in the state of Illinois. The link to the blog post is here.

One of the things they did right is to start with an extended guide to reading the chart. This type of thing should be done more often. Here is the top part of this section.


It turns out that the reading guide is vital for this visualization! The reason is that they made some decisions that shake up our expectations.

For example, darker colors usually mean more but here they mean less.

Similarly, a person's service increases as you go down the vertical axis, not up.

I have recommended that they flip both, since there doesn't seem to be a strong reason to defy these conventions.


This display facilitates comparing the structure of different retirement systems. For example, I have placed next to each other the images for the Illinois Teachers' Retirement System (blue) and the Chicago Teachers' Pension Fund (black).


It is immediately clear that the Chicago system is miserly. The light gray parts extend to only half the width of the blue cells in the top chart. The fact that the annual payout grows somewhat linearly as the years of service increase makes sense.

What doesn't make sense to me, in the blue chart, is the extreme variance in the annual payout for the beneficiary with "average" tenure of about 35 years. If you look at all of the charts, there are several examples of retirement systems in which employees with similar tenure have payouts that differ by an order of magnitude. Can someone explain that?


One consideration for those who make heatmaps using conditional formatting in Excel.

These charts encode the count of people in shades of color, with the entire table as the reference population. This is not the only way to encode the data, and it prevents us from understanding the "sparsely populated" regions of the heatmap.

Look at any of the pension charts. Darkness reigns at the bottom of each one, in the rows for people with 50 or 60 years of service. This is because there are few such employees (relative to the total population). An alternative is to color code each row separately. Then you have surfaced the distribution of benefits within each tenure group. (The trade-off is the revised chart no longer tells the reader how service years are distributed.)
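The difference between table-wide and row-wise color scaling comes down to which maximum you normalize by. A minimal sketch with an invented benefits-by-tenure table (rows are years of service, columns are payout buckets):

```python
import numpy as np

# Sketch: table-wide vs row-wise color normalization for a heatmap.
# 'counts' is an invented benefits-by-tenure table for illustration:
# rows = years of service, columns = payout buckets.
counts = np.array([
    [120, 300,  80, 10],  # 10 years of service
    [ 60, 200, 150, 40],  # 20 years
    [  2,   5,   9,  3],  # 50 years: sparse, so it vanishes table-wide
])

# Table-wide scaling: every cell shares one reference maximum.
table_scaled = counts / counts.max()

# Row-wise scaling: each tenure group gets its own reference maximum,
# surfacing the distribution of benefits WITHIN each row.
row_scaled = counts / counts.max(axis=1, keepdims=True)

print(table_scaled[2])  # all near zero: the 50-year row is invisible
print(row_scaled[2])    # the within-row pattern is now visible
```

As noted above, the trade-off is that row-wise scaling no longer shows how service years are distributed across the whole population.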

Excel's conditional formatting procedure is terrible. It does not remember how you code the colors. It is almost guaranteed that the next time you go back and look at your heatmap, you can't recall whether you did this row by row, column by column, or the entire table at once. And if you coded it cell by cell, my condolences.

Tricky boy William

Last week, I was quite bothered by this chart I produced using the Baby Name Voyager tool.


According to this chart, William has drastically declined in popularity over time. The name was 7 times more popular back in the 1880s compared to the 2010s. And yet, when I hovered over the chart, the rank of William in 2013 was 3. Apparently, William was the 3rd most popular boy name in 2013.

I wrote the nice people at the website and asked if there might be a data quality issue, and their response was:

The data in our Name Voyager tool is correct. While it may be puzzling, there are definitely less Williams in the recent years than there were in the past (1880s). Although the name is still widely popular, there are plenty of other baby names that parents are using. In the past, there were a limited amount of names that parents would choose, therefore more children had the same name.

What bothered me was that the rate has declined drastically while the number of births was increasing, so I was expecting William to drop in rank as well. But their explanation makes a lot of sense: if names are much more widely spread in recent times, the rank could indeed stay at the top. It was very nice of them to respond.


There are three ways to present this data series, as shown below. One can show the raw counts of William babies (orange line). One can show the popularity against total births (what Baby Name Wizard shows, blue line). One can show the rank of William relative to all other male baby names (green line). Consider how different these three lines look!
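All three series derive from the same raw counts. A minimal sketch with invented numbers (not the actual Social Security data):

```python
# Sketch: deriving the three series described above from raw counts.
# All numbers are invented; 'name_counts' maps name -> babies in one year.
name_counts = {"William": 17000, "Liam": 19000, "Noah": 18000, "Oliver": 14000}
total_births = 1_900_000  # all male births that year (made up)

william = name_counts["William"]

raw_count = william                               # orange line
per_million = william / total_births * 1_000_000  # blue line
rank = sorted(name_counts.values(), reverse=True).index(william) + 1  # green line

print(raw_count, round(per_million), rank)
```

Note that the rank depends on every other name's count, which is why it can hold steady even as the raw count and the rate move in opposite directions.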


The rate metric (per million births) adjusts for growth in total births. But the blue line is difficult to interpret alongside the orange line: in the period 1900 to 1950, the actual number of William babies went up while the blue line came down. The rank is also tough to interpret, especially in the 1970-2000 period when it took a dive, a trend not visible in either the raw counts or the adjusted counts.

Adding to the difficulty is the use of the per-million metric. In the following chart, I show three different scales for popularity: per million, per 100,000, and per 100 (i.e. proportion). The raw count is shown up top.


All three blue lines are essentially the same but how readers interpret the scales is quite another matter. The per-million births metric is the worst of the lot. The chart shows values in the 20,000-25,000 range in the 1910s but the actual number of William babies was below 20,000 for a number of years. Switching to per-100K helps but in this case, using the standard proportion (the bottom chart) is more natural.


The following scatter plot shows the strange relationship between the rate of births and the rank over time for William babies.


Up to the 1990s, there is an intuitive relationship: as the proportion of Williams among male babies declined, so did the rank of William. Then in the 1990s and beyond, the relationship flipped. The proportion of Williams among male babies continued to drop but the rank of William actually recovered!