
The salaries are attractive but the chart isn't

The only reason the IEEE Spectrum magazine editors chose this chart form is that they think they need to deliver precise salary figures to readers.

This chart is just so... sad.

The color scheme is all wrong, the black suggesting a funeral. The printed data, occupying at least half of the width of each bar, frustrate any attempt to compare lengths. We enter an unusual place where higher numbers appear under smaller numbers. The job titles are regrettably dressed in the same cloth as the median salary bars. It's not clear how the regions are ordered but in any case, it's hard to figure out regional disparities. In reality, no one is getting precisely the listed salaries, so rounding those numbers would make them easier to grasp.

This is a chart that repels rather than attracts readers.


A test of sufficiency immediately nails the problem. When the data set is removed, there is almost nothing to see:



Mid-Atlantic managers are the winners.


Details, details, details: giving Zillow a pie treatment

This chart (shown right), published by Zillow in a report on housing in 2012, looks quite standard, apparently avoiding the worst of Excel defaults.

In real estate, it’s all about location. In dataviz, it’s all about details.

What are some details that caught my eye on this chart?

Readers have to get over the hurdle that “negative equity” is the same as “underwater homes.” This is not readily understood unless one reads the surrounding text. For example, the first row for the U.S. average proclaims that 31% of U.S. homes are “underwater” and among these underwater homes, 10% of the mortgages are delinquent. The former is concerned with the valuation of the property while the latter deals with payments or lack thereof.

According to the legend, the blue segments stand for the proportions of underwater homes in different metro areas but it’s not quite true – the blue part represents underwater but not delinquent mortgages while the red and blue combined represents all underwater mortgages. This is a common problem in stacked bar charts.

The metro areas are in alphabetical order by city, which means an opportunity is missed to help readers discern patterns. Patterns related to the alphabetical order of city names are not of interest to most (except certain econometrics journal editors). Try arranging by region, or by decreasing level of negative equity, or some other meaningful variable.

The designer tried to do something clever with the horizontal axis labels and I don't think it succeeds. To see what is going on, read the note below the chart. The trick is to let readers look at the number of underwater and delinquent mortgages in two ways, as a proportion of underwater mortgages (through the white data labels) and as a proportion of all mortgages (through the axis labels). That's a mess, sorry to say.

Finally, I like the horizontal axis to extend to 100% because underlying the proportions shown in blue and on the horizontal axis is the population of all mortgages.


This may come as a shock to many readers: the task of showing underwater delinquent mortgages simultaneously as a proportion of underwater mortgages and as a proportion of all mortgages is solved using ... pie charts.

I just created a couple of examples here:


The deep orange sector can be compared to the entire circle, or to the larger orange sector. Readers usually don't have a problem with pies with only three slices.
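The arithmetic behind such a pie can be sketched in a few lines of Python. This is only an illustration using the U.S. average quoted earlier (31% of homes underwater, 10% of those delinquent); the function name `pie_slices` and the slice labels are made up for this sketch and do not come from Zillow's report.

```python
def pie_slices(underwater_share, delinquent_share_of_underwater):
    """Split all mortgages into three pie slices (fractions that sum to 1)."""
    underwater_delinquent = underwater_share * delinquent_share_of_underwater
    underwater_current = underwater_share - underwater_delinquent
    not_underwater = 1.0 - underwater_share
    return {
        "underwater, delinquent": underwater_delinquent,
        "underwater, not delinquent": underwater_current,
        "not underwater": not_underwater,
    }

# U.S. average from the chart: 31% underwater, 10% of those delinquent.
us = pie_slices(0.31, 0.10)
for label, share in us.items():
    print(f"{label}: {share:.1%}")
```

The first slice works out to roughly 3% of all mortgages, which is the number the original chart tries to convey through its axis labels, while the ratio of the first slice to the first two combined recovers the 10% shown in the data labels.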

Speed demon quartered and shrunk

Reader Richard K. submitted a link to Microsoft Edge's website.


This chart uses three speedometers to tell the story that Microsoft's Edge browser is faster than Chrome or Firefox. These speedometer charts are disguised racetrack charts. Read last week's post first if you haven't.

Richard complained about the visual design distorting the data. How the distortion entered the picture is a long story. Let's begin with an accurate representation of the data:


Next, we pull those speedometer curves straight:


While the three values are within 10 percent of each other, the lengths of the two shorter curves are only 40-50 percent of the length of the longest one! This massive distortion is due to not starting the axis (i.e., speedometer) at zero.

We now put the missing 25,000 back onto the chart, proportionally expanding each bar. As seen below, fixing the axis does not get us back to the desired relative lengths, so some other distorting factor is at play.


The culprit is that the middle speedometer is 44 percent larger than the other two. If we inflate the side bars by 44 percent, the world is made right again. Phew!
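The two distortions can be separated with a little arithmetic. The sketch below is a rough model, not a reconstruction of Microsoft's chart: the three scores are hypothetical values picked only so that they sit within 10 percent of each other, and it assumes the middle gauge carries the highest score. Only the 25,000 axis offset and the 44 percent size difference come from the discussion above.

```python
def drawn_length(value, axis_start=0.0, size_factor=1.0):
    """Length of a straightened gauge arc: the part of the value above the
    axis start, scaled by how large the gauge is drawn."""
    return size_factor * (value - axis_start)

# Hypothetical benchmark scores (NOT the figures from the Edge chart),
# chosen only so that they lie within 10 percent of each other.
scores = {"middle": 31000, "left": 29000, "right": 28500}
AXIS_START = 25000   # the portion of the axis that is missing
MIDDLE_SIZE = 1.44   # the middle gauge is drawn 44 percent larger

def ratios(axis_start, middle_size, side_size=1.0):
    """Each side gauge's drawn length relative to the middle gauge's."""
    mid = drawn_length(scores["middle"], axis_start, middle_size)
    return {
        name: drawn_length(scores[name], axis_start, side_size) / mid
        for name in ("left", "right")
    }

as_drawn = ratios(AXIS_START, MIDDLE_SIZE)            # both distortions
axis_fixed = ratios(0, MIDDLE_SIZE)                   # zero-based axis only
fully_fixed = ratios(0, MIDDLE_SIZE, side_size=1.44)  # sides inflated too
truth = {name: scores[name] / scores["middle"] for name in ("left", "right")}

for label, r in [("as drawn", as_drawn), ("axis fixed", axis_fixed),
                 ("fully fixed", fully_fixed), ("true ratio", truth)]:
    print(label, {name: round(v, 3) for name, v in r.items()})
```

With these made-up scores, the side gauges come out at 40 to 50 percent of the middle one as drawn, move closer after the axis is restored to zero, and match the true ratios only after the size difference is also removed.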





Starstruck and doubled over: losing poise over Indian charts

Twitter follower @ashwink_s didn't see eye-to-eye with the following charts that appeared in an Indian publication.

There is the infamous racetrack chart:


In the racetrack chart, the designer has embedded data in the angles at the center of the concentric circles but the visual cues point to the arc lengths. If the same proportion of people voted Yes as voted No, the two arcs should look like this:


The length of the red arc is much larger than the length of the gray arc, even though they encode the same value. There is no reason to double over, just pull them back straight pronto!
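The geometry is easy to check: an arc's length is its radius times its angle, so two arcs with the same central angle differ in length by exactly the ratio of their radii. A minimal sketch, with hypothetical radii since the chart's actual dimensions are not given:

```python
import math

def arc_length(radius, fraction_of_circle):
    """Length of an arc covering the given fraction of a full circle."""
    return 2 * math.pi * radius * fraction_of_circle

# Hypothetical radii for two concentric tracks; both encode the same 50% share.
inner = arc_length(radius=1.0, fraction_of_circle=0.5)
outer = arc_length(radius=2.0, fraction_of_circle=0.5)

# Same data value, but the outer arc is longer by the ratio of the radii.
print(outer / inner)
```

A straight bar has no such dependence on position, which is why pulling the arcs straight removes the distortion.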


Next, we have a busy chart:


We are starstruck.

All those stars are redundant as they just illustrate the rating numbers printed to their left. The story here is that the government received a 7.5 rating, with no one rating it below 4, and the majority giving a 7 or 8. (It's curious that no one at all rated the government below 4. In most rating polls that I've come across, primarily in the U.S., there are extreme views.)

After the makeover:



P.S. Thanks to Matt F., who noticed the switched bars in the original post and messaged me. The chart has now been fixed.


Visualizing electoral college politics: exercise in displaying relationships between variables

Reader Berry B. sent in a tip quite some months ago that I just pulled out of my inbox. He really liked the Washington Post's visualization of the electoral college in the Presidential election. (link)

One of the strengths of this project is the analysis that went on behind the visualization. The authors point out that there are three variables at play: the population of each state, the votes cast by state, and the number of electoral votes by state. A side-by-side comparison of the two tile maps gives a perspective of the story:


The under/over representation of electoral votes is much less pronounced if we take into account the propensity to vote. With three metrics at play, there is quite a bit going on. On these maps, orange and blue are used to indicate the direction of difference. Then the shade of the color codes the degree of difference, which was classified into severe versus slight (but only for one direction). Finally, solid squares are used for the comparison with population, and square outlines are for comparison with votes cast.

Pick Florida (FL) for example. On the left side, we have a solid, dark orange square while on the right, we have a square outline in dark orange. From that, we are asked to match the dark orange with the dark orange and to contrast the solid versus the outline. It works to some extent but the required effort seems more than desirable.


I'd like to make it easier for readers to see the interplay between all three metrics.

In the following effort, I ditch the map aesthetic, and focus on three transformed measures: share of population, share of popular vote, and share of electoral vote. The share of popular vote is a re-interpretation of what Washington Post calls "votes cast".

The information is best presented by grouping states that behaved similarly. Two subgroups are the most interesting. The first is the large states like Texas and California, where the residents loudly complained that their voice was suppressed by the electoral vote allocation when in fact the allocated electoral votes were not far from their share of the popular vote! The second is the battleground states. Floridians had a more legitimate reason to gripe, since their share of the popular vote far exceeded their share of the electoral vote, and this pattern persisted throughout the battleground states.
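For readers who want to reproduce the three transformed measures, the calculation is just each state's count divided by the national total. The sketch below uses made-up numbers for three imaginary states, not real election data, and the helper name `shares` is my own.

```python
def shares(counts):
    """Convert raw counts per state into each state's share of the total."""
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}

# Made-up numbers for three imaginary states; not real election data.
population = {"A": 600, "B": 300, "C": 100}
votes_cast = {"A": 250, "B": 180, "C": 70}
electoral_votes = {"A": 11, "B": 6, "C": 3}

pop_share = shares(population)
vote_share = shares(votes_cast)
ev_share = shares(electoral_votes)

for state in population:
    # Positive gap: the state holds more of the electoral vote than of the
    # popular vote; negative gap: the reverse (the Florida situation).
    gap = ev_share[state] - vote_share[state]
    print(f"{state}: population {pop_share[state]:.0%}, "
          f"popular vote {vote_share[state]:.0%}, "
          f"electoral vote {ev_share[state]:.0%}, gap {gap:+.1%}")
```

Putting all three measures on the same scale (share of the national total) is what allows them to be compared directly for each state.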


The hardest part of this design is making the legend: