An uninformative end state

This chart cited by ZeroHedge feels like a parody. It's a bar chart that doesn't utilize the length of bars. It's a dot plot that doesn't utilize the position of dots. The range of commute times (between city centers and airports) from 18 to 111 minutes is compressed into red/yellow/green levels.

20141124_Air4

ZeroHedge got this from Bloomberg Businessweek, which has a data visualization group, so this seems strange. The project, called "The Airport Frustration Index", is here.

It turns out the above chart is a byproduct of interactivity. The designer illustrates the passage of time by letting lines run across the page. The imagery is that of a horse race. This experiment reminds me of the audible chart by the New York Times (link).

The trick works better when the scale is in seconds, thus real time, as in the NYT chart. On the Businessweek chart, three different scales are simultaneously in motion: real time, elapsed time of the interactive element, and length of the line. Take any two airports: the amount of elapsed time between one "horse" and the other "horse" reaching the right side is not equal to the extra time needed but a fraction of it--obviously, the designer can't have readers wait, say, 10 minutes if that was the real difference in commute times!

Besides, the interactive component is responsible for the uninformative end state shown above.

***

Now, let's take a spin around the Trifecta Checkup. The question being asked is how "painful" the commute from the city center to the airport is. The data used:

Bw_commuteairport_def

Here are some issues with the data that are worth a moment of your time:

In Chapter 1 of Numbers Rule Your World (link), I review some key concepts in analyzing waiting times. The most important concept is the psychology of waiting time. Specifically, not all waiting time is created equal. Some minutes are just more painful than others.

As a simple example, there are two main reasons why Google Maps says it takes longer to get to Airport A than to Airport B: the distance between the city center and the airport, and congestion on the roads. If the car is constantly moving on the way to A, while half of the time on the way to B is spent stuck in jams, then the average commuter considers the commute to B much more painful even if the two trips take the same number of physical minutes.

Thus, it is not clear that Google driving time is the right way to measure pain. One quick but incomplete fix is to introduce distance into the metric, which means looking at speed rather than time.
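To make the speed idea concrete, here is a minimal Python sketch. The two airports, their distances and their drive times are made-up numbers for illustration, not the Businessweek data, and `average_speed_mph` is a hypothetical helper name.

```python
def average_speed_mph(distance_miles: float, minutes: float) -> float:
    """Average door-to-door speed in miles per hour."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return distance_miles / (minutes / 60.0)


# Hypothetical trips: (distance in miles, driving time in minutes).
# Both take 45 minutes, but one is a long free-flowing drive and the
# other is a short crawl.
trips = {
    "Airport A": (30.0, 45.0),
    "Airport B": (12.0, 45.0),
}

speeds = {name: average_speed_mph(d, m) for name, (d, m) in trips.items()}

# Rank from slowest to fastest: a low speed hints at congestion.
ranked = sorted(speeds, key=speeds.get)
for name in ranked:
    print(f"{name}: {speeds[name]:.0f} mph")
```

Ranking by time alone would call these two trips a tie. Ranking by speed separates them, though speed still says nothing about how the minutes were spent, which is why the fix is incomplete.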

Another consideration is whether the "center" of all business trips coincides with the city center. In New York, for instance, I'm not sure what should be considered the "city center". If all five boroughs are considered, I have heard that the geographical center is in Brooklyn. If I type "New York, NY" into Google Maps, it shows up at the World Trade Center. During rush hour, the 111 minutes for JFK would be an underestimate for most commuters who are located above Canal Street.

I'd consider this effort a Type DV.

 


Hedge-fund bubbles are not nice

Reader Sushil B. offers this chart from Business Week on hedge fund returns. (link)

Bw_hedgefunds

Unmoored bubbles, slanted text, positive and negative returns undifferentiated, bubble within bubble, paired data scattered apart, and it's not even that attractive.

 

Here is a Bumps-chart style version of this data:

  Redo_bwhedgefunds

The author never explained how the five funds were chosen, so it's hard to know what the point of the chart is. It appears that Harbinger Capital Partners had an experience similar to Paulson's. In addition, given the potentially huge gyrations from year to year, it's very odd that we are not shown the annual returns between 2007 and 2011... we can't rule out that some of the other three funds suffered a particularly bad year in between the end points shown here.

 


Mind the gap

When comparing two time series, one typically wants to discuss the size of the gap as it changes over time.  This Business Week chart, for example, depicted for readers the expanding gap between intra-day high and low prices of the S&P 500 for 2008.

Bw_SandPHiLow
This chart construct is effective at pointing out large changes but lacks precision in conveying smaller differences, or trends.  It is always a good idea to plot the gap directly, as we will show below.

Redo_SandPHiLow

More importantly, a better choice of scale can help a lot.  By focusing exclusively on variability (extreme values), this chart hides the relevant information of the closing prices of the S&P.  A point spread of 100 points means more when the index is at 800 than at 1200.  In order to capture this, we can divide the point spread by the opening price of that day, so that we say the gap is one-eighth or one-twelfth of the opening price.

The junkart version makes both changes.  The top chart fixes the scale, plotting the point spread as a percentage of daily opening prices.  Relative to the original chart, the variability in the front part of 2008 was muted because the index was at higher levels back then. 
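The rescaling amounts to one division per trading day. Below is a minimal Python sketch; the two days of prices are invented so that the arithmetic matches the one-eighth and one-twelfth example, and `relative_spread` is a hypothetical helper name.

```python
def relative_spread(high: float, low: float, open_price: float) -> float:
    """Intra-day high-low gap as a fraction of that day's opening price."""
    if open_price <= 0:
        raise ValueError("open_price must be positive")
    return (high - low) / open_price


# Hypothetical days: (open, high, low). Both have a 100-point spread.
days = {
    "index near 1200": (1200.0, 1250.0, 1150.0),
    "index near 800": (800.0, 850.0, 750.0),
}

for label, (open_price, high, low) in days.items():
    points = high - low
    share = relative_spread(high, low, open_price)
    print(f"{label}: {points:.0f} points = {share:.1%} of the open")
```

The same 100-point gap is one-twelfth of the open at 1200 but one-eighth at 800, which is why the early part of the year looks calmer on the percentage scale.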

The bottom chart plots the gap sizes (lengths of the high-low lines).  Without doubt, directly plotting the gaps showcases the key message: the current level of volatility is more than double what occurred at the beginning of the year.

If one wants to illuminate the trend as opposed to daily fluctuations, a further improvement would be to use moving averages.
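For readers who want to try this, here is a minimal Python sketch of a simple trailing moving average. The gap values and the window length are placeholders chosen for easy arithmetic, not the S&P data, and `moving_average` is a hypothetical helper name.

```python
def moving_average(values: list[float], window: int) -> list[float]:
    """Simple trailing moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Placeholder daily gaps (say, as a percent of the open).
gaps = [2.0, 4.0, 6.0, 8.0, 10.0]
smoothed = moving_average(gaps, window=3)
print(smoothed)
```

A longer window gives a smoother line but reacts more slowly to a sudden jump in volatility, so the choice of window is itself a judgment call.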

For those interested, shown below is a scatter plot that compares the original point spread and the derived point spread, which shows that the change is not trivial.


Redo_SandPHiLow2 


Reference: "The Market: A Daily Roller Coaster", Business Week, Oct 27 2008.


Oscar diseconomy

Oscar

Business Week dissected the beneficiaries of the Oscar show as shown on the right.  Although this doesn't work well as a data graphic, if thought of as a variant on the data table, it is more engaging for readers.

Let's have some fun with the Oscar statue.  First, putting a bar chart next to the statue confirms that the height of the segments (rather than the area) is in proportion to the dollar values (below left).

Tufte, Chambers and others have shown that our eyes react to the areas, not heights.  So next, I estimated the areas but stretched them out into segments of equal width.  Squeezing the entire column back down to the height of the statue, the following chart (below right) puts perceived proportions next to the true proportions, displaying visually the extent of distortion. 

Redo_oscar

Reference: "News you need to know", Business Week, Jan 28 2008.


Memo to new owner: what's the point?

Software company Siebel's new owner Oracle put out an embarrassing ad this week; I excerpted the bottom half here. 

Adsiebelsm

The headline of the ad (see the full ad here) screams in large, bold, white letters: "BUSINESS IMPACT  COUNTS."

Unfortunately, the chart answers no questions but raises a whole bunch:

  • How were these ratings obtained?  Which experts determined the scores?
  • What is the scale?  How much better is a quarter of a circle?
  • What is meant by "business impact"?  What exactly is being measured?
  • Why are the circles of differing radii?  What do radii signify?
  • Isn't "Adoption" the category that has the largest separation between the two companies?  Why isn't that highlighted?
  • What is the order of the five criteria?  It is not alphabetical, not ranked by either company's category scores, nor by the differences in these scores.

Wi-fi nation: a terrific map

Here is one terrific map, courtesy of Ray Vella at Business Week.

Wifinationsm

 

 


This map works on two levels:

  • The red and green dots provide strong visual cues to support the conclusion that Wi-Fi networks are being widely deployed across American cities, except in the Midwest.
  • The three shades of brown show the number of networks installed or planned in each state.  Inclusion of such state-level information justifies the printing of state boundaries.  Without plotting state-level information, state boundaries become chartjunk, as in the heat wave maps I previously discussed.

What's more, we can assimilate the city and state levels.  For example, focusing on Texas, we see from the dark brown shade that it is a state with many networks, and then from the dots, we can see further where those networks are.

A few minor improvements can be made:

  • Tell us the upper bound of the legend and give the legend a title: by changing 10+ to 10-X, the designer not only provides us another piece of data but also harmonizes the presentation with the other two categories.
  • Be more friendly to the color-blind: the red-green contrast should be avoided as much as possible.  If a graphic designer is reading the blog, please tell us where we can find studies of color contrasts
  • Use a softer national boundary: the solid black line sticks out against the soft background and it is the least important bit on the map
  • One would expect the choice of three shades of brown and the intervals used for each shade to be keyed to the frequency histogram of the number of networks (shown right).  The current scheme divides the 50 states into groups of 8, 16 and 26.  Are there better divisions?

Wifihistogram

Finally, most readers will find the number of networks to be an unsatisfactory metric because more populous states will likely install more networks.  A density measure such as networks per person or per household or per unit area would have been more telling.
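As a minimal Python sketch of such a density measure, the snippet below computes networks per million residents.  The two states and all their counts are invented for illustration (they are not taken from the map), and `networks_per_million` is a hypothetical helper name.

```python
def networks_per_million(networks: int, population: int) -> float:
    """Number of Wi-Fi networks per million residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return networks * 1_000_000 / population


# Invented figures: (networks installed or planned, population).
states = {
    "State X": (12, 24_000_000),  # many networks, very populous
    "State Y": (3, 1_500_000),    # few networks, small population
}

density = {
    name: networks_per_million(n, pop) for name, (n, pop) in states.items()
}

# Densest first: the ordering can flip relative to raw counts.
for name in sorted(density, key=density.get, reverse=True):
    print(f"{name}: {density[name]:.1f} networks per million residents")
```

In this made-up case the state with a quarter of the networks has four times the density, so the shading of the map would change materially under a per-capita scale.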

Reference: "Wi-Fi Nation", Business Week, Aug 1 2005, p.12.