Rebirth of the twin towers

Perhaps it's this week's anniversary of the WTC disaster. Perhaps it's the New York-centric viewpoint of Citibank. One wonders what inspired Citibank analysts to make this absurdity.

Citi-vehicle-density-survey

(Via Business Insider.)

First, we must fix the vertical scale. A column chart must start at zero, without exception. Not starting at zero chops an equal-length piece off the bottom of each column, which distorts the relative lengths and areas of the columns. The distortion can be severe. For example, look at the fourth set of columns shown below:

Redo_citi1
 

In both charts, I made the first column the same length so that the two charts are directly comparable. The data plotted are exactly the same; the only difference is that the left chart starts the axis at zero. Notice that the huge difference seen on the right chart for the fourth pair of columns does not look nearly as extraordinary when the proper scale is used.
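To make the distortion concrete, here is a minimal sketch of the arithmetic. The two values and the truncated axis start are hypothetical, chosen only for illustration (they are not read off the Citi chart), and `apparent_ratio` is a made-up helper name.

```python
def apparent_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn column lengths for values a and b
    when the vertical axis starts at `baseline`."""
    if min(a, b) <= baseline:
        raise ValueError("both values must lie above the axis baseline")
    return (a - baseline) / (b - baseline)


# Hypothetical pair of values and a hypothetical truncated axis start.
high, low = 2.00, 1.90
truncated_start = 1.80

true_ratio = apparent_ratio(high, low)                    # axis starts at zero
drawn_ratio = apparent_ratio(high, low, truncated_start)  # axis starts at 1.80

print(f"zero-based axis: taller column is {true_ratio:.2f}x the shorter one")
print(f"truncated axis:  taller column is {drawn_ratio:.2f}x the shorter one")
```

With these made-up numbers, a difference of about 5% is drawn as a doubling of column length once the axis starts at 1.80.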

A multitude of other problems exist, not least that the chart is highly redundant. The same data (10 numbers) show up three times: once as data labels, once as column lengths (distorted), and once as levels on the vertical scale.

***

An alternative way to look at these data is the Bumps chart, like this:

Redo_citicar2

What this chart brings out is the variability of the estimated vehicle densities. In theory, the density estimate should be quite accurate for the "today" numbers. You'd think that in surveying 2,000+ people about how many vehicles they currently own, most people should be able to provide accurate counts.

The data paint a different picture. From quarter to quarter, the estimated "today" density ranges from 1.90x to 2.00x over the 5 periods analyzed, a spread of roughly 5%, which, according to the analyst, equates to 5 million vehicles! Given current vehicle sales of about 13 million per year, 5 million is almost 40% of the market.
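The back-of-envelope numbers in that paragraph can be checked in a few lines. This sketch uses only the figures quoted in the post; the variable names are mine.

```python
# Figures quoted in the post.
low_density, high_density = 1.90, 2.00  # vehicles per household
swing_vehicles = 5_000_000              # analyst's equivalent of the swing
annual_sales = 13_000_000               # approximate vehicle sales per year

# Spread of the "today" estimate relative to its low end.
relative_spread = (high_density - low_density) / low_density

# The swing expressed as a share of one year's sales.
share_of_sales = swing_vehicles / annual_sales

print(f"spread of the estimate: {relative_spread:.1%}")
print(f"swing as share of annual sales: {share_of_sales:.1%}")
```

The spread works out to a little over 5%, and the swing to a little under 40% of annual sales, which matches the rounded figures in the text.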

So one wonders how this survey was done, and how large the margin of error of this estimate is. I also want to know whether the survey produces estimates of the number of households as well, since the vehicles-per-household metric has two variable components.
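For a rough sense of scale, here is a sketch of the textbook margin of error for a sample mean. It assumes simple random sampling and a household-level standard deviation of 1.0 vehicles. Both are my assumptions for illustration; the post does not report either. Only the sample size (2,000+) and the 1.90 to 2.00 range come from the post.

```python
import math


def margin_of_error(std_dev: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sample mean
    under simple random sampling (z = 1.96)."""
    return z * std_dev / math.sqrt(n)


n_respondents = 2000    # the post says "2,000+"
assumed_std_dev = 1.0   # ASSUMPTION: not reported in the post

moe = margin_of_error(assumed_std_dev, n_respondents)
observed_swing = 2.00 - 1.90  # range of the "today" estimates

print(f"margin of error: +/- {moe:.3f} vehicles per household")
print(f"observed swing:      {observed_swing:.2f} vehicles per household")
```

Under those assumed inputs the margin comes out near ±0.04 vehicles per household, the same order of magnitude as the 0.10 swing, so sampling noise alone could plausibly account for a good part of it. A different standard deviation or survey design would change that reading.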

Comments

Jon Peltier

Two points.

1. The original chart's data spacing is irregular, but the pairs of columns are equidistant. I would have considered a timeline with two series.

2. The term "bumps chart" is misused. A proper bumps chart uses ranks, not values, on the vertical. Tufte described this chart type frequently in his books, but only recently coined the phrase "slope chart", which is pretty descriptive.

Rick Wicklin

In every instance, the "predicted" values are less than the "current" values, so you could also overlay two bar charts: the current in the background and the predicted in the foreground.

Floormaster Squeeze

I also had the same thought as Jon P. above. I would show two series in time (to also help see the accuracy of the "forecast").

TBW

Why doesn't the vertical scale start at zero on the Bumps chart? It seems like the scale on the vertical axis is exaggerating the variability in the numbers, as well as the change between the current and predicted numbers.

andy

you ask, perhaps rhetorically, who needs this information... seems like a good bet that the Citi analyst was addressing an investment in a local parking garage company.
