Bar in a bar

I have been meaning to comment on the bar-in-a-bar chart for a while, and have finally found a good example.  This type of chart figures prominently in the New York Times but is generally inferior to a dot plot or an interval plot.

Nytnjschools

The article dealt with the alarming finding that superintendents in New Jersey school districts may have under-reported their total compensation to the Department of Education.  Depicted in the chart are 12 pairs of numbers, each pair consisting of the reported salary and the actual salary.

In the bar-in-a-bar plot, one data series is drawn as fat gray bars while the other data series uses thin black bars superimposed on the gray bars.  Aside from its ugliness, this chart also distorts our perception of the data: because the two series use bars of different widths, the area of a bar is no longer proportional to the salary number.

Worse still, this form takes our attention away from the key statistic, that being the gap between reported and actual salary.  The interval plot below remedies these problems; it also adopts a more reasonable ordering, by the size of the salary gap.

Redonjschools1
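The reordering behind the interval plot is easy to reproduce. Here is a minimal sketch in Python, assuming each district comes with a (reported, actual) pair. The district names and dollar amounts are made up for illustration and are not the figures from the article; `gaps_sorted` is a hypothetical helper name.

```python
# Hypothetical figures for illustration only; NOT the values from the article.
# Each entry maps a district to (reported salary, actual salary).
salaries = {
    "District A": (150_000, 165_000),
    "District B": (180_000, 230_000),
    "District C": (140_000, 148_000),
    "District D": (200_000, 270_000),
}


def gaps_sorted(data):
    """Return (district, reported, actual, gap) rows, largest gap first."""
    rows = [(name, rep, act, act - rep) for name, (rep, act) in data.items()]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Print one interval per district, in the order an interval plot would use.
for name, rep, act, gap in gaps_sorted(salaries):
    print(f"{name:<12} {rep:>8,} -> {act:>8,}  gap {gap:>7,}")
```

Sorting by the gap rather than alphabetically puts the key statistic in charge of the layout, which is the point of the redesign.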

Presented in this way, the chart draws attention to the phenomenon that the higher the reported pay, the larger the pay gap.  The following scatter plot takes up this topic by plotting the salary gap as a percentage of the reported salary against the reported salary.

Redonjschools2
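For readers who want to check the relationship on their own numbers, the quantity on the vertical axis of the scatter plot is the gap divided by the reported salary. The sketch below computes it and a plain Pearson correlation against reported salary. The salary pairs are again invented for illustration and are not the article's data; `pct_gap` and `pearson` are hypothetical helper names.

```python
from math import sqrt

# Hypothetical (reported, actual) pairs; NOT the article's data.
pairs = [
    (140_000, 147_000),
    (150_000, 165_000),
    (180_000, 216_000),
    (200_000, 260_000),
]


def pct_gap(reported, actual):
    """Salary gap as a percentage of the reported salary."""
    return 100.0 * (actual - reported) / reported


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


reported = [rep for rep, _ in pairs]
pct = [pct_gap(rep, act) for rep, act in pairs]

print("percentage gaps:", pct)
print("correlation with reported salary:", pearson(reported, pct))
```

With only a dozen points in the real chart, a coefficient like this is suggestive at best; one or two districts can move it a lot.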

Reference: "Leading New Jersey's Schools Has Its Price: High", New York Times, March 14, 2006.

Comments

rif

You state that your chart shows that "the higher the reported pay, the larger the gap." You really think so? The interval chart is organized by gap size. Looking at both plots, it looks to me like there are a couple of outliers (Bergen and Toms River) that have the highest pay and highest gap, but other than that there's not a clear trend. I'm just not quite sure I buy it.

I'd guess larger gap and actual salary are correlated, but you'd see that even if everyone's reported salaries were identical.

Kaiser

It's difficult to see it in the interval chart; that's why I also did a scatter chart. There, I used percentage difference, not actual difference, thereby controlling for reported salary. Obviously, the sample size is very small but the positive correlation is quite visible, to my eyes.
