Baseball ROI 3: efficiency

We might as well squeeze more juice out of the baseball data.  So far, I have only indirectly touched upon Leonhardt's use of Pappas' efficiency metric, described thus:

Pappas noted that all teams must spend a minimum - now almost $9 million - on their payroll, because of the league's minimum player salary. A team full of players making the minimum would probably win about 30 percent of its games, roughly what the worst teams in baseball history have, he reasoned. Dividing any payroll above $9 million by any victories above the 30 percent threshold produces the cost-per-victory figure.

This calculation assumes that all teams spend an average of $185K per win for the first 48 or so wins (30% of 162 games).  However, this figure is far from reality: a quick glance at the NYT table shows that for each win beyond the first 48, teams spend anywhere from $800K to $4 million. 

The Pappas calculation severely over-estimates how far the first $185K can take a baseball team.  Instead of using $9 million which is the league-mandated minimum spending, one can use the real payroll for the lowest spending team, which is about $30 million (Tampa Bay).  As per this logic, in order to get those 48 wins, no team has spent less than $30 million.  This assumes an average spending of $617K for the first 48 wins.
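To make the arithmetic concrete, here is a minimal Python sketch of the cost-per-victory calculation as described in the quote above. The function names are my own, and the 60-million, 81-win team at the end is a made-up example, not a figure from the NYT table.

```python
GAMES = 162
BASELINE_WIN_PCT = 0.30  # Pappas' assumed floor for a minimum-salary team


def baseline_cost_per_win(baseline_payroll, games=GAMES, pct=BASELINE_WIN_PCT):
    """Average spend per win implied for the first 30% of games (about 48 wins)."""
    return baseline_payroll / (pct * games)


def marginal_cost_per_win(payroll, wins, baseline_payroll,
                          games=GAMES, pct=BASELINE_WIN_PCT):
    """Payroll above the baseline divided by wins above the baseline."""
    extra_wins = wins - pct * games
    if extra_wins <= 0:
        raise ValueError("team did not exceed the baseline win total")
    return (payroll - baseline_payroll) / extra_wins


# Implied cost of each of the first ~48 wins under the two baselines.
print(round(baseline_cost_per_win(9e6)))    # about $185K (league-minimum payroll)
print(round(baseline_cost_per_win(30e6)))   # about $617K (lowest real payroll)

# Hypothetical team: $60 million payroll, 81 wins, Pappas' $9 million baseline.
print(round(marginal_cost_per_win(60e6, 81, 9e6)))
```

Swapping the baseline payroll from 9e6 to 30e6 in the last call shows how much the cost-per-victory figure depends on that one assumption.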

In the last post, we saw that West coast teams have poor ROI.  Is this reflected by our efficiency metric?  I return to the scatter plot, and now connect the dots to the origin.  The slope of each line is a reflection of efficiency: the steeper the line, the more efficient the team.  The long dotted line represents our base-line efficiency, i.e. 48 wins with $30 million.  It is evident that no West coast teams beat the base-line, confirming our previous observation.


The short dotted line is Pappas' base-line of 48 wins using $9 million.  Its slope is much steeper than that of any real team, showing that he severely over-estimated team efficiency.  In reality, if a team were to spend only $9 million (in today's terms), one doubts whether it could get 48 wins.  (Because of the Yankees' extravagance, their efficiency line is by far the flattest and off the charts.)

If efficiency were the only criterion I worry about, this kind of plot would be less than ideal.  It helps visually rank the teams but the reader cannot see the efficiency values.  Also, it is not the length of the line but the slope/angle of the line which is proportional to efficiency.

Baseball ROI 2: scatter plots

What more can one do with scatter plots?  Much more, it turns out.  In the last post, I compared middle-market baseball teams to big/small spending teams.  There are many other ways to group the 30 teams, for example, by league (American, National), by region (West, Central, East) and by division (NL West, AL East, etc.).

When presented in a table, this information is hidden.  On a scatter plot, such comparisons are easily visualized by judicious use of colors and/or labels.

The same pattern appears in both leagues although the payroll extremes occur in the American League (top right).  The payroll disparity is widest among East coast teams and smallest among West coast teams (bottom left).

While the overall pattern, at top left, is one of higher payroll, more victories, the bottom left graph shows that this overall pattern is only observed among teams in Central.  The winning percentages of East and West coast teams appear rather flat across a wide range of payroll.

The graph by division (bottom right) further muddies the picture.  NL East teams all have above .500 records regardless of whether their payroll is above or below median.  The opposite is true for NL West teams.  AL West has one team in each of the four quadrants, and to a lesser extent, the same for AL Central and AL East.  Thus, the strongest evidence of a link between payroll and winningness is among NL Central teams.

In the following set of graphs, I extracted the middle-market teams, and then plotted each region's teams on a separate graph, facilitating comparison by region.  The scales are standardized as in the last post.


With few samples in each group, it is hard to make general statements but overall, the link between payroll and winning percentage is weak among the middle-market teams, regardless of where they are located.

If we remove Colorado and the LA Angels from the West coast teams chart (top right), we will uncover disturbing news!  The other seven teams together paint a bleak picture of West coast management: the more they spend, the more they lose.

Baseball ROI: tables or graphs?

David Leonhardt re-opened the debate about whether high-spending baseball teams (like the Yankees) are winners or losers.  According to his application of an idea from Doug Pappas, George surely fools his investors!  Accompanying his article was a table of numbers, of which I clipped the top third:
As tables go, this one is fundamentally sound: teams are sorted by "cost per victory", which was the point David wanted to make.

If some readers find this table hard to swallow, they probably have wandered off, trying to make sense of the payroll and winning percentage columns; or perhaps they got dizzy trying to get their heads around 1,133,807 versus 1,225,575.  Precision is a great scientific virtue but rarely makes a good graphic guideline.

This set of data, essentially a bi-variate series, gives me yet another opportunity to discuss the versatile scatter plot.  Here is the basic design, with winning % on the y-axis and payroll on the x-axis.  Contrary to the article's conclusion, there appears to be a general association between payroll and winningness.  The dotted lines are median payroll (US$ 63 million) and median winning % (0.500) respectively so that half the teams fall on either side of each line.  I have removed the Yankees since their spending far outstripped that of every other team (I will return to them later).


We can take this design a step further by standardizing both variables: in the new graph, the scales are in units of standard deviations (s.d.) so that 0 is the mean payroll and +1 is payroll that is one s.d. above the mean and so on.  Observe that the Yankees payroll of US$ 206 million is four s.d. above the mean payroll.


Notice the rectangle above.  These are what I call "middle market teams", with payroll within 1 s.d. of the mean, ranging from US$ 39 to 107 million.  Plotting them separately from the Big/Small Spenders gives us a much richer picture of what is occurring in baseball today.
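For readers who want to reproduce the standardized scale, here is a minimal Python sketch. The mean (73) and s.d. (34) are not quoted anywhere in this post; I back-solve them from the "within 1 s.d." range of US$ 39 to 107 million, assuming that range is symmetric about the mean. The function names are illustrative only.

```python
# US$ millions. Range quoted in the post for "within 1 s.d. of the mean".
LOW, HIGH = 39.0, 107.0

# Assumption: the range is mean - 1 s.d. to mean + 1 s.d.
MEAN_PAYROLL = (LOW + HIGH) / 2   # 73.0
SD_PAYROLL = (HIGH - LOW) / 2     # 34.0


def z_score(payroll, mean=MEAN_PAYROLL, sd=SD_PAYROLL):
    """Payroll expressed in standard deviations from the mean."""
    return (payroll - mean) / sd


def is_middle_market(payroll):
    """True when payroll lies within 1 s.d. of the mean."""
    return abs(z_score(payroll)) <= 1


yankees_z = z_score(206.0)
print(round(yankees_z, 1))       # 3.9, i.e. roughly four s.d. above the mean
print(is_middle_market(63.0))    # True: the median payroll sits inside the rectangle
print(is_middle_market(206.0))   # False
```

The same transformation applies to winning percentage; only the mean and s.d. change.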


On the left, the 25 middle market teams are almost equally distributed among the four quadrants (about 6-7 teams in each), suggesting that payroll may have nothing to do with winning.  However, extravagant teams (Yankees, Red Sox) are always winners and miserly teams (Pittsburgh, Kansas City, Tampa Bay) are always losers, the inevitability starkly revealed on the right.  (Admittedly, these sample sizes are small.)

Scatter plots reveal many more insights than tables of numbers.  Any table must be sorted in one given dimension, and such ordering causes difficulty in understanding other variables listed in the same table.  In a scatter plot, both variables are accorded equal status and the reader decides where to place her attention.

Further, a third variable can be layered on top of a scatter plot.  In the next post, I will address the question of whether East Coast or West Coast management have done better with their money.  What do you think the data will show?

Reference: "Passing on Blue-Chip Players can Pay Off", New York Times, Aug 28, 2005.

Relative relative indices

Mahalanobis and I have been communicating over a graph that first appeared in the Economist, reproduced below.  The graph purportedly supports their front-page story on "Germany's surprising economy".  He helped clear up my initial confusion over what was being plotted.

The asterisk in the chart was explained as "relative to euro-area average". My confusion is captured thus: when Germany's value decreased from 100 to 95 by mid-2002, was that 5 points over Q1 1999?  or was that 5 points over Euro average?

This double indexing confounds sources of change, potentially misleading readers.  The 5-point decline represented a 5% decrease in Germany's relative cost.  Even if Germany's labor cost stayed flat, this 5% decrease can be entirely due to other Euro countries performing worse.  In fact, Germany's curve would still show a decline when its unit labor costs are rising, so long as other Euro countries suffer worse setbacks.
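A small Python sketch shows the effect. The unit labor cost series here are invented for illustration (they are not the Economist's data), and relative_index is my own helper name: it divides the country by the average in each period, then rebases so that period 1 equals 100.

```python
def relative_index(country_series, average_series):
    """Country-to-average ratio, rebased so that the first period equals 100."""
    base = country_series[0] / average_series[0]
    return [100 * (c / a) / base
            for c, a in zip(country_series, average_series)]


# Hypothetical unit labor costs: Germany rises 2%, the euro average rises 7.4%.
germany = [110.0, 111.0, 112.2]
euro_avg = [100.0, 104.0, 107.4]

idx = relative_index(germany, euro_avg)
print([round(v, 1) for v in idx])

# Germany's own costs went up, yet its doubly indexed curve goes down.
print(germany[-1] > germany[0], idx[-1] < idx[0])
```

The relative index falls by about 5 points even though Germany's own costs rose in every period, which is exactly the confounding described above.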

Below, my junkchart version (on the right) is superior because it separates the two effects, showing both the growth/decline in unit labor costs within each country as well as the labor costs relative to the Euro average in each time period.  The left-side chart uses a double index akin to the Economist version; the only trick I used was removing the second index (so that only the Euro average is 100 in period 1).


The story Mahalanobis and I concocted is now clearly seen: Germany has historically had higher labor costs than France and the EU but quickly narrowed the gap from period 1 to 3.

More mischief in map coloring

Sam Cook was nice enough to mention Junk Charts on her and Andrew Gelman's blog today.  She provided another example of mischievous use of colors and shading in data maps, showing that sometimes we statisticians are no better than the rest. The effect is quite innocently hilarious and is a must-see!

Instead of assigning each level to a different color/shade, the geographical pattern of Amstat meetings would be more clearly revealed if a different shade were used for each of the groups 0, 1-2, 3-5 and 7.  Thus:


Obesity bad, maps good

The recent coverage of obesity in the US media produced at least two very good data maps.  The New York Times printed this snapshot of the nation in 2004.


Because of a judiciously chosen color scheme, we can easily discern the pattern of obesity: more severe between the Lakes and the Gulf; least in the West and Northeast, especially in Colorado; quite bad in the middle and the South.

The legend is deserving of much praise:   in defiance of popular but simplistic usage, the range was not divided into four equal parts (quartiles); rather, the designer selected four unequal parts so as to reveal the geographical pattern on the map. Besides, the complete range of the data was shown as is, where most would have artificially widened the range to 15.0% on one end and 30.0% on the other.

All in all, this is a simple graphic conveying a clear message.  Well done.

And yet -- the dynamic aspect of obesity growth alarms even more:

  • Within many states, more and more people are becoming obese
  • Nationally, more and more states have high obesity rates

These trends, along with others, are perfectly captured by the following terrific, dynamic data map, thanks to CDC.  It is a wonderful example of how the electronic medium (animated gif) can do wonders for the graphics designer. [You may need to click on the map to see the animation in a pop-up window.]

The time dimension is experienced rather than drawn on paper/screen.  This experience is in fact distorted, since what we feel is compressed time, but the distortion improves rather than deters our ability to see trends.

  • The states between the Lakes and the Gulf led the nation throughout this period.
  • The ever expanding legend ingeniously draws attention to the fact that the worst states have gotten worse over time.
  • No single state has been spared: by 2001, only Colorado had an obesity rate below 15%; just 7 years earlier, in 1994, the entire Western half of the U.S. had obesity rates below 15%.

One small gripe: if read quickly, the reader can be forgiven for thinking that "white" indicates 0% obesity.  Not so!  "White" actually means "no data".  I'd prefer to use a neutral color for "no data"; when they started tracking, these states turned out to be no less obese than others.  By 1994, every state had started tracking obesity.

This dynamic map is really rich in information.  Feel free to leave comments about what else strikes you about it.

Reference: "Obesity Rate Is Nearly 25%, Group Said", New York Times, August 24, 2005; CDC Obesity Trends.

Bubble charts and their discontents II

Bubble charts force three dimensions onto a 2-D flat surface.  They are occasionally useful for illustrating concepts but seldom work as a data graphic.  The following chart illustrates some lethal problems:


In bubble charts, two dimensions are plotted in the usual x and y axes (here, longitude and latitude) while the third dimension is depicted as circular areas.  Like wild dogs, the pair of gigantic bubbles insisted on marking their territories, obscuring many littler bubbles.  At the same time, it gets harder and harder to locate their centers (i.e. recover the other two dimensions) as bubbles expand.

Further, the standard way of displaying the legend, involving overlapping circles, obstructs our ability to compare the areas effectively.

This chart, however, is data-rich.  Simultaneously, it plots (1) geographical locations (map); (2) passenger volume (area of circle); (3) market share (shading of circle); and (4) top markets (call-out text).  The key to improving readability is to untether oneself from geography.  In other words, give up geographical information and focus on market share versus passenger volume.  For example:


[The graph would look better if it had only blue diamonds.  I had to insert green dots because I don't have the full data set.  The blue diamonds are real data; green dots are approximations, and there should have been many more in roughly the same locations.]

We now see that Northwest's markets fall into three types: large and dominant (national hubs), small and strong (regional hubs), and small and weak (others).  There are only a few cities with market share over 50% while the rest are less than 25%.  Similarly, NWA serves fewer than 25,000 passengers in all but three markets.  (A log scale can be used here if one wants to explore further groupings within the small-scale markets.)

Reference: "Well-Laid Plan Kept Northwest Flying Despite the Strike", New York Times, August 22, 2005

Imagination gone astray

In the field of data graphics, the New York Times has distinguished itself as a leader.  The team also takes risks in going beyond the standard and tired repertory of pies, bars and lines.  On this occasion, they have surely let their imagination go astray.  Consider this stupefyingly opaque chart:


If you can make sense of it, click on comment below to leave your thoughts:

  • how do the graphical elements convey the data?
  • how long did it take to figure out what it means?


Reference: "When a Bug Becomes a Monster", New York Times, August 21 2005.

Bubble charts and their discontents

In a recent post about NYT's blinding spots, I cited a chart showing Wall Street being "more bullish on Walmart than Costco".  Here, I reproduce it, and next to it, I put the table of numbers without the gridlines and bubbles.

In my mind, the bubbles (and gridlines) distract rather than inform.  The use of black versus gray further distorts our perception.  The table on the right is just as good, if not better.

Bubbles are notoriously misleading because of our inability to compare areas as well as the lack of a scale.  For example, how large do you think the bigger circle is? (Answer at the end.)

Also, it is not sufficient to compare the absolute number of buys, holds and sells because the total numbers of calls are different.  For instance, 10 buys out of 12 calls means something different from 10 buys out of 40 calls!

The key to reading this data is the median: if we line up all analysts with buys to the left and sells to the right, what call did the analyst in the middle make?  Which call the median analyst made depends on both the total number of analysts and the proportion of buys, holds and sells in the sample.  The following chart anchors at the median analyst, stretching out on both sides of that middle point:


We observe that:

  • The median analyst called BUY for Walmart and HOLD for Costco
  • The majority of analysts made the same call as the median in both cases
  • Among the minority, most Walmart analysts called HOLD, which is more pessimistic than the majority
  • Among the minority, most Costco analysts called SELL, which is more pessimistic than the majority
  • More analysts followed Costco than Walmart (longer bar)
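The median-analyst idea can be sketched in a few lines of Python. The counts below are illustrative only: they echo the "10 buys out of 12" versus "10 buys out of 40" example, with the split between holds and sells made up by me, and they are not the actual Walmart or Costco numbers. The function name median_call is also my own.

```python
def median_call(buys, holds, sells):
    """Call made by the middle analyst when lined up BUY, then HOLD, then SELL.

    With an even number of analysts this picks the lower of the two middle
    positions (the one nearer the BUY end); that tie-break is an assumption.
    """
    ordered = ["BUY"] * buys + ["HOLD"] * holds + ["SELL"] * sells
    if not ordered:
        raise ValueError("no analyst calls supplied")
    return ordered[(len(ordered) - 1) // 2]


# Same number of buys, very different verdicts.
print(median_call(buys=10, holds=2, sells=0))     # BUY  (10 buys out of 12)
print(median_call(buys=10, holds=20, sells=10))   # HOLD (10 buys out of 40)
```

The raw count of buys is identical in both cases; only the median call reflects the different totals.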

[The bigger circle is size 17 compared to size 10.]
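Part of the difficulty is that area grows with the square of the radius. The sketch below, in Python, assumes the chart scales circle area (not radius) in proportion to the value; I have not verified how the NYT drew theirs, and the function names are mine.

```python
import math


def radius_ratio(value_big, value_small):
    """Ratio of radii when circle AREA is proportional to the value."""
    return math.sqrt(value_big / value_small)


def area_ratio_if_radius_scaled(value_big, value_small):
    """Ratio of areas when the RADIUS is (wrongly) made proportional to the value."""
    return (value_big / value_small) ** 2


# A value of 17 versus 10: the circle is only about 30% wider.
print(round(radius_ratio(17, 10), 2))

# If the radius were scaled instead, the area would be almost 3 times larger.
print(round(area_ratio_if_radius_scaled(17, 10), 2))
```

Either way, the eye has to convert between width and area, which is why a guess made without a scale is so often off.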

Reference: "How Costco Became the Anti-Walmart", New York Times, July 17, 2005


How representative is your sample?

Taking a hint from Mahalanobis, I dug into Howard Wainer's other book  (Visual Revelations) to find the following gem.  Imagine you're an engineer working for the military.  You have the ingenious idea to inspect planes that returned home and plot the pattern of bullet holes.  The dark regions had high density of bullet holes.  Your task is to recommend where to put extra armour on the new planes.  What would you recommend?  (Note: the answer appears after the graphic!)



Howard credited Abraham Wald for his counter-intuitive insight.  We should put extra armour in the white regions, not the dark regions.  The inference is that the planes that got shot in the dark regions managed to return to the base while others got hit presumably in the white regions and never returned.

What has this to do with sampling?  If we forgot about the planes that never came back, we may jump to the conclusion that we should reinforce the dark regions.  The sample we didn't see is as important as the sample we observed.  To wit:


Statisticians call this "survivorship bias".  We only observe survivors but we must not forget about the non-survivors!
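A toy simulation in Python makes the point. Everything in it is hypothetical: two regions, each equally likely to be hit, with loss probabilities I picked for illustration (they are not from Wainer's book or from Wald's analysis).

```python
import random

# Hypothetical chance that a plane hit in each region fails to return.
LOSS_PROB = {"engine": 0.8, "fuselage": 0.1}


def simulate(n_planes, seed=0):
    """Return (hits on all planes, hits seen on returning planes) by region."""
    rng = random.Random(seed)
    regions = list(LOSS_PROB)
    all_hits = {r: 0 for r in regions}
    returned_hits = {r: 0 for r in regions}
    for _ in range(n_planes):
        region = rng.choice(regions)           # each region equally likely to be hit
        all_hits[region] += 1
        if rng.random() >= LOSS_PROB[region]:  # the plane survives and is inspected
            returned_hits[region] += 1
    return all_hits, returned_hits


all_hits, returned_hits = simulate(10_000)
print("hits on all planes:      ", all_hits)
print("hits on returning planes:", returned_hits)
```

Among the planes that come home, fuselage hits far outnumber engine hits even though both regions were hit about equally often; the engine hits are missing from the sample because those planes did not return.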

A related page I found on the Web: Steve Simon