Aug 29, 2005

Baseball ROI: tables or graphs?

David Leonhardt re-opened the debate about whether high-spending baseball teams (like the Yankees) are winners or losers.  According to his application of an idea from Doug Pappas, George surely fools his investors!  Accompanying his article was a table of numbers, of which I clipped the top third:
 
[Figure: top third of the New York Times table, sorted by cost per victory]

As tables go, this one is fundamentally sound: the teams are sorted by "cost per victory", which was the point David wanted to make.

If some readers find this table hard to swallow, they have probably wandered off trying to make sense of the payroll and winning percentage columns, or perhaps they got dizzy trying to get their heads around 1,133,807 versus 1,225,575.  Precision is a great scientific virtue but rarely makes a good graphic guideline.

This set of data, essentially a bi-variate series, gives me yet another opportunity to discuss the versatile scatter plot.  Here is the basic design, with winning % on the y-axis and payroll on the x-axis.  Contrary to the article's conclusion, there appears to be a general association between payroll and winningness.  The dotted lines are median payroll (US$ 63 million) and median winning % (0.500) respectively, so that half the teams fall on either side of each line.  I have removed the Yankees since their spending far outstripped that of every other team (I will return to them later).
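For readers who want to reproduce the median split, here is a minimal Python sketch. The team names and figures are made-up placeholders, not the numbers from the article, and the `quadrant` helper is a hypothetical name chosen for illustration. Teams sitting exactly on a median line are counted on the low side.

```python
from collections import Counter
from statistics import median

# Hypothetical (payroll in US$ millions, winning %) pairs.
# These are placeholders for illustration, NOT the article's data.
teams = {
    "Team A": (40.0, 0.450),
    "Team B": (55.0, 0.520),
    "Team C": (70.0, 0.480),
    "Team D": (90.0, 0.560),
    "Team E": (120.0, 0.600),
    "Team F": (60.0, 0.510),
}


def quadrant(payroll, win_pct, payroll_cut, win_cut):
    """Label a team by which side of each median line it falls on."""
    horiz = "high payroll" if payroll > payroll_cut else "low payroll"
    vert = "winning" if win_pct > win_cut else "losing"
    return f"{horiz}, {vert}"


# The two dotted lines of the scatter plot.
payroll_median = median(p for p, _ in teams.values())
win_median = median(w for _, w in teams.values())

quadrants = {
    name: quadrant(p, w, payroll_median, win_median)
    for name, (p, w) in teams.items()
}
counts = Counter(quadrants.values())

for label, n in sorted(counts.items()):
    print(f"{label}: {n}")
```

If payroll and winning were unrelated, the four counts would come out roughly equal; a surplus in the "high payroll, winning" and "low payroll, losing" corners is what an association looks like in this view.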

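[Figure: scatter plot of winning % (y-axis) against payroll (x-axis), with dotted lines at the medians; Yankees excluded]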

We can take this design a step further by standardizing both variables: in the new graph, the scales are in units of standard deviations (s.d.), so that 0 is the mean payroll, +1 is a payroll one s.d. above the mean, and so on.  Observe that the Yankees' payroll of US$ 206 million is about four s.d. above the mean payroll.
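The standardization itself is simple arithmetic, sketched below in Python. One assumption to flag: the post does not state the mean or the s.d. directly, so the sketch backs them out from the US$ 39 to 107 million range that the post gives for payrolls within 1 s.d. of the mean. Both figures are therefore approximate, and `standardize` is a hypothetical helper name.

```python
def standardize(value, mean, sd):
    """Express a value in units of standard deviations from the mean."""
    return (value - mean) / sd


# Assumption: mean - 1 s.d. = 39 and mean + 1 s.d. = 107 (US$ millions),
# taken from the range the post quotes for payrolls within 1 s.d. of the mean.
LOW, HIGH = 39.0, 107.0
mean_payroll = (LOW + HIGH) / 2   # midpoint of the range
sd_payroll = (HIGH - LOW) / 2     # half the width of the range

yankees_z = standardize(206.0, mean_payroll, sd_payroll)
print(f"mean = {mean_payroll}, s.d. = {sd_payroll}, Yankees z = {yankees_z:.2f}")
```

Under these inferred values the Yankees land just under four s.d. above the mean, consistent with the figure quoted above.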

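[Figure: the same scatter plot on standardized scales (units of s.d.), with a rectangle marking the teams whose payroll is within 1 s.d. of the mean]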

Notice the rectangle above.  It encloses what I call "middle market teams", whose payrolls are within 1 s.d. of the mean, ranging from US$ 39 to 107 million.  Plotting them separately from the Big/Small Spenders gives us a much richer picture of what is occurring in baseball today.
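Splitting the teams this way amounts to a simple filter on payroll. Here is a rough Python sketch using the US$ 39 and 107 million bounds quoted above. Apart from the Yankees' US$ 206 million, the payrolls are invented placeholders, and `market_tier` is a hypothetical helper name.

```python
# Bounds of the "middle market" rectangle, in US$ millions, as quoted in the post.
LOW, HIGH = 39.0, 107.0


def market_tier(payroll):
    """Classify a payroll as small spender, middle market, or big spender."""
    if payroll < LOW:
        return "small spender"
    if payroll > HIGH:
        return "big spender"
    return "middle market"


# Only the Yankees figure comes from the post; the rest are placeholders.
payrolls = {
    "Yankees": 206.0,
    "Team X": 30.0,
    "Team Y": 63.0,
    "Team Z": 100.0,
}

tiers = {name: market_tier(p) for name, p in payrolls.items()}

# Teams for each of the two panels.
middle_market = [name for name, tier in tiers.items() if tier == "middle market"]
big_small = [name for name, tier in tiers.items() if tier != "middle market"]

print("middle market:", middle_market)
print("big/small spenders:", big_small)
```

Each list can then be fed to its own scatter plot, which is all the two-panel design requires.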

 
[Figure: two panels, with the middle market teams on the left and the Big/Small Spenders on the right]

On the left, the 25 middle market teams are almost equally distributed among the four quadrants (about 6-7 teams in each), which suggests that payroll may have little to do with winning for these teams.  However, extravagant teams (Yankees, Red Sox) are always winners and miserly teams (Pittsburgh, Kansas City, Tampa Bay) are always losers, the inevitability starkly revealed on the right.  (Admittedly, these sample sizes are small.)
 

Scatter plots reveal many more insights than tables of numbers.  Any table must be sorted in one given dimension, and such ordering causes difficulty in understanding other variables listed in the same table.  In a scatter plot, both variables are accorded equal status and the reader decides where to place her attention.

Further, a third variable can be layered on top of a scatter plot.  In the next post, I will address the question of whether East Coast or West Coast management has done better with its money.  What do you think the data will show?


Reference: "Passing on Blue-Chip Players can Pay Off", New York Times, Aug 28, 2005.

Comments

VERY nice. I'm going to assign this as reading to my freshmen.

I especially liked the "big picture" graphic with the middle market teams boxed*; this very quickly draws attention to Chicago and St Louis, who have apparently figured out how to maximize ROI.

*What graphics package are you using? Either it's pretty versatile, or you're doing some very clever tricks to get the calibration lines and boxes.

Looks to me like those graphics were generated with R (the dollar signs in the variable names give it away).

You are doing great work on this blog!

John is right. R is an amazing tool and not just for graphing. It requires a basic understanding of programming. Almost anything you can visualize in your mind, you can create using R. I will likely write a post on R in the future.

The link where the software is freely downloadable is listed under my "Sites of Interest".
