Review: Curve Ball 2

This post continues my review of the book.  The reader who sent it to me noted that the authors use a technique similar to the one I used to study whether suicide locations on the Golden Gate Bridge were random (see here, here, here, here and here).  They apply it to the question of whether Todd Zeile was really a "streaky" hitter.

In the following set of charts, Zeile's batting average in the first half of a season was compared against those of eight fictional hitters.  These fictional hitters were simulated to have two hitting "states" (hot, cold).  For each simulated game, a hitter was assigned one of the two states with some probability.  The authors asked whether Zeile's batting pattern was similar to those of the streaky hitters.  (Conventional sportscaster wisdom says he is streaky.)
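
To make the setup concrete, here is a minimal sketch of how such a two-state hitter might be simulated.  The review does not give the authors' parameters, so everything here is my own assumption for illustration: the function name, 80 games, 4 at-bats per game, a .350 average when hot and .200 when cold, and a `p_stay` parameter for how likely the hitter is to keep the same state from one game to the next (setting it to 0.5 makes each game's state an independent coin flip).

```python
import random

def simulate_streaky_hitter(n_games=80, at_bats=4, avg_hot=0.350,
                            avg_cold=0.200, p_stay=0.9, seed=None):
    """Return hits per game for a hypothetical two-state (hot/cold) hitter.

    All default values are illustrative assumptions, not the book's settings.
    """
    rng = random.Random(seed)
    is_hot = rng.random() < 0.5  # start hot or cold with equal chance
    hits_per_game = []
    for _ in range(n_games):
        avg = avg_hot if is_hot else avg_cold
        # Each at-bat is an independent trial at the current state's average.
        hits = sum(rng.random() < avg for _ in range(at_bats))
        hits_per_game.append(hits)
        # Switch state for the next game with probability 1 - p_stay.
        if rng.random() >= p_stay:
            is_hot = not is_hot
    return hits_per_game

# Eight fictional hitters, one fixed seed each so the run is reproducible.
fictional_hitters = [simulate_streaky_hitter(seed=s) for s in range(8)]
```

Each element of `fictional_hitters` is one simulated half-season that could be charted next to the real player's game log.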


Here, we want to know whether the graphs are similar; in my graphs of suicide locations, I asked whether the actual data differed from a random pattern.  In general, I find it easier to see differences than similarities.  In both cases, visual inspection of the charts is not sufficient.  We must use some tests (possibly statistical tests) to help confirm our intuition.  The authors picked these metrics: Max - Min, Number of long streaks (8 or longer), Number of runs, Number of 0-hit games and Number of 3+ hit games.  (A "run" is a string of consecutive 0-hit games or consecutive games with at least 1 hit.)  These are all measures of dispersion or extreme values.
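
As a rough illustration of the metrics just listed, the sketch below computes four of them from a list of hits per game.  The function name and the toy game log are my own, not the authors'.  I interpret "long streaks" as runs of 8 or more games of either kind, and I leave out Max - Min because the description above does not say which series it is taken over.

```python
from itertools import groupby

def streak_metrics(hits_per_game, long_streak=8):
    """Summarize a game log using the run definition quoted above.

    A run is a maximal string of consecutive 0-hit games or of
    consecutive games with at least one hit.
    """
    had_hit = [h > 0 for h in hits_per_game]
    # groupby splits the sequence into maximal blocks of equal values.
    run_lengths = [len(list(block)) for _, block in groupby(had_hit)]
    return {
        "runs": len(run_lengths),
        "long_streaks": sum(1 for n in run_lengths if n >= long_streak),
        "zero_hit_games": sum(1 for h in hits_per_game if h == 0),
        "three_plus_hit_games": sum(1 for h in hits_per_game if h >= 3),
    }

# Toy log: two hitless games, two games with hits, one hitless game,
# then nine straight games with at least one hit.
toy_log = [0, 0, 1, 2, 0, 3, 1, 1, 1, 1, 1, 1, 1, 1]
toy_metrics = streak_metrics(toy_log)
```

For the toy log this gives 4 runs, 1 long streak, 3 hitless games and 1 game with three or more hits.  Computing the same numbers for the real hitter and for each simulated hitter turns the eyeball comparison of charts into a comparison of numbers.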

My first review can be found here.


