


I think you misinterpreted my remark, so I'll try to explain myself better. One possible predictor for the temperature is simply the average temperature. This is obviously a pretty bad predictor, as it doesn't take into account any extra information we have about a particular day. However, the mean (or median) error of this predictor will be very close to 0 (exactly 0 if we know the true mean), meaning that it would appear to be a good predictor on this scale. Or have I missed something?
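To make the point above concrete, here is a minimal sketch in Python. The temperature series is made up for illustration (a seasonal curve plus random noise, not real forecast data), and the variable names are hypothetical. It shows that a constant "predict the average" forecaster has a mean error of essentially zero while still missing by many degrees on a typical day.

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical daily high temperatures for one year (made-up data):
# a seasonal sine curve plus day-to-day noise.
temps = [
    60 + 20 * math.sin(2 * math.pi * day / 365) + random.gauss(0, 5)
    for day in range(365)
]

# The "predict the average every day" forecaster.
mean_temp = statistics.fmean(temps)
errors = [mean_temp - t for t in temps]

# Mean error (bias) is zero by construction when the mean is known.
mean_error = statistics.fmean(errors)

# Mean absolute error shows how far off the forecast is on a typical day.
mean_abs_error = statistics.fmean([abs(e) for e in errors])

print(f"mean error:          {mean_error:.6f}")
print(f"mean absolute error: {mean_abs_error:.2f}")
```

The mean error alone cannot distinguish this forecaster from a genuinely good one; a measure of spread, such as the mean absolute error, is what exposes it.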


I did misinterpret your original comment. I'd add a nuance to your new comment... these forecasters are predicting the highs and lows for each day, which means they are predicting the extreme values of a distribution, rather than the central values.

It's an intriguing problem: the historical time series of high temperatures consists entirely of extreme values, so what would be the best way to derive a prediction from such a series? I reckon the within-series variation is so high that the strategy of predicting the average of the time series would not work that well.


There are other ways than the median to change the weight of extremes in the mean, such as the geometric, harmonic or logarithmic means. What about them?
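As a rough illustration of the question above, the sketch below compares a few of these averages on a small made-up list of daily highs that contains one extreme value. It uses Python's standard `statistics` module; the data are hypothetical. Note that the geometric and harmonic means only make sense for strictly positive values, which is an assumption that temperatures in Fahrenheit or Celsius do not always satisfy.

```python
import statistics

# Hypothetical daily highs (made-up numbers) with one unusually hot day.
highs = [50.0, 52.0, 55.0, 58.0, 95.0]

arithmetic = statistics.fmean(highs)
median = statistics.median(highs)
geometric = statistics.geometric_mean(highs)   # requires positive values
harmonic = statistics.harmonic_mean(highs)     # requires positive values

print(f"arithmetic mean: {arithmetic:.2f}")
print(f"geometric mean:  {geometric:.2f}")
print(f"harmonic mean:   {harmonic:.2f}")
print(f"median:          {median:.2f}")
```

For positive data the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean, so both alternatives pull the summary away from a high outlier, though less sharply than the median does here.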


The mean or median error only determines the bias of the predictor; what is important is the variance of the error. As Hadley pointed out, using the mean minimum as the predictor will have zero bias but is a really poor predictor. It is quite possible that a biased predictor has a lower error variance, but for a minimum this would likely be a positive bias. The bias toward low values is probably due to the greater consequences of unexpectedly low temperatures, which can cause safety problems.

That we are estimating the extreme of a distribution is irrelevant: the extreme has its own distribution. After all, it is just a series of numbers.

I would look at two things besides the bias: the standard deviation of the errors about zero (that is, the root-mean-square error), as this also includes the bias, and the frequency with which the prediction differs from the true value by some number of degrees or more.
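The measures suggested above could be computed along these lines. This is only a sketch: the function name `summarize_errors`, the 3-degree tolerance, and the sample numbers are all assumptions made for illustration, not anything taken from the forecasts discussed in the post.

```python
import math
import statistics


def summarize_errors(predicted, actual, tolerance=3.0):
    """Summarize forecast errors (predicted minus actual).

    `tolerance` is a hypothetical threshold in degrees; an error larger
    than this in absolute value counts as a miss.
    """
    errors = [p - a for p, a in zip(predicted, actual)]
    bias = statistics.fmean(errors)                      # mean error
    sd = statistics.pstdev(errors)                       # spread about the bias
    rmse = math.sqrt(statistics.fmean([e * e for e in errors]))  # spread about zero
    miss_rate = sum(abs(e) > tolerance for e in errors) / len(errors)
    return {"bias": bias, "sd": sd, "rmse": rmse, "miss_rate": miss_rate}


# Made-up daily lows and forecasts that run a little cold.
actual = [30, 35, 28, 40, 33]
predicted = [28, 33, 27, 36, 31]

summary = summarize_errors(predicted, actual)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The root-mean-square error combines both components, since its square equals the squared bias plus the error variance, whereas the standard deviation taken about the mean error leaves the bias out.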


Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.