


I think you misinterpreted my remark - I'll try and explain myself better: One possible predictor for the temperature is simply to use the average temperature. This is obviously a pretty bad predictor, as it doesn't take into account any extra information we have about a particular day. However, the mean (or median) error of this predictor will be very close to 0 (the mean error is exactly 0 if we know the true mean), meaning that it would appear to be a good predictor on this scale. Or have I missed something?
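A minimal sketch of this point in Python, using synthetic temperatures. The seasonal cycle, the noise level, and the helper names `simulate_highs`, `mean_error` and `mean_abs_error` are all made up for illustration; none of this is data from the post.

```python
import math
import random
import statistics


def simulate_highs(n_days=365, seed=0):
    """Hypothetical daily highs: a seasonal sine wave plus Gaussian noise."""
    rng = random.Random(seed)
    return [
        60 + 20 * math.sin(2 * math.pi * day / 365) + rng.gauss(0, 5)
        for day in range(n_days)
    ]


def mean_error(predicted, actual):
    """Average signed error (the bias of the predictor)."""
    return statistics.fmean(p - a for p, a in zip(predicted, actual))


def mean_abs_error(predicted, actual):
    """Average size of the error, ignoring its sign."""
    return statistics.fmean(abs(p - a) for p, a in zip(predicted, actual))


highs = simulate_highs()

# Predict the same value every day: the overall average high.
constant_forecast = [statistics.fmean(highs)] * len(highs)

print("mean error:         ", mean_error(constant_forecast, highs))
print("mean absolute error:", mean_abs_error(constant_forecast, highs))
```

Under these assumptions the signed errors cancel out almost exactly, while the typical miss is still many degrees, so a near-zero mean error on its own says little about how good the predictor is.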


I did misinterpret your original comment. I'd add a nuance to your new comment... these forecasters are predicting the highs and lows for each day, which means they are predicting the extreme values of a distribution, rather than the central values.

It's an intriguing problem: the historical time series of high temperatures are all extreme values; what would be the best way to derive a prediction from such a series? I reckon the variation within the series is so high that the strategy of predicting the average of the time series would not work that well.


There are other ways of changing the weight of extremes in the mean besides using the median, such as the geometric, harmonic or logarithmic means. What about them?
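For a rough feel of how these alternatives behave, here is a small Python comparison on invented values with one unusually high reading. It assumes strictly positive inputs, which the geometric and harmonic means require, so whether it applies at all depends on the temperature scale in use. The logarithmic mean is left out because it is usually defined for just two values.

```python
import statistics

# Hypothetical highs with one extreme day; the values are invented.
temps = [50, 52, 55, 58, 90]

arithmetic = statistics.fmean(temps)
median = statistics.median(temps)
geometric = statistics.geometric_mean(temps)   # requires positive values
harmonic = statistics.harmonic_mean(temps)     # requires positive values

for name, value in [
    ("arithmetic", arithmetic),
    ("geometric", geometric),
    ("harmonic", harmonic),
    ("median", median),
]:
    print(f"{name:>10}: {value:.2f}")
```

For positive data the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean, so these alternatives only damp unusually high values; they give more weight, not less, to unusually low ones.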


The mean or median error only determines the bias of the predictor; what is important is the variance of the error. As Hadley pointed out, using the mean minimum as the predictor will have zero bias but is a really poor predictor. It is quite possible for a biased predictor to have a lower error variance, but for a minimum this would likely be a positive bias. The bias toward low values is probably due to the greater consequences of unexpectedly low temperatures, which cause safety problems.
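The trade-off between bias and error variance can be written out explicitly. The symbols are my own notation, not anything from the post: $T$ is the observed temperature, $\hat{T}$ the forecast, and $e = \hat{T} - T$ the forecast error.

```latex
\mathbb{E}\left[e^{2}\right]
  = \underbrace{\left(\mathbb{E}[e]\right)^{2}}_{\text{bias}^{2}}
  + \underbrace{\operatorname{Var}(e)}_{\text{error variance}}
```

A predictor with zero bias can still have a large mean squared error if its error variance is large, and a slightly biased predictor can come out ahead if its variance is small enough.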

That we are estimating the extreme of a distribution is irrelevant: the extreme has its own distribution. After all, it is just a series of numbers.

I would look at two things as well as the bias: the spread of the errors measured about zero (the root-mean-square error), as this also includes the bias, and the frequency with which the prediction differs from the true value by more than some number of degrees.
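These checks can be sketched in a few lines of Python. The function name `forecast_summary`, the 3-degree tolerance, and the sample numbers are all hypothetical, chosen only to show the calculation.

```python
import math
import statistics


def forecast_summary(predicted, actual, tolerance=3.0):
    """Summarise forecast errors: bias, spread, and rate of large misses."""
    errors = [p - a for p, a in zip(predicted, actual)]
    bias = statistics.fmean(errors)
    # Spread of the errors around their own mean (does not include the bias).
    sd = statistics.pstdev(errors)
    # Spread of the errors around zero (includes the bias).
    rmse = math.sqrt(statistics.fmean(e * e for e in errors))
    # Share of forecasts that miss by more than `tolerance` degrees.
    miss_rate = sum(abs(e) > tolerance for e in errors) / len(errors)
    return {"bias": bias, "sd": sd, "rmse": rmse, "miss_rate": miss_rate}


# Invented forecasts and observations, for illustration only.
predicted = [70, 65, 58, 61, 75]
observed = [72, 64, 63, 60, 71]

summary = forecast_summary(predicted, observed)
for name, value in summary.items():
    print(f"{name:>9}: {value:.3f}")
```

The squared root-mean-square error equals the squared bias plus the squared standard deviation, which is why the two spread measures agree only when the bias is zero.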



Marketing analytics and data visualization expert. Author and Speaker. Currently at Columbia.


Graphics design by Amanda Lee
