Earlier in the month, Prof. Gelman linked to Brandon's fascinating analysis of on-line weather forecasting accuracy. I have done some additional analysis of the data and the result can be visualized as follows.
I'll concentrate my comments on three observations:
- CNN was the clear winner in forecasting accuracy during this period based on two criteria: its median error in forecasting daily lows and its median error in forecasting daily highs. Moreover, both median errors were zero, which gives us confidence in its accuracy. The Weather Channel (TWC) and Intellicast (INT) were not far behind.
- Forecasts of daily highs were more accurate across the board than forecasts of daily lows (the exception being the BBC). I am not sure why this should be so.
- Overall, our weather forecasters were much too risk-averse. Notice that the errors were heavily concentrated in the lower-left quadrant. A negative error on low temperatures means the predicted low is higher than the actual low; a negative error on high temperatures means the predicted high is lower than the actual high. Taking these together, we see that the range of actual temperatures has generally been larger than the range of predicted temperatures! No one was willing to go out on a limb, so to speak, to forecast extremes (see the sketch after this list).
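To make these sign conventions concrete, here is a minimal sketch in Python. The data frame, its column names, and the toy numbers are all hypothetical stand-ins for illustration, not Brandon's actual data or schema.

```python
import pandas as pd

# Toy numbers invented for illustration only; columns are assumed, not Brandon's schema.
df = pd.DataFrame({
    "source":         ["CNN", "CNN", "CNN", "TWC", "INT", "BBC"],
    "predicted_low":  [41, 38, 40, 43, 39, 45],
    "actual_low":     [41, 36, 40, 41, 38, 44],
    "predicted_high": [55, 52, 53, 54, 53, 50],
    "actual_high":    [55, 54, 53, 56, 55, 53],
})

# Sign conventions from the text:
#   negative low error  -> predicted low is HIGHER than actual low
#   negative high error -> predicted high is LOWER than actual high
df["low_error"] = df["actual_low"] - df["predicted_low"]
df["high_error"] = df["predicted_high"] - df["actual_high"]

# Median errors per source (CNN's medians of zero were the winning criteria).
print(df.groupby("source")[["low_error", "high_error"]].median())

# Share of days in the lower-left quadrant: both errors negative,
# i.e. the predicted range sits strictly inside the actual range.
risk_averse = (df["low_error"] < 0) & (df["high_error"] < 0)
print(f"lower-left quadrant share: {risk_averse.mean():.0%}")
```

Under these conventions, a point in the lower-left quadrant is precisely a day on which the forecast range was narrower than what actually happened.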
Actually, I believe this inability or unwillingness to forecast extreme values is endemic to all forecasting methodologies.
Before closing, I should mention that the graph was based on a subset of Brandon's data: I considered only same-day forecasts and excluded Unisys (because they didn't provide forecasts for lows). Note also that breaks in the time series may introduce some bias. Finally, unlike Brandon, I retained the sign of each error rather than taking absolute values.
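For completeness, the subsetting steps might look something like the sketch below; the file name and column names are assumptions, since I don't reproduce Brandon's exact format here.

```python
import pandas as pd

# Hypothetical file and column names standing in for Brandon's data set.
raw = pd.read_csv("weather_forecasts.csv")

# Same-day forecasts only (lead time of zero days), and no Unisys,
# since Unisys didn't provide forecasts for lows.
mask = (raw["days_ahead"] == 0) & (raw["source"] != "Unisys")
subset = raw.loc[mask].copy()

# Retain signed errors (no absolute values), per the conventions above.
subset["low_error"] = subset["actual_low"] - subset["predicted_low"]
subset["high_error"] = subset["predicted_high"] - subset["actual_high"]
```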