Statistical Power – A Neglected Topic in Forecast Verification?

 

Ian Jolliffe

University of Aberdeen

 

and

 

Jacqueline Potts

Biomathematics and Statistics Scotland

 

There are many measures of the accuracy or skill of a forecast or set of forecasts. Values of such measures are often quoted or compared without accompanying information on the statistical significance of the value relative to a baseline, or of the difference between values for two or more forecasts. Some progress has been made, and more is possible, in evaluating such significance, although temporal and spatial correlation often make this a non-trivial task. The idea behind statistical significance testing is to guard against false claims of good forecast performance. The other side of the coin is the power of a measure: its ability to detect a change in forecast performance relative to a baseline when a real improvement has been achieved. Very little has been done on this subject. Here we present some results comparing the power of standard performance measures (weighted and unweighted mean square error, correlation and anomaly correlation) for spatial data.
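
To make the notion of power concrete, the following is a minimal Monte Carlo sketch and not the method used in this work. It assumes, purely for illustration, that forecast errors at the grid points are independent and Gaussian (which ignores the spatial correlation mentioned above), and it uses a one-sided paired test on the difference in mean square error with a normal approximation for the critical value. The function name `estimate_power` and all parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def estimate_power(n_points=100, sd_baseline=1.0, sd_improved=0.9,
                   alpha=0.05, n_trials=1000, seed=0):
    """Estimate how often a one-sided paired test on MSE detects an improvement.

    Assumptions (illustrative only): errors are independent across grid
    points and Gaussian; the test statistic is compared with a normal
    critical value rather than a t critical value.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    rejections = 0
    for _ in range(n_trials):
        # Difference in squared error (baseline minus improved) at each point.
        diffs = []
        for _ in range(n_points):
            err_base = rng.gauss(0.0, sd_baseline)
            err_new = rng.gauss(0.0, sd_improved)
            diffs.append(err_base ** 2 - err_new ** 2)
        # Standardised mean difference; large positive values favour the new forecast.
        z = mean(diffs) / (stdev(diffs) / math.sqrt(n_points))
        if z > z_crit:
            rejections += 1
    # Fraction of simulated data sets in which the improvement was declared significant.
    return rejections / n_trials


if __name__ == "__main__":
    # With equal error spreads this estimates the false-alarm rate of the test;
    # with a smaller spread for the new forecast it estimates the power.
    print("no real improvement:", estimate_power(sd_improved=1.0))
    print("modest improvement: ", estimate_power(sd_improved=0.9))
    print("large improvement:  ", estimate_power(sd_improved=0.5))
```

Swapping the squared-error difference for another measure, such as correlation with the observations, and repeating the simulation would give a like-for-like comparison of power between measures under the same simulated improvement.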