Statistical Power – A Neglected Topic in Forecast Verification?
Ian Jolliffe
University of Aberdeen
and
Jacqueline Potts
Biomathematics and Statistics Scotland

There are many measures of the accuracy or skill of a
forecast or set of forecasts. Values of such measures are often quoted or
compared without accompanying information on the statistical significance of
the value relative to a baseline, or of the difference between values for two
or more forecasts. Some progress can be, and has been, made in evaluating such
significance, although temporal and spatial correlation often makes this a
non-trivial task. The idea behind statistical significance testing is to guard
against false claims of good performance of forecasts. The other side of the
coin is the power of measures to detect changes in performance of forecasts
relative to a baseline when real improvements have been achieved. Very little
has been done on this subject. Here we present some results comparing the power
of standard performance measures (weighted and unweighted mean square error,
correlation and anomaly correlation) for spatial data.
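
As an illustration only, and not the authors' method, the notion of power can be sketched with a small Monte Carlo experiment. The sketch assumes independent grid points (no spatial correlation), standard normal observations, and forecasts equal to the observations plus Gaussian error; the function names, error standard deviations and sample sizes are all hypothetical choices. For each measure it simulates the null distribution of the improvement over a baseline of equal quality, takes a one-sided critical value, and then estimates how often a genuinely better forecast exceeds that value.

```python
import math
import random


def mse(forecast, obs):
    """Mean square error (lower is better)."""
    return sum((f - o) ** 2 for f, o in zip(forecast, obs)) / len(obs)


def correlation(forecast, obs):
    """Pearson correlation between forecast and observations (higher is better)."""
    n = len(obs)
    mf = sum(forecast) / n
    mo = sum(obs) / n
    cov = sum((f - mf) * (o - mo) for f, o in zip(forecast, obs))
    var_f = sum((f - mf) ** 2 for f in forecast)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(var_f * var_o)


def improvement_samples(score, higher_is_better, sigma_new, sigma_base,
                        n_points, n_sims, rng):
    """Simulate the improvement of a new forecast over a baseline.

    Assumption: grid points are independent and forecast errors are
    Gaussian with standard deviations sigma_new and sigma_base.
    Positive values always mean the new forecast scored better.
    """
    samples = []
    for _ in range(n_sims):
        obs = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        base = [o + rng.gauss(0.0, sigma_base) for o in obs]
        new = [o + rng.gauss(0.0, sigma_new) for o in obs]
        diff = score(new, obs) - score(base, obs)
        samples.append(diff if higher_is_better else -diff)
    return samples


def estimate_power(score, higher_is_better, sigma_base=1.0, sigma_new=0.8,
                   n_points=100, n_sims=1000, alpha=0.05, seed=0):
    """Estimate the power of a one-sided test for improvement in `score`."""
    rng = random.Random(seed)
    # Null case: the "new" forecast is no better than the baseline.
    null = sorted(improvement_samples(score, higher_is_better, sigma_base,
                                      sigma_base, n_points, n_sims, rng))
    # Empirical (1 - alpha) quantile of the null distribution.
    critical = null[min(n_sims - 1, math.ceil((1 - alpha) * n_sims))]
    # Alternative case: the new forecast has genuinely smaller errors.
    alt = improvement_samples(score, higher_is_better, sigma_new,
                              sigma_base, n_points, n_sims, rng)
    return sum(a > critical for a in alt) / n_sims


if __name__ == "__main__":
    print("Estimated power, MSE:        ", estimate_power(mse, False))
    print("Estimated power, correlation:", estimate_power(correlation, True))
```

Setting sigma_new equal to sigma_base should return a value close to alpha, which gives a quick check on the simulated critical value. Spatially correlated fields, weighted mean square error and anomaly correlation would require a more realistic data generator than this sketch provides.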