Verification of Spatial Forecasts

Beth Ebert

Bureau of Meteorology Research Centre, Melbourne, Australia

Verification of spatial forecasts (rainfall or temperature fields, for example) attempts to answer two fundamental questions: how good is this forecast, and does it look right? Viewing maps of the forecast and observed fields side by side is one approach, but quantifying the accuracy of the forecast requires objective verification techniques.

Many verification approaches exist, each giving a different piece of information about the quality of the spatial forecast. Commonly used error measures such as the mean difference, mean absolute error, and root mean square error measure the average bias and error magnitude. The correlation coefficient measures the correspondence between the forecast and observed spatial patterns, regardless of bias. Categorical statistics based on the event/no-event contingency table, such as bias score, probability of detection, false alarm ratio, and equitable threat score, measure the extent to which forecast events (rain occurrence, for example) correspond to observed events. The S1 skill score is often used in numerical weather prediction (NWP) to measure the skill of forecast pressure patterns. While these scores all provide objective measures of forecast skill, they may not clearly reveal what is going wrong in the forecast.
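
As a rough illustration of the standard scores named above, the following sketch computes the continuous error measures and the contingency-table statistics for paired forecast and observed values. The function names, the flattened-list input, and the "value at or above a threshold counts as an event" rule are assumptions made for this example, not something the abstract specifies.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def continuous_scores(forecast, observed):
    """Mean error (bias), MAE, RMSE and correlation for paired grid values."""
    n = len(forecast)
    errors = [f - o for f, o in zip(forecast, observed)]
    f_mean = sum(forecast) / n
    o_mean = sum(observed) / n
    covariance = sum((f - f_mean) * (o - o_mean) for f, o in zip(forecast, observed))
    f_var = sum((f - f_mean) ** 2 for f in forecast)
    o_var = sum((o - o_mean) ** 2 for o in observed)
    return {
        "mean_error": sum(errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        # Undefined (NaN) when either field is constant.
        "correlation": _ratio(covariance, math.sqrt(f_var * o_var)),
    }


def categorical_scores(forecast, observed, threshold):
    """Contingency-table scores; an event is a value >= threshold (assumed)."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    n = hits + misses + false_alarms + correct_negatives
    # Number of hits expected from a random forecast with the same event counts.
    hits_random = (hits + misses) * (hits + false_alarms) / n
    return {
        "bias_score": _ratio(hits + false_alarms, hits + misses),
        "pod": _ratio(hits, hits + misses),
        "far": _ratio(false_alarms, hits + false_alarms),
        "ets": _ratio(hits - hits_random,
                      hits + misses + false_alarms - hits_random),
    }


# Hypothetical rainfall values (mm) at eight grid points.
forecast = [0, 0, 2, 5, 8, 1, 0, 3]
observed = [0, 1, 0, 6, 4, 2, 0, 0]
cont = continuous_scores(forecast, observed)
cat = categorical_scores(forecast, observed, threshold=1.0)
```

For these made-up values the forecast has a mean error of 0.75 mm and an RMSE of 2.0 mm, and it overforecasts rain occurrence (bias score 1.25) while detecting three of the four observed events.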

In an effort to make the verification more intuitive, pattern recognition methods have recently been developed that determine the displacement, shape, and intensity errors of the forecast field. The total error can be decomposed into these three components, giving hints as to whether the errors are due mainly to dynamical or physical processes in the forecast model.
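
The abstract does not give formulas for this decomposition, so the sketch below is only one plausible way to realize it, under stated assumptions: the forecast is translated (no rotation) over a small search window to the position that minimizes the mean squared error, the displacement error is the MSE reduction gained by that shift, the intensity error is the squared difference of the field means after the shift, and the shape error is whatever remains. All function names and the zero-fill behaviour at the domain edge are hypothetical choices for illustration.

```python
def shift(field, dy, dx):
    """Translate a 2-D field by (dy, dx) grid points, filling gaps with 0."""
    rows, cols = len(field), len(field[0])
    shifted = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < rows and 0 <= src_x < cols:
                shifted[y][x] = field[src_y][src_x]
    return shifted


def mean(field):
    """Domain-average value of a 2-D field."""
    values = [v for row in field for v in row]
    return sum(values) / len(values)


def mse(a, b):
    """Mean squared difference between two 2-D fields of equal shape."""
    diffs = [(va - vb) ** 2 for ra, rb in zip(a, b) for va, vb in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def decompose_error(forecast, observed, max_shift=2):
    """Split total MSE into displacement, intensity and shape parts (assumed form)."""
    candidates = [(dy, dx)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    # Best match: the translation of the forecast that minimizes MSE.
    best = min(candidates, key=lambda s: mse(shift(forecast, *s), observed))
    shifted = shift(forecast, *best)
    mse_total = mse(forecast, observed)
    mse_shifted = mse(shifted, observed)
    intensity = (mean(shifted) - mean(observed)) ** 2
    return {
        "best_shift": best,
        "total": mse_total,
        "displacement": mse_total - mse_shifted,
        "intensity": intensity,
        "shape": mse_shifted - intensity,
    }


# Hypothetical 5x5 fields: the forecast feature is misplaced and too intense.
observed = [[0.0] * 5 for _ in range(5)]
observed[2][2], observed[2][3] = 4.0, 2.0
forecast = [[0.0] * 5 for _ in range(5)]
forecast[1][0], forecast[1][1] = 6.0, 3.0

result = decompose_error(forecast, observed)
```

In this toy case most of the total error comes from the misplaced feature, which is the kind of hint the decomposition is meant to provide.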

The "standard" scores have served the meteorological community well for many years in quantifying the skill and improvement in the prediction of large-scale fields. However, as models move to ever finer resolution, and radar and satellite data become increasingly available for fine-scale verification, the standard scores become less useful. A forecast for mesoscale convection that is clearly useful to a forecaster may verify quite poorly because the predicted storms did not line up well with the observations. A smooth field almost always scores better than a variable one that qualitatively looks more realistic. The observations themselves may have large uncertainties due to sampling and measurement errors. Probabilistic and spectral verification approaches may be more appropriate for these "messy" situations.
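
A small made-up example shows why a smooth field tends to win on a standard score. The one-dimensional "fields" below are hypothetical: the observation has a single sharp storm, one forecast predicts the same storm three grid points away, and the other spreads the same total amount evenly over the domain. The displaced storm is penalized twice, once where it was forecast but not observed and once where it was observed but not forecast.

```python
import math


def rmse(forecast, observed):
    """Root mean square error between paired values."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical rainfall (mm) along a line of ten grid points.
observed = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
displaced = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0]  # right storm, wrong place
smooth = [1.0] * 10  # same total rain, no structure at all

rmse_displaced = rmse(displaced, observed)
rmse_smooth = rmse(smooth, observed)
print(f"displaced storm RMSE: {rmse_displaced:.2f}")
print(f"smooth field RMSE:    {rmse_smooth:.2f}")
```

Here the featureless forecast obtains the lower RMSE even though the displaced forecast captures the storm's existence and intensity, which a forecaster would likely find more useful.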