M.
Statistics applications and forecast verification
1.
Background
The RAP Verification Group continued to provide independent verification
of improved forecasting systems for aviation weather developed by NCAR
and other laboratories. RAP works closely with other verification
groups [e.g., the Real-Time Verification System group at NOAA's Forecast
Systems Laboratory (FSL)] to evaluate the forecasting capabilities
of experimental products and products being considered for operational
use. A major study in 2002 involved evaluation of the Integrated Turbulence
Forecasting Algorithm, which is going through the NWS and FAA approval
process.
Because aviation weather forecasts are presented in varying formats and
frequencies, and because the phenomena of concern (e.g., icing, turbulence) can
be difficult to observe, the Verification Group devoted considerable
effort to developing methods and learning how to use the available
observations appropriately. In addition, it often
is difficult to find meaningful verification measures that provide
useful information for forecast users and developers. Many of the
verification issues are pervasive in meteorological forecasting and
have become more important as forecast grids have become finer in
scale. To help cope with these issues, the RAP Verification Group
co-hosted a workshop on verification, titled "Making Verification
More Meaningful," at which many of these issues were discussed.
In addition, the Verification Group continues to work on development
of improved verification approaches for convective and precipitation
forecasts. The workshop and development of an object-based verification
approach for convective/precipitation forecasts are the subjects of
the following two subsections.
B. Brown and T. Fowler also are involved in a different application
of statistics to the atmospheric sciences,
in collaboration with E. Tollerud at FSL. Their study concerns development
of an approach to identify changes (i.e., inhomogeneities) in precipitation
observations. Because daily observations are used in many applications
(some of which are economically sensitive), it is important to alert
users if characteristics of the observations change because of an unreported
change in station location, growth of vegetation around the precipitation
gauge, or some other factor. The third subsection describes ongoing
work on this study.
2.
Workshop on "Making Verification More Meaningful"
This
workshop was conceived as a way to bring together verification experts
who have similar problems (e.g., difficult observations, gridded forecasts,
a need for operationally-meaningful metrics), to provide opportunities
for discussion of specific issues, and to develop new collaborations.
The workshop was organized by Barbara Brown, Tressa Fowler, and Agnes
Takacs, all of RAP, in collaboration with Jennifer Mahoney of FSL.
In addition, members of the RAP Verification Group (J. Braid, R. Bullock,
and M. Chapman) and RAP administrative staff (I. Gallo and C. Park)
facilitated the workshop preparations and event.

Figure 1. Attendees at the Workshop on Making
Verification More Meaningful.
The
workshop proved to be more popular than anticipated, with approximately
90 participants, including meteorologists, hydrologists, statisticians,
mathematicians, researchers, and operational staff members from several
countries, from weather services, universities, and research institutes
(Figure 1). The workshop included several
components: invited speakers, contributed talks, a poster session,
working group meetings and reports, and a panel discussion. These
components focused on three general themes: User and Operational Issues;
Scaling and Observations; and Advanced Methods (including ensemble
methods). Most of the individual and working group presentations are
available on the workshop web page (http://www.rap.ucar.edu/research/verification/ver_wkshp1.html),
along with the workshop program. A summary report on the workshop
also is available. A number of conclusions can be drawn from the workshop
presentations and discussions:
3.
New verification approaches for precipitation and convective weather
Standard
approaches for verification of forecasts of convection and precipitation
generally have relied on overlaying grids of observations and grids
of forecasts; the individual grid values for the two fields are compared
and statistics such as the POD (Probability of Detection), FAR (False
Alarm Ratio), and CSI (Critical Success Index) are computed. Unfortunately,
these measures provide little information that would help improve
the forecasts. Moreover, these measures can unfairly penalize forecasts
that should be considered "good" (e.g., a forecast area
located adjacent to the observed area has no skill according to these
measures).
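The grid-overlay computation described above can be sketched as follows. This is a minimal illustration, assuming the forecast and observed fields have already been reduced to yes/no (1/0) grids of equal shape; the function names and the small example grids are hypothetical and are not taken from the RAP verification software.

```python
def contingency_counts(forecast, observed):
    """Count hits, misses, and false alarms over two equal-shape 0/1 grids."""
    hits = misses = false_alarms = 0
    for f_row, o_row in zip(forecast, observed):
        for f, o in zip(f_row, o_row):
            if f and o:
                hits += 1
            elif o:
                misses += 1
            elif f:
                false_alarms += 1
    return hits, misses, false_alarms


def pod_far_csi(forecast, observed):
    """Return (POD, FAR, CSI); a score is None when its denominator is zero."""
    h, m, fa = contingency_counts(forecast, observed)
    pod = h / (h + m) if (h + m) else None
    far = fa / (h + fa) if (h + fa) else None
    csi = h / (h + m + fa) if (h + m + fa) else None
    return pod, far, csi


# Hypothetical 4x6 grids: rain observed in columns 1-2 of every row.
observed = [[0, 1, 1, 0, 0, 0] for _ in range(4)]
# Forecast of the right size and shape, displaced to the adjacent columns 3-4.
shifted = [[0, 0, 0, 1, 1, 0] for _ in range(4)]
# Forecast displaced by only one column, so half of the area overlaps.
overlap = [[0, 0, 1, 1, 0, 0] for _ in range(4)]

print(pod_far_csi(shifted, observed))  # (0.0, 1.0, 0.0): no hits at all
print(pod_far_csi(overlap, observed))  # POD 0.5, FAR 0.5, CSI 1/3
```

The displaced forecast has the correct size and shape, yet it scores POD = 0, FAR = 1, and CSI = 0, which is the penalty described above.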
In
response to concerns about the limitations of standard verification
approaches for verification of convective and precipitation forecasts,
B. Brown, R. Bullock, and C. Mueller of RAP, along with C. Davis (MMM),
K. Manning (MMM), and R. Morss (MMM and ESIG) are developing an object-based
approach for these evaluations. The goals of this project are to develop
and test new approaches for verification of convective and precipitation
forecasts; characterize precipitation/convective regions in a "natural"
way; tie the verification method development to user studies; and
apply the
approaches developed to nowcasts and NWP forecasts.
The
proposed approach is an adaptable method that is based on attributes
of precipitation objects/shapes and their associated precipitation
values. It will provide the capability to answer a variety of questions
about the forecasts, observations, and their relationship. Specifically,
the approach involves several steps: (1) define the relevant precipitation/convective
objects and shapes; (2) diagnose errors in the location, shape, orientation,
size, timing, etc. of the forecasts; and (3) characterize basic attributes
of the precipitation/convection within the objects (e.g., intensity,
density, etc.). In parallel, R. Morss is investigating users' needs
for precipitation information and information about precipitation
forecast quality, through interviews with flood control managers,
emergency managers, and water resource managers in the Colorado Front
Range.
One
approach for identifying a forecast region of interest is shown in
Figure 2. In this approach, the forecast
region is "smoothed" using a convolving disk. A threshold
value is then applied to filter out regions that are not of interest.
The same approach would be applied to the observations. Because the
original values on the grid are still available, it is possible to
directly compare characteristics of the values inside the objects.
An example of such a comparison is shown in Figure
3.
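The smoothing and thresholding steps can be sketched as follows. This is a minimal illustration under stated assumptions: the convolving disk is taken to be a simple unweighted average over grid points within a given radius, points off the grid are treated as zero, and the radius, threshold, function names, and example field are all hypothetical choices rather than the values used in the study.

```python
def disk_offsets(radius):
    """Grid offsets (di, dj) that fall inside a disk of the given radius."""
    r = int(radius)
    return [(di, dj)
            for di in range(-r, r + 1)
            for dj in range(-r, r + 1)
            if di * di + dj * dj <= radius * radius]


def convolve_disk(field, radius):
    """Average each grid point over a disk; points off the grid count as zero."""
    rows, cols = len(field), len(field[0])
    offsets = disk_offsets(radius)
    smoothed = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for di, dj in offsets:
                ii, jj = i + di, j + dj
                if 0 <= ii < rows and 0 <= jj < cols:
                    total += field[ii][jj]
            smoothed[i][j] = total / len(offsets)
    return smoothed


def object_mask(field, radius, threshold):
    """Mark (1) the grid points whose smoothed value reaches the threshold."""
    smoothed = convolve_disk(field, radius)
    return [[1 if v >= threshold else 0 for v in row] for row in smoothed]


def values_inside(field, mask):
    """Original (unsmoothed) values at the grid points inside the objects."""
    return [v for f_row, m_row in zip(field, mask)
            for v, m in zip(f_row, m_row) if m]


# Hypothetical 7x7 field (mm): a 3x3 rain area plus one isolated wet point.
field = [[0.0] * 7 for _ in range(7)]
for i in range(2, 5):
    for j in range(2, 5):
        field[i][j] = 10.0
field[0][0] = 10.0

mask = object_mask(field, radius=1, threshold=5.0)
inside = values_inside(field, mask)
print(sum(map(sum, mask)), "grid points retained")
```

In this example the isolated wet point is filtered out because its smoothed value falls below the threshold, while the original values inside the retained object remain available for comparison.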

(2a)

(2b)

(2c)
Figure
2. Example of an approach to defining "objects" in a precipitation
field: (a) the original precipitation field from the
Weather Research and Forecasting (WRF) modeling system; (b) the
smoothed WRF precipitation field after a convolving disk has been
applied; and (c) the final field after a threshold has been applied
to the convolved field.

(3a)

(3b)
Figure
3. An example of a defined (bandaid) object applied to an observed
(Stage 4 precipitation) field and to a WRF precipitation field: (a)
objects applied to the original (convolved and thresholded) fields,
with original (upper plots) and optimized (lower right) forecast location
and orientation; and (b) distributions of observed and forecast precipitation
values inside the matched shapes.
Ongoing
work on this study will include developing and testing the ability
to match forecast and observed objects. One aspect of this area of
research will involve investigating the scale of predictability of
different types of phenomena. In addition, the applications of the
research will expand to include nowcasts as well as the numerical
weather prediction forecasts considered thus far. This work was presented
at the USWRP Science Symposium in April, at the Workshop on Making
Verification More Meaningful in July, and at the WWRP International
Conference on Quantitative Precipitation Forecasting in September.
4.
Detecting inhomogeneities in precipitation observations
Detecting
actual changes in the amount or frequency of precipitation received
at a United States cooperative observer network (COOP) station requires
eliminating apparent "changes" that are the result of instrument
drift, alterations in method of measurement and/or reporting, modification
of the station's surroundings, etc. Change point detection is very
challenging even when the measurements are statistically well behaved,
with properties such as normality, continuity, and homogeneity of variance.
Precipitation data have none of these properties.
Precipitation occurs relatively infrequently, and when it does occur,
the measurements tend
to have a skewed distribution. For such measurements,
standard change point methods are not recommended.
Fortunately,
the COOP network is fairly dense. Each station has several neighbors
that also measure precipitation. This study [by B. Brown, T. Fowler,
and E. Tollerud (FSL)] takes advantage of these neighbors to
develop an alternative approach for detecting inhomogeneities. In
particular, the frequency and amount of precipitation at each station
are compared to its neighbors' values for each month over the entire
period of record. Thus the empirical distribution and time series
of various measures of association between stations are obtained.
New measurements can be compared to these to determine if the recent
measures are "typical" or not. If not, the station can be
flagged as possibly having experienced a change, and further checks
can be performed.
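The comparison of a new measure against its historical distribution can be sketched as follows. This is only an illustration of the idea: the one-sided 5% tail, the function names, and the score values are assumptions made for the example and do not come from the study.

```python
def empirical_percentile(history, value):
    """Fraction of historical scores less than or equal to the new value."""
    return sum(1 for h in history if h <= value) / len(history)


def flag_station(history, value, tail=0.05):
    """Flag the station when the new score lies in the lower tail of its history.

    The 5% lower-tail cutoff is an assumed, illustrative choice.
    """
    return empirical_percentile(history, value) <= tail


# Hypothetical association scores between a station and its neighbors,
# one per past season.
history = [0.42, 0.51, 0.47, 0.39, 0.55, 0.44, 0.48, 0.50, 0.41, 0.46,
           0.53, 0.45, 0.49, 0.43, 0.52, 0.40, 0.47, 0.54, 0.38, 0.46]

print(flag_station(history, 0.18))  # True: lower than every past season
print(flag_station(history, 0.45))  # False: well within the past range
```

A flagged station is only a candidate for further checks; an atypical score does not by itself establish that a change has occurred.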
Data
from spring seasons (April - June) were analyzed at all stations in
Iowa. In many ways, this set of data is easy to analyze because precipitation
is plentiful in this area during the spring. Additionally, the terrain
in Iowa is relatively uniform and the COOP network is fairly dense.
However, the precipitation in Iowa during the spring months tends
to be convective in nature. Thus, the precipitation may be very localized.
The
equitable threat score (ETS) is one measure of a station's relationship
to its neighbors. Figure 4 shows
a time series plot of ETS for a particular station in Iowa. The scores
vary without an apparent pattern until the last three seasons, when
they attain their lowest values. The cause of this apparent inhomogeneity
has yet to be determined. However, the behavior of the scores in the
last three seasons clearly differs from the behavior of the scores
in the preceding seasons.
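For reference, the ETS for one station and one neighbor can be sketched as follows, using the standard definition of the score with the number of hits expected by chance removed. Treating the target station's wet/dry reports as one series and a single neighbor's reports as the other is an assumption made for this illustration; how the study combines several neighbors is not described here, and the function name and example data are hypothetical.

```python
def equitable_threat_score(target, neighbor):
    """ETS for two equal-length sequences of daily wet (1) / dry (0) flags.

    Returns None when the score is undefined (no data or a zero denominator).
    """
    n = len(target)
    if n == 0:
        return None
    pairs = list(zip(target, neighbor))
    hits = sum(1 for t, nb in pairs if t and nb)
    target_only = sum(1 for t, nb in pairs if t and not nb)
    neighbor_only = sum(1 for t, nb in pairs if nb and not t)
    # Number of joint wet days expected if the two series were unrelated.
    hits_random = (hits + target_only) * (hits + neighbor_only) / n
    denom = hits + target_only + neighbor_only - hits_random
    return (hits - hits_random) / denom if denom else None


# Hypothetical ten-day records: the stations disagree on day 5 only.
target = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
neighbor = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]

print(equitable_threat_score(target, neighbor))
print(equitable_threat_score(target, target))  # 1.0 for perfect agreement
```

Computing this score once per season yields a time series of the kind plotted in Figure 4.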
Figure
4. Time series plot of Equitable Threat Score for each spring season
1950-2001 for a station in Iowa.

Figure
5. Time series plot of Equitable Threat Score for 33 seasons of simulated
data. About half of the observations from the last season were replaced
with observations of no precipitation.
Precipitation
observations were simulated in order to test the efficacy of the methods
on data with known (i.e., constructed) inhomogeneities. Figure 5 shows
the time series of ETS for seven simulated stations. The frequency
of precipitation is different at each station and is indicated by
the color of the line. The inhomogeneity was the same at each station:
about half (44%) of the observations from the final season were replaced
with observations of no precipitation. Note the dramatic change in ETS
for the last season for all
stations except the station with extremely rare precipitation events
(probability of precipitation p = 1%).
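A simulation of this kind can be sketched as follows. This is not the procedure used in the study; it is a minimal illustration under several assumptions: independent daily wet/dry events, a 91-day season, a single neighbor that disagrees with the target on 2% of days, a wet-day probability of 30%, and a fixed random seed. Only the 33 seasons and the 44% replacement rate in the final season are taken from the text.

```python
import random


def ets(target, neighbor):
    """Equitable threat score for two 0/1 sequences; None if undefined."""
    n = len(target)
    pairs = list(zip(target, neighbor))
    hits = sum(1 for t, nb in pairs if t and nb)
    t_only = sum(1 for t, nb in pairs if t and not nb)
    nb_only = sum(1 for t, nb in pairs if nb and not t)
    hits_random = (hits + t_only) * (hits + nb_only) / n
    denom = hits + t_only + nb_only - hits_random
    return (hits - hits_random) / denom if denom else None


def simulate_seasons(p_wet, n_seasons=33, days=91, drop_rate=0.44,
                     disagreement=0.02, seed=0):
    """Return one ETS value per simulated season for a single station.

    All parameter defaults except n_seasons and drop_rate are assumed.
    """
    rng = random.Random(seed)
    scores = []
    for season in range(n_seasons):
        truth = [1 if rng.random() < p_wet else 0 for _ in range(days)]
        # The neighbor matches the truth except on a small fraction of days.
        neighbor = [1 - w if rng.random() < disagreement else w for w in truth]
        target = list(truth)
        if season == n_seasons - 1:
            # Constructed inhomogeneity: a share of the final season's
            # observations is replaced with reports of no precipitation.
            target = [0 if rng.random() < drop_rate else w for w in target]
        scores.append(ets(target, neighbor))
    return scores


scores = simulate_seasons(p_wet=0.3)
earlier = [s for s in scores[:-1] if s is not None]
print("mean ETS, earlier seasons:", sum(earlier) / len(earlier))
print("ETS, final season:", scores[-1])
```

Rerunning the sketch with a very small `p_wet` gives seasons with few or no wet days, in which the score is noisy or undefined.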
More complete
analyses of the Iowa and simulated data can be found in Tollerud et
al. (2002) and Fowler et al. (2003), respectively. These analyses
confirm that changes of various types, including inhomogeneities,
may be indicated by changes in the scores.
While much progress
has been made in the detection of inhomogeneities, much work remains
to be completed. The scores have only been used on the spring measurements
from Iowa and one set of simulated data. Other states and seasons
must be investigated to determine how well the methods work in less
ideal circumstances. The simulation procedure assumed that there was
very good event agreement between the target station and its neighbors.
Further research will include investigation
of the scores computed on simulated data with less agreement between
neighbors. It is likely that less correlation among neighbors will
result in the scores being less sensitive to inhomogeneities. Additionally,
all of the analyses so far focus on analyzing the seasons separately.
Attempts are being made to homogenize the measurements from different
seasons, so that precipitation totals from all seasons may be considered
together rather than separately, thus yielding a larger sample size
in the same amount of time. Certainly, not all inhomogeneities will
be detectable by these methods. However, the focus
of future research will be to determine what can be detected,
i.e., how large a change must be and how soon after it occurs
the methods can detect it.
References
Tollerud, E. I., B. G. Brown, and T. L. Fowler, 2002: Identifying
inhomogeneities in precipitation time series: 1. Diagnostic measures
of spatial correlation. 13th Conference on Applied Climatology, Portland,
OR, May 12-16.
Fowler, T. L., E. I. Tollerud, and B. G. Brown, 2003: You've Changed!
Inhomogeneity detection for COOP network precipitation measurements.
7th Conference on Integrated Observing Systems, Long Beach, CA,
February 9-13.