Mesoscale Model Evaluation Mike Baldwin Cooperative Institute for Mesoscale Meteorological Studies, University of Oklahoma Also affiliated with NOAA/NSSL and NOAA/NWS/SPC
NWS – forecasts on hi-res grids • What would you suggest that NWS do to verify these forecasts?
Issues in mesoscale verification • Validate natural behavior of forecasts • Realistic variability, structure of fields • Do predicted events occur with realistic frequency? • Do characteristics of phenomena mimic those found in nature? • Traditional objective verification techniques are not able to address these issues
Outline • Problems with traditional verification • Solutions: • Verify characteristics of phenomena • Verify structure/variability • Design verification systems that address value of forecasts
Traditional verification • Compare a collection of matching pairs of forecast and observed values at the same set of points in space/time • Compute various measures of accuracy: RMSE, bias, equitable threat score • A couple of numbers may represent the accuracy of millions of model grid points, thousands of cases, hundreds of meteorological events • Boiling down that much information into one or two numbers is not very meaningful
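A minimal sketch of this measures-oriented reduction (illustrative Python, not from the presentation; the rain threshold and the standard contingency-table form of the equitable threat score are assumptions about the setup):

```python
import numpy as np

def traditional_scores(fcst, obs, threshold=1.0):
    """Reduce matched forecast/observed grids to RMSE, bias, and ETS."""
    fcst, obs = np.asarray(fcst, float), np.asarray(obs, float)
    rmse = np.sqrt(np.mean((fcst - obs) ** 2))
    bias = np.mean(fcst - obs)                      # mean additive error
    # 2x2 contingency table for events at/above the threshold
    f_yes, o_yes = fcst >= threshold, obs >= threshold
    hits = np.sum(f_yes & o_yes)
    false_alarms = np.sum(f_yes & ~o_yes)
    misses = np.sum(~f_yes & o_yes)
    # hits expected by chance, given the forecast and observed event counts
    hits_random = (hits + false_alarms) * (hits + misses) / fcst.size
    ets = (hits - hits_random) / (hits + false_alarms + misses - hits_random)
    return rmse, bias, ets
```

However the grids are sliced, the result is a handful of numbers standing in for an enormous forecast/observation dataset, which is exactly the loss of information described above.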
Dimensionality of verification info • Murphy (1991) and others highlight danger of simplifying complex verification information • High-dimension information = data overload • Verification information should be easy to understand • Need to find ways to measure specific aspects of performance
Quality vs. value • Scores typically measure quality: the degree to which forecasts and observations agree • Forecast value is the benefit of forecast information to a decision maker • Value is a subjective, complex function of quality • A high-quality forecast may be of low value, and vice versa
[Figure: observed field shown alongside two forecasts, FCST #1 (smooth) and FCST #2 (detailed)]
Traditional “measures-oriented” approach to verifying these forecasts
Phase/timing errors • High-amplitude, small-scale forecast and observed fields are most sensitive to timing/phase errors
Mean Squared Error (MSE) • For an identical 1-point phase error, MSE depends strongly on the structure of the field: MSE = 0.0016 for a smooth field, 0.165 for a moderately detailed field, and 1.19 for a highly detailed field
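The effect is easy to reproduce. A toy sketch (the fields and values are illustrative, not the slides' own): the same one-grid-point shift is applied to progressively sharper features, and MSE climbs steeply.

```python
import numpy as np

x = np.arange(100, dtype=float)

def feature(center, width):
    """Gaussian bump: smaller width = sharper, more 'detailed' feature."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

for width in (20.0, 5.0, 1.0):               # progressively sharper features
    obs = feature(50.0, width)
    fcst = feature(51.0, width)              # identical shape, one-point shift
    print(width, np.mean((fcst - obs) ** 2)) # MSE grows as the feature sharpens
```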
Verify forecast “realism” • Anthes (1983) suggests several ways to verify “realism” • Verify characteristics of phenomena • Decompose forecast errors as a function of spatial scale • Verify structure/variance spectra
Characterize the forecast and observed fields • Verify the forecast with a similar approach as a human forecaster might visualize the forecast/observed fields • Characterize features, phenomena, events, etc. found in forecast and observed fields by assigning attributes to each object • Not an unfamiliar concept: • “1050 mb high” • “category 4 hurricane” • “F-4 tornado”
Many possible ways to characterize phenomena • Shape, orientation, size, amplitude, location • Flow pattern • Subjective information (confidence, difficulty) • Physical processes in a NWP model • Verification information can be stratified using this additional information
“Object-oriented” approach to verification • Decompose fields into sets of “objects” that are identified and described by a set of attributes in an automated fashion • Use image processing techniques to locate and identify events • Produce “scores” or “metrics” based upon the similarity/dissimilarity between forecast and observed events • Could also examine the joint distribution of forecast and observed events
Characterization: How? • Identify an object (usually involves complex image processing) [Figure: labeled rainfall region “Event #16”]
Characterization: How? • Assign attributes; examples: location, mean, orientation, structure [Example: Event #16: Lat=37.3N, Lon=87.8W, θ=22.3, b=2.1]
Automated rainfall object identification • Contiguous regions of measurable rainfall (similar to CRA; Ebert and McBride (2000))
Expand area by 15%, connect regions that are within 20 km, relabel
Object characterization • Compute attributes
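A rough sketch of these two steps (Python; the 0.2 threshold is an assumed stand-in for "measurable rainfall", and scipy's connected-component labeling stands in for the fuller image processing described above; the expand/reconnect step could similarly be approximated with ndimage.binary_dilation):

```python
import numpy as np
from scipy import ndimage

def rain_objects(precip, threshold=0.2):
    """Label contiguous rain areas, then compute simple attributes for each."""
    labels, n_objects = ndimage.label(precip >= threshold)
    objects = []
    for k in range(1, n_objects + 1):
        mask = labels == k
        rows, cols = np.nonzero(mask)
        objects.append({
            "id": k,
            "area": int(mask.sum()),                  # grid points in the object
            "centroid": (rows.mean(), cols.mean()),   # location attribute
            "mean": precip[mask].mean(),              # amplitude attribute
            "max": precip[mask].max(),
        })
    return objects
```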
Verification of detailed forecasts • 12-h forecasts of 1-h precipitation valid 00Z 24 Apr 2003 [Figure: fcst #1, fcst #2, and observed fields] • Traditional scores for the two forecasts: RMSE = 3.4, MAE = 0.97, ETS = 0.06 and RMSE = 1.7, MAE = 0.64, ETS = 0.00
Verification • 12-h forecasts of 1-h precipitation valid 00Z 24 Apr 2003 [Figure: fcst #1, fcst #2, and observed fields] • Object attributes for the three fields: {b = 7.8, ecc20 = 3.6, ecc40 = 3.1, ecc60 = 4.5, ecc80 = 3.6}, {b = 3.1, ecc20 = 2.6, ecc40 = 2.0, ecc60 = 2.1, ecc80 = 2.8}, {b = 1.6, ecc20 = 10.7, ecc40 = 7.5, ecc60 = 4.3, ecc80 = 2.8}
Example of scores produced by this approach • $\mathbf{f}_i = (a_i, b_i, c_i, \ldots, x_i, y_i)^T$ • $\mathbf{o}_k = (a_k, b_k, c_k, \ldots, x_k, y_k)^T$ • $d_{i,k}(\mathbf{f}_i, \mathbf{o}_k) = (\mathbf{f}_i - \mathbf{o}_k)^T A (\mathbf{f}_i - \mathbf{o}_k)$ (generalized Euclidean distance, a measure of dissimilarity), where $A$ is a weight matrix; different attributes would probably have different weights • $c_{i,k}(\mathbf{f}_i, \mathbf{o}_k) = \operatorname{cov}(\mathbf{f}_i, \mathbf{o}_k)$ (a measure of similarity)
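A small sketch of the dissimilarity score $d_{i,k}$ (the attribute values and the diagonal weight matrix below are hypothetical, chosen only to illustrate the computation):

```python
import numpy as np

def attribute_distance(f, o, A):
    """d(f, o) = (f - o)^T A (f - o); A weights and relates the attributes."""
    diff = np.asarray(f, float) - np.asarray(o, float)
    return diff @ A @ diff

# e.g. attributes (lat, lon, orientation, structure) with diagonal weights
A = np.diag([1.0, 1.0, 0.1, 5.0])
f_i = np.array([37.3, -87.8, 22.3, 2.1])   # forecast object's attributes
o_k = np.array([36.9, -88.4, 30.0, 1.6])   # observed object's attributes
print(attribute_distance(f_i, o_k, A))
```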
Ebert and McBride (2000) • Contiguous Rain Areas • Separate errors into amplitude, displacement, shape components
Contour error map (CEM) method • Case et al. (2003) • Phenomenon of interest: the Florida sea breeze • Object identification: sea-breeze transition time • Contour map of transition-time errors • Distributions of timing errors • Verify post-sea-breeze winds
Compositing: Nachamkin (2004) • Identify events of interest in the forecasts • Collect coordinated samples • Compare the forecast PDF to the observed PDF • Repeat the process for observed events (a sketch follows below)
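A rough sketch of the sampling step (the event threshold, window size, and histogram comparison are assumptions for illustration, not Nachamkin's exact procedure):

```python
import numpy as np

def event_conditioned_samples(fcst, obs, threshold, half_width=5):
    """Collect fcst/obs values in windows centered on forecast event points."""
    f_samples, o_samples = [], []
    rows, cols = np.nonzero(fcst >= threshold)        # forecast event points
    for r, c in zip(rows, cols):
        window = (slice(max(r - half_width, 0), r + half_width + 1),
                  slice(max(c - half_width, 0), c + half_width + 1))
        f_samples.append(fcst[window].ravel())
        o_samples.append(obs[window].ravel())
    if not f_samples:                                 # no forecast events found
        return np.array([]), np.array([])
    return np.concatenate(f_samples), np.concatenate(o_samples)

# Histograms of the two sample sets approximate the forecast and observed PDFs
# conditioned on forecast events; repeating with events defined from the
# observations gives the complementary comparison.
```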
Decompose errors as a function of scale • Bettge and Baumhefner (1980) used band-pass filters to analyze errors at different scales • Briggs and Levine (1997) used wavelet analysis of forecast errors
Verify structure • Fourier energy spectra • Take the Fourier transform and multiply by its complex conjugate to obtain E(k) • Display on a log-log plot • Natural phenomena often show “power-law” regimes • Uncorrelated noise results in a flat spectrum
Fourier spectra • Slope of spectrum indicates degree of structure in the data
Larger absolute values of slope correspond to less small-scale structure [Figure: example spectra, including uncorrelated noise and power-law slopes of -1, -1.5, and -3]
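A compact sketch of the spectrum-and-slope computation for a 1-D field (an assumed setup; verification of 2-D forecast fields would use isotropic 2-D spectra instead):

```python
import numpy as np

def spectral_slope(field):
    """Slope of log E(k) vs. log k; E(k) = transform times its conjugate."""
    field = np.asarray(field, float)
    spec = np.fft.rfft(field - field.mean())
    energy = (spec * spec.conj()).real
    k = np.arange(1, len(energy))            # skip the k=0 (mean) component
    slope, _ = np.polyfit(np.log(k), np.log(energy[1:]), 1)
    return slope                             # near 0 for noise, more negative
                                             # for smoother fields

rng = np.random.default_rng(0)
print(spectral_slope(rng.standard_normal(1024)))             # white noise: ~0
print(spectral_slope(np.cumsum(rng.standard_normal(1024))))  # red noise: ~ -2
```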
Multiscale statistical properties (Harris et al. 2001) • Fourier energy spectrum • Generalized structure function: spatial correlation • Moment-scale analysis: intermittency of a field, sparseness of sharp intensities • Looking for a “power law”, much like in atmospheric turbulence (-5/3 slope) [FIG. 3: Isotropic spatial Fourier power spectral density (PSD) for forecast RLW (qr; dotted line) and radar-observed qr (solid line). The spectra agree reasonably at scales larger than 15 km; below 15 km the forecast shows a rapid falloff in variability compared with the radar. The estimated spectral slope with fit uncertainty is β = 3.0 ± 0.1]
Example [Figure: log E(k) vs. log(wavenumber) for Obs_4, Eta_12, Eta_8, WRF_22, WRF_10, and KF_22; 3-6 h forecasts from 04 June 2002 1200 UTC]
Comparing forecasts that contain different degrees of structure [Figure: obs in black, detailed forecast in blue, smooth forecast in green] • MSE detailed = 1.57 • MSE smooth = 1.43
Common resolved scales vs. unresolved • Filter the more detailed forecast so both retain the same resolved scales (a sketch follows below) • MSE “detailed” = 1.32 • MSE smooth = 1.43
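A sketch of the filtering step (the spectral cutoff fraction is an arbitrary assumption; in practice it would be chosen to match the scales the smoother forecast resolves):

```python
import numpy as np

def lowpass(field, keep_fraction=0.25):
    """Zero out the highest wavenumbers so only the largest scales remain."""
    field = np.asarray(field, float)
    spec = np.fft.rfft(field)
    spec[int(len(spec) * keep_fraction):] = 0.0
    return np.fft.irfft(spec, n=len(field))

# Comparing mse(lowpass(detailed), obs) with mse(smooth, obs) scores the two
# forecasts on scales both actually resolve, instead of penalizing detail.
```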
Lack of detail in analyses • Methods discussed assume a realistic analysis of observations • Problem: observations are relatively sparse • Operational data assimilation systems use smooth first-guess fields from model forecasts and smooth error covariance matrices • Smooth analysis fields result
True mesoscale analyses • Determine what scales are resolved • Mesoscale data assimilation • Frequent updates • All available observations • Hi-res NWP provides first guess • Ensemble Kalman filter • Tustison et al. (2002) scale-recursive filtering takes advantage of natural “scaling”
Design verification systems that address forecast value • Value measures the benefits of forecast information to users • Determine which aspects of the forecast users are most sensitive to • If possible, find out users' “cost/loss” situation (a toy example follows below) • Are missed events or false alarms more costly?
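A toy cost/loss calculation (all numbers hypothetical) showing how value depends on the user's situation, not on quality alone:

```python
def expected_expense(hit_rate, false_alarm_rate, base_rate, cost, loss):
    """Mean expense for a user who pays `cost` to protect whenever warned
    and suffers `loss` on any unwarned event."""
    p_protect = hit_rate * base_rate + false_alarm_rate * (1.0 - base_rate)
    p_miss = (1.0 - hit_rate) * base_rate
    return p_protect * cost + p_miss * loss

# Baseline: never protect, absorb every loss (base_rate * loss)
never_protect = 0.1 * 10.0
with_forecast = expected_expense(hit_rate=0.8, false_alarm_rate=0.1,
                                 base_rate=0.1, cost=1.0, loss=10.0)
print(with_forecast, "vs", never_protect)  # 0.37 vs 1.0: value for this user
```

For a user with cheap protection, misses dominate the expense; for one with costly protection, false alarms do, so the same forecast can be valuable to one user and worthless to another.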
Issues • How to distill the huge amount of verification information into meaningful “nuggets” that can be used effectively? • How to elevate verification from an annoyance to an integral part of the forecast process? • What happens when conflicting information from different verification approaches is obtained?
Summary • Traditional verification techniques are problematic when forecasts and observations contain small-scale structure • Verify realism • Issues of scale • Work with forecasters/users to determine the most important aspects of forecast information
References • Good books • Papers mentioned in this presentation • Beth Ebert’s website
Scores based on similarity/dissimilarity matrices • $D = [d_{i,j}]$: Euclidean distance matrix • $C = [c_{i,j}]$: covariance matrix • Scores could be: $\operatorname{tr}[D]$, the trace of the matrix; for Euclidean distance this equates to $\sum_i (f_i - o_i)^2$, ~RMSE • $\det[D]$, the determinant of the matrix, a measure of its magnitude
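A tiny sketch of these matrix scores (the attribute vectors are hypothetical):

```python
import numpy as np

F = np.array([[1.0, 2.0], [3.0, 1.0]])   # two forecast objects' attributes
O = np.array([[1.5, 1.5], [2.0, 2.0]])   # the matched observed objects
D = np.array([[np.sum((f - o) ** 2) for o in O] for f in F])

print(np.trace(D))       # sum of matched-pair squared distances, ~RMSE-like
print(np.linalg.det(D))  # overall "magnitude" of the dissimilarity matrix
```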