Feature #2280 ens_prob (#2823)
* Per #2280, update to support probability threshold strings like ==8, where 8 is the number of ensemble members, to create probability bins centered on n/8 for n = 0, ..., 8.

* Per #2280, update docs about probability threshold settings.

* Per #2280, use a loose tolerance when checking for consistent bin widths.

* Per #2280, add a new unit test for grid_stat to demonstrate processing the output from gen_ens_prod.

* Per #2280, when verifying NMEP probability forecasts, smooth the obs data first.

* Per #2280, only request STAT output for the PCT line type to match unit_grid_stat.xml and minimize the new output files.

* Per #2280, update config option docs.

* Per #2280, update config option docs.
JohnHalleyGotway authored Feb 22, 2024
1 parent 67ee04e commit b558794
Showing 6 changed files with 484 additions and 80 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/testing.yml
@@ -136,7 +136,7 @@ jobs:
- jobid: 'job1'
tests: 'ascii2nc'
- jobid: 'job2'
-tests: 'pb2nc madis2nc pcp_combine'
+tests: 'pb2nc madis2nc pcp_combine gen_ens_prod'
fail-fast: false
steps:
- uses: actions/checkout@v4
@@ -176,7 +176,7 @@ jobs:
- jobid: 'job1'
tests: 'ascii2nc_indy pb2nc_indy tc_dland tc_pairs tc_stat plot_tc tc_rmw rmw_analysis tc_diag tc_gen'
- jobid: 'job2'
-tests: 'met_test_scripts mode_multivar mode_graphics mtd regrid airnow gsi_tools netcdf modis series_analysis gen_ens_prod wwmca_regrid gen_vx_mask grid_weight interp_shape grid_diag grib_tables lidar2nc shift_data_plane trmm2nc aeronet wwmca_plot ioda2nc gaussian'
+tests: 'met_test_scripts mode_multivar mode_graphics mtd regrid airnow gsi_tools netcdf modis series_analysis wwmca_regrid gen_vx_mask grid_weight interp_shape grid_diag grib_tables lidar2nc shift_data_plane trmm2nc aeronet wwmca_plot ioda2nc gaussian'
fail-fast: false
steps:
- uses: actions/checkout@v4
70 changes: 50 additions & 20 deletions docs/Users_Guide/config_options.rst
@@ -101,13 +101,13 @@ The configuration file language supports the following data types:
the user has already determined to be 2.5 outside of MET.

* "==FBIAS" for a user-specified frequency bias value.
-e.g. "==FBIAS1" to automatically de-bias the data, "==FBIAS0.9" to select a low-bias threshold, or "==FBIAS1.1" to select a high-bias threshold.
-This option must be used in
-conjunction with a simple threshold in the other field. For example,
-when "obs.cat_thresh = >5.0" and "fcst.cat_thresh = ==FBIAS1;",
-MET applies the >5.0 threshold to the observations and then chooses a
-forecast threshold which results in a frequency bias of 1.
-The frequency bias can be any float value > 0.0.
+e.g. "==FBIAS1" to automatically de-bias the data, "==FBIAS0.9" to
+select a low-bias threshold, or "==FBIAS1.1" to select a high-bias
+threshold. This option must be used in conjunction with a simple
+threshold in the other field. For example, when "obs.cat_thresh = >5.0"
+and "fcst.cat_thresh = ==FBIAS1;", MET applies the >5.0 threshold to
+the observations and then chooses a forecast threshold which results in
+a frequency bias of 1. The frequency bias can be any float value > 0.0.

* "CDP" for climatological distribution percentile thresholds.
These thresholds require that the climatological mean and standard
@@ -842,32 +842,37 @@ to be verified. This dictionary may include the following entries:

When set as a boolean to TRUE, it indicates that the "fcst.field" data
should be treated as probabilities. For example, when verifying the
-probabilistic NetCDF output of Ensemble-Stat, one could configure the
-Grid-Stat or Point-Stat tools as follows:
+probabilistic NetCDF output of Gen-Ens-Prod for an ensemble of size 10,
+one could configure the Grid-Stat or Point-Stat tools as follows:

.. code-block:: none
fcst = {
-field = [ { name = "APCP_24_A24_ENS_FREQ_gt0.0";
-level = "(*,*)";
-prob = TRUE; } ];
+field = [ { name = "APCP_24_A24_ENS_FREQ_gt0.0";
+level = "(*,*)";
+cat_thresh = ==10;
+prob = TRUE; } ];
}
Setting "prob = TRUE" indicates that the "APCP_24_A24_ENS_FREQ_gt0.0"
-data should be processed as probabilities.
+data should be processed as probabilities. Setting "cat_thresh = ==10"
+indicates that these probabilities are derived from an ensemble with 10
+members and 11 probability bins should be defined, each centered on the
+value n/10 for n = 0, 1, ... 10.

When set as a dictionary, it defines the probabilistic field to be
used. For example, when verifying GRIB files containing probabilistic
-data, one could configure the Grid-Stat or Point-Stat tools as
-follows:
+data, one could configure the Grid-Stat or Point-Stat tools as follows:

.. code-block:: none
fcst = {
-field = [ { name = "PROB"; level = "A24";
-prob = { name = "APCP"; thresh_lo = 2.54; } },
-{ name = "PROB"; level = "P850";
-prob = { name = "TMP"; thresh_hi = 273; } } ];
+field = [ { name = "PROB"; level = "A24";
+prob = { name = "APCP"; thresh_lo = 2.54; }
+cat_thresh = ==0.25; },
+{ name = "PROB"; level = "P850";
+prob = { name = "TMP"; thresh_hi = 273; }
+cat_thresh = ==0.1; } ];
}
The example above selects two probabilistic fields. In both, "name"
@@ -883,6 +888,31 @@ to be verified. This dictionary may include the following entries:
with a range [0, 100], it will automatically rescale it to be [0, 1]
before applying the probabilistic verification methods.

+Probabilistic statistics in MET are derived from an Nx2 probabilistic
+contingency table. The N-dimension is determined by the number of
+probability bins requested. The "cat_thresh" configuration option
+defines the number and size of these probability bins. The bins
+must include the full range of possible probability values, [0, 1].
+Since selecting bins of equal width is common, shorthand notation is
+provided to do so. The following options are supported.
+
+* :code:`cat_thresh = [ ==0.25 ];` specifies an equal probability bin
+width of 0.25 and defines 4 bins between the values 0, 0.25, 0.5, 0.75,
+and 1.0. The :code:`==p` threshold may be set to any probability bin
+width greater than 0 and less than 1.
+
+* :code:`cat_thresh = [ ==10 ];` specifies probability bins for an
+ensemble of size 10 and defines 11 bins between the values -0.05, 0.05,
+0.15, ..., 0.95, and 1.05. Note that each bin is centered on the
+probability value n/10, for n = 0 to 10. The :code:`==n` threshold may
+be set to any integer number of ensemble members greater than 1 to
+define n+1 probability bins.
+
+* :code:`cat_thresh = [ >=0, >=0.5, >=0.75, >=1.0 ];` explicitly
+specifies the probability thresholds and defines 3 bins of unequal
+width between the values 0, 0.5, 0.75, and 1.0. By convention, the
+greater-than-or-equal-to (">=" or "ge") inequality type is required.
+
* Set "prob_as_scalar = TRUE" to override the processing of probability
data. When the "prob" entry is set as a dictionary to define the
field of interest, setting "prob_as_scalar = TRUE" indicates that this
@@ -2047,7 +2077,7 @@ This dictionary may include the following entries:
.. code-block:: none
hira = {
-flag = FALSE;
+flag = FALSE;
width = [ 2, 3, 4, 5 ];
vld_thresh = 1.0;
cov_thresh = [ ==0.25 ];
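
As a rough illustration of the probability-bin shorthand documented in the config_options.rst changes above, the following Python sketch expands "==p" and "==n" strings into bin edges. It is not MET code. The function name prob_bin_edges is hypothetical, and the rule that a "==p" width must divide 1 evenly is an assumption made here for simplicity (the commit itself only mentions using a loose tolerance when checking bin widths).

```python
from fractions import Fraction


def prob_bin_edges(shorthand: str) -> list[float]:
    """Expand a '==p' or '==n' shorthand string into probability bin edges.

    Hypothetical helper that mirrors the documented behavior; it is not
    part of MET.
    """
    # Parse the value exactly so that 0.25, 0.1, etc. carry no float error.
    value = Fraction(shorthand.removeprefix("=="))

    if 0 < value < 1:
        # '==p': equal-width bins of width p spanning [0, 1].
        count = 1 / value
        if count.denominator != 1:
            # Assumption of this sketch: the width must divide 1 evenly.
            raise ValueError(f"bin width {value} does not divide 1 evenly")
        return [float(i * value) for i in range(int(count) + 1)]

    if value > 1 and value.denominator == 1:
        # '==n': n+1 bins of width 1/n, each centered on k/n for k = 0..n,
        # so the edges are shifted down by half a bin width.
        n = int(value)
        half_width = Fraction(1, 2 * n)
        return [float(Fraction(k, n) - half_width) for k in range(n + 2)]

    raise ValueError(f"unsupported probability shorthand: {shorthand!r}")


if __name__ == "__main__":
    print(prob_bin_edges("==0.25"))
    print(prob_bin_edges("==10"))
```

For "==10" the sketch returns 12 edges running from -0.05 to 1.05, which is 11 bins, in line with the description in the updated docs. For "==0.25" it returns the 5 edges 0, 0.25, 0.5, 0.75, and 1.0, which is 4 bins.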