New Use Case: construct use case verifying GFS cloud forecasts vs. ASOS ceilometer #2745

Open
34 tasks
DanielAdriaansen opened this issue Oct 24, 2024 · 12 comments
Labels
METplus: Clouds reporting: NRL METplus Naval Research Laboratory METplus Project requestor: Navy/NRL Naval Research Laboratory type: new use case Add a new use case

Comments

@DanielAdriaansen
Contributor

DanielAdriaansen commented Oct 24, 2024

Describe the New Use Case

This use case will demonstrate verifying GFS cloud forecasts on the global 0.25 degree grid against ASOS ceilometer cloud base height (ceiling) observations over CONUS, using PointStat.

The fields to verify will be cloud fields we identify in GFS output such as:

  • Cloud Base Height

We will need to search for and identify the proper field names and levels in the GFS files.

The measures of skill that should be included are:

  • Gilbert Skill Score
  • Equitable Threat Score
  • Fractions Skill Score
  • False Alarm Rate
  • Hit Rate
  • Bias
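As a rough illustration of how most of the scores listed above relate to a 2x2 contingency table, here is a minimal sketch. It is not MET's implementation, and the counts are made up. It assumes "False Alarm Rate" means probability of false detection (false alarms divided by observed non-events) and "Bias" means frequency bias. The Gilbert Skill Score and the Equitable Threat Score are two names for the same quantity, so one function covers both. The Fractions Skill Score is a neighborhood score for gridded fields and is not covered here.

```python
from dataclasses import dataclass


@dataclass
class ContingencyTable:
    """Counts for a 2x2 yes/no contingency table (illustrative only)."""
    hits: int
    false_alarms: int
    misses: int
    correct_negatives: int

    @property
    def total(self) -> int:
        return (self.hits + self.false_alarms
                + self.misses + self.correct_negatives)

    def hit_rate(self) -> float:
        # Hits divided by all observed events.
        return self.hits / (self.hits + self.misses)

    def false_alarm_rate(self) -> float:
        # Assumption: false alarms divided by all observed non-events.
        return self.false_alarms / (self.false_alarms + self.correct_negatives)

    def frequency_bias(self) -> float:
        # Forecast events divided by observed events.
        return (self.hits + self.false_alarms) / (self.hits + self.misses)

    def gilbert_skill_score(self) -> float:
        # Also known as the Equitable Threat Score.
        hits_random = ((self.hits + self.false_alarms)
                       * (self.hits + self.misses) / self.total)
        return ((self.hits - hits_random)
                / (self.hits + self.false_alarms + self.misses - hits_random))


# Made-up counts, for illustration only.
table = ContingencyTable(hits=40, false_alarms=10, misses=20, correct_negatives=130)
print(table.hit_rate(), table.false_alarm_rate(),
      table.frequency_bias(), table.gilbert_skill_score())
```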

The end goal is for the user to be able to substitute the GFS forecasts with a separate GFS-based AI/ML cloud forecast product on the same GFS 0.25 degree grid and compare it with the ASOS observations. This use case is the framework that supports that. We may get some sample data of their truth; however, due to restrictions on releasing the data, we may need to leave the GFS forecast data in place.

The user would also like to stratify forecast performance by cloud type category. These cloud types will be provided later, but in the meantime we should brainstorm another stratification based on an external classification (maybe weather regimes or precipitation type?), or implement some simple post-processing, for example stratifying performance for all clouds at heights >= 8 km ("high clouds").
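The simple post-processing idea could look something like the sketch below. It assumes matched forecast/observation pairs of cloud height in meters are already available, and it classifies each pair by the observed height. The function names, the two class labels, and the sample values are all hypothetical; only the 8 km threshold comes from the text above.

```python
from collections import defaultdict

HIGH_CLOUD_MIN_M = 8000.0  # the ">= 8 km" example threshold


def height_class(cloud_height_m: float) -> str:
    """Label a cloud height as 'high' or 'other' (hypothetical classes)."""
    return "high" if cloud_height_m >= HIGH_CLOUD_MIN_M else "other"


def stratify(pairs):
    """Group (fcst, obs) pairs by the class of the observed height."""
    groups = defaultdict(list)
    for fcst_m, obs_m in pairs:
        groups[height_class(obs_m)].append((fcst_m, obs_m))
    return dict(groups)


def mean_error(pairs):
    """Mean of forecast minus observation for one group."""
    return sum(fcst - obs for fcst, obs in pairs) / len(pairs)


# Made-up matched pairs in meters, for illustration only.
sample_pairs = [(900.0, 1200.0), (8500.0, 9000.0), (7000.0, 8000.0), (300.0, 250.0)]
groups = stratify(sample_pairs)
for name, members in sorted(groups.items()):
    print(name, len(members), mean_error(members))
```

Classifying by the observed value rather than the forecast value is a design choice; either could be argued for, and the choice changes which pairs land in each group.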

Checklist to get working:

Use Case Name and Category

`model_applications/clouds/PointStat_fcstGFS_obsASOSCeilometer_cloudTop`

Input Data

FCST: GFS 0.25 degree forecasts and analyses.
OBS: ASOS Ceilometer

Acceptance Testing

Describe tests required for new functionality.
As use case develops, provide a run time here

Time Estimate

Estimate the amount of work required here.
Issues should represent approximately 1 to 3 days of work.

Sub-Issues

Consider breaking the new feature down into sub-issues.

  • Add a checkbox for each sub-issue here.

Relevant Deadlines

Must be completed by 12/31/2024

Funding Source

7730022

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Review default alert labels
  • Select component(s)
  • Select priority
  • Select requestor(s)
  • Select privacy

Milestone and Projects

  • Select Milestone as a METplus-Wrappers-X.Y.Z version, Consider for Next Release, or Backlog of Development Ideas
  • For a METplus-Wrappers-X.Y.Z version, select the METplus-Wrappers-X.Y.Z Development project

Define Related Issue(s)

Consider the impact to the other METplus components.

New Use Case Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Add any new Python packages to the METplus Components Python Requirements table.
  • For any new datasets, add an entry to the METplus Verification Datasets Guide.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Development issue
    Select: Milestone as the next official version
    Select: METplus-Wrappers-X.Y.Z Development project for development toward the next official release
  • Iterate until the reviewer(s) accept your changes. Merge branch into develop.
  • Create a second pull request to merge develop into develop-ref, following the same steps for the first pull request.
  • Delete your fork or branch.
  • Close this issue.
@DanielAdriaansen
Contributor Author

DanielAdriaansen commented Oct 30, 2024

GFS data are here:
/d1/projects/METplus/METplus_Data/development/nrl/cloud/GFS_0.25

Using this command:

wgrib2 -v gfs.0p25.2024030700.f012.grib2 | grep cloud

I see relevant variables:

630:450488137:d=2024030700:LCDC Low Cloud Cover [%]:low cloud layer:12 hour fcst:
631:451333325:d=2024030700:LCDC Low Cloud Cover [%]:low cloud layer:6-12 hour ave fcst:
632:452258290:d=2024030700:MCDC Medium Cloud Cover [%]:middle cloud layer:12 hour fcst:
633:452866413:d=2024030700:MCDC Medium Cloud Cover [%]:middle cloud layer:6-12 hour ave fcst:
634:453577593:d=2024030700:HCDC High Cloud Cover [%]:high cloud layer:12 hour fcst:
635:454309341:d=2024030700:HCDC High Cloud Cover [%]:high cloud layer:6-12 hour ave fcst:
638:457005258:d=2024030700:HGT Geopotential Height [gpm]:cloud ceiling:12 hour fcst:
639:458203404:d=2024030700:PRES Pressure [Pa]:convective cloud bottom level:12 hour fcst:
640:458750807:d=2024030700:PRES Pressure [Pa]:low cloud bottom level:6-12 hour ave fcst:
641:460188975:d=2024030700:PRES Pressure [Pa]:middle cloud bottom level:6-12 hour ave fcst:
642:461414641:d=2024030700:PRES Pressure [Pa]:high cloud bottom level:6-12 hour ave fcst:
643:462935144:d=2024030700:PRES Pressure [Pa]:convective cloud top level:12 hour fcst:
644:463536141:d=2024030700:PRES Pressure [Pa]:low cloud top level:6-12 hour ave fcst:
645:464992169:d=2024030700:PRES Pressure [Pa]:middle cloud top level:6-12 hour ave fcst:
646:466180239:d=2024030700:PRES Pressure [Pa]:high cloud top level:6-12 hour ave fcst:
647:467707298:d=2024030700:TMP Temperature [K]:low cloud top level:6-12 hour ave fcst:
648:468725963:d=2024030700:TMP Temperature [K]:middle cloud top level:6-12 hour ave fcst:
649:469628288:d=2024030700:TMP Temperature [K]:high cloud top level:6-12 hour ave fcst:
650:470836489:d=2024030700:TCDC Total Cloud Cover [%]:convective cloud layer:12 hour fcst:
651:471552157:d=2024030700:TCDC Total Cloud Cover [%]:boundary layer cloud layer:6-12 hour ave fcst:

In addition, there are isobaric "total cloud cover" (TCDC) variables at various pressure levels in the file.
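To pull field names and levels out of a listing like the one above programmatically, a small parser along these lines may help. It is a sketch that assumes the colon-separated layout shown in this comment (record number, byte offset, date, name with description, level, forecast type) and that no field itself contains a colon. The `InventoryRecord` type and function name are hypothetical.

```python
from typing import NamedTuple


class InventoryRecord(NamedTuple):
    number: int
    short_name: str
    level: str
    forecast: str


def parse_inventory_line(line: str) -> InventoryRecord:
    """Split one `wgrib2 -v` inventory line into its useful parts."""
    fields = line.strip().split(":")
    # fields[3] looks like "HGT Geopotential Height [gpm]"; keep the short name.
    short_name = fields[3].split()[0]
    return InventoryRecord(int(fields[0]), short_name, fields[4], fields[5])


# Three lines copied from the listing above.
lines = [
    "630:450488137:d=2024030700:LCDC Low Cloud Cover [%]:low cloud layer:12 hour fcst:",
    "631:451333325:d=2024030700:LCDC Low Cloud Cover [%]:low cloud layer:6-12 hour ave fcst:",
    "638:457005258:d=2024030700:HGT Geopotential Height [gpm]:cloud ceiling:12 hour fcst:",
]
records = [parse_inventory_line(line) for line in lines]

# Separate instantaneous fields from time-averaged ones.
instantaneous = [r for r in records if "ave" not in r.forecast]
for record in instantaneous:
    print(record.number, record.short_name, record.level)
```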

ASOS data are here:

/d1/projects/METplus/METplus_Data/development/nrl/cloud/metar

ASOS data should be provided via Python embedding, unless somehow that NetCDF file is supported by MET but I doubt it. Note that dtcenter/MET#187 mentions supporting these directly in MET as a nice source of point obs, but for this use case we'll probably just use Python embedding.

@j-opatz
Contributor

j-opatz commented Nov 4, 2024

@hertneky would you be willing to begin working on a Python script to ingest the METAR file? I confirmed Dan's suspicion that the file will need Python Embedding for MET to accept it:

DEBUG 1: Reading point observation file: /d1/projects/METplus/METplus_Data/development/nrl/cloud/metar/Surface_METAR_20240307_0000.nc
terminate called after throwing an instance of 'netCDF::exceptions::NcBadId'
  what():  NetCDF: Not a valid ID

Any of the variable fields could work for testing (and the script should be able to accept anything the user requests), but a smart starting point might be low_cloud_area_fraction and low_cloud_base_altitude. Those have similar fields for middle and high clouds, and I know GFS will have fields that we can use to compare these.
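One way to keep track of which GFS field goes with which observation field is a small lookup like the sketch below. The GFS short names come from the wgrib2 listing earlier in this thread, and `low_cloud_area_fraction` and `low_cloud_base_altitude` are named above. The middle and high observation names are an assumption based on the "similar fields" remark and should be checked against the METAR file. The helper names are hypothetical.

```python
# GFS cloud cover short names, from the wgrib2 listing in this thread.
GFS_CLOUD_COVER = {"low": "LCDC", "middle": "MCDC", "high": "HCDC"}


def obs_field_name(cloud_level: str, quantity: str = "area_fraction") -> str:
    """Build an observation variable name.

    Assumption: middle and high fields follow the same naming pattern
    as the low cloud fields named in this thread.
    """
    return f"{cloud_level}_cloud_{quantity}"


def cloud_cover_pairs():
    """Return (GFS name, observation name) pairs for cloud cover."""
    return [(gfs_name, obs_field_name(level))
            for level, gfs_name in GFS_CLOUD_COVER.items()]


for gfs_name, obs_name in cloud_cover_pairs():
    print(gfs_name, "vs", obs_name)
```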

A basic test-of-purpose can be done in Plot-Point-Obs; if you prefer, you can also start with this PointStat use case that calls GFS as the forecast (you'd have to remove the PB2NC call, though): https://metplus.readthedocs.io/en/develop/generated/model_applications/medium_range/PointStat_fcstGFS_obsGDAS_UpperAir_MultiField_PrepBufr.html

@hertneky
Contributor

hertneky commented Nov 5, 2024

@j-opatz I can start working on this. Thanks.

@JohnHalleyGotway JohnHalleyGotway moved this from 🟢 Ready to 🏗 In progress in METplus-Wrappers-6.1.0 Development Nov 6, 2024
@j-opatz
Contributor

j-opatz commented Nov 18, 2024

@hertneky are there any updates/progress on the script? Anything you need in terms of help in the use case creation? I know that once the script is finished, the configuration file setup will be very simple so it's OK if this part is taking more time.

@hertneky
Contributor

@j-opatz I have a script that is still in the works. I am testing with low cloud base from both files. Of course one is in 'm' and the other is 'Pa', so there's a conversion needed. Not sure if I should do that in the python script or use the convert(x) function available in MET.
Anyways, I have the data read in, but may have questions as to the format to hand to point_stat.

@j-opatz
Contributor

j-opatz commented Nov 19, 2024

Thanks for the update, @hertneky.

Given my impending leave, @DanielAdriaansen should be able to provide some guidance, or find someone to step in on my behalf for direction on this use case.

I'm a little concerned about the meters and Pascals comparison, though. They are measures of completely different things, and the relationship between the two is dependent on too many assumptions.

Until the NRL team at large can provide further guidance on the m-Pa issue, try to focus on the [cloud_level]_area_fraction variables. GFS data has cloud cover variables, so that would be a percentage-to-percentage comparison with no conversions necessary.

@hertneky
Contributor

@j-opatz Yeah, you're right that the height would really be an approximation from cloud base pressure. For say low cloud area fraction, my concern is the layers used for low/mid/high being different between the two, but the units are the same. Keep me posted on what NRL says about the differing units for cloud base.
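For reference, the kind of approximation being discussed can be sketched with the International Standard Atmosphere (ISA) relation for the troposphere. This is only one possible conversion, not something agreed on in this thread, and it carries exactly the assumptions raised above: it ignores the actual temperature profile and surface pressure. It also returns height above mean sea level, whereas a ceilometer measures height above the ground, so station elevation would still need to be accounted for. The function name is hypothetical.

```python
# ISA constants for the troposphere (valid below roughly 11 km).
P0_PA = 101325.0         # sea-level pressure
T0_K = 288.15            # sea-level temperature
LAPSE_K_PER_M = 0.0065   # temperature lapse rate
EXPONENT = 0.190263      # R * L / (g * M) for dry air
TROPOPAUSE_PA = 22632.0  # approximate ISA pressure at 11 km


def pressure_to_height_m(pressure_pa: float) -> float:
    """Approximate height above mean sea level from pressure using ISA."""
    if pressure_pa < TROPOPAUSE_PA:
        raise ValueError("ISA troposphere formula is not valid above ~11 km")
    return (T0_K / LAPSE_K_PER_M) * (1.0 - (pressure_pa / P0_PA) ** EXPONENT)


# A cloud base pressure of 850 hPa maps to roughly 1.5 km in this model.
print(round(pressure_to_height_m(85000.0)))
```

On a real day the same pressure level can sit at a noticeably different height than the ISA value, which is why the percentage-to-percentage cloud cover comparison is the safer starting point.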

@hertneky
Contributor

hertneky commented Dec 9, 2024

@DanielAdriaansen I added the config file and python embedding script to a new metplus branch "feature_2745_nrl_gfs_asos". The data itself is in /d1/projects/METplus/METplus_Data/development/nrl/cloud/ as you posted further up. The error I get is "ERROR : Observation::Observation(const Python3_List) -> bad time string: "ADPSFC"" I'm not sure what bad time string means. My lists look okay to me after print out and the types seem okay too, either string or numeric. Hope you can shed some light!

@DanielAdriaansen
Contributor Author

OK it looks like this line:

point_data = [ msg_type, stn_id, vld_time, lat, lon, elev, var_name, level, hgt, qc_string, obs_val]

is creating 11 "observations" (rows), with 130427 columns. Instead, we need 130427 rows, with 11 columns.

You'll have to re-arrange your individual lists so that each item in point_data is a list of 11 items. One quick way to do that is something like:

point_data = [[a,b,c,d,e,f,g,h,i,j,k] for a,b,c,d,e,f,g,h,i,j,k in tuple(zip(msg_type,stn_id,vld_time,lat,lon,elev,var_name,level,hgt,qc_string,obs_val))]

After I did this, I get past the ADPSFC error, but now I am getting a new error:

ERROR  : pyobject_as_double (PyObject *) -> bad object type

This is telling me that MET is having trouble converting one of the objects (column values) into a double. Unfortunately the error doesn't tell you which column it's having trouble with, but it's one of the numeric columns.

If I change the construction of point_data to cast each numeric value to float like this:

point_data = [[a,b,c,float(d),float(e),float(f),g,float(h),float(i),j,float(k)] for a,b,c,d,e,f,g,h,i,j,k in tuple(zip(msg_type,stn_id,vld_time,lat,lon,elev,var_name,level,hgt,qc_string,obs_val))]

Then it works!

DEBUG 2: Processing LCDC/L0 versus low_cloud_area_fraction/L0, for observation type ADPSFC, over region FULL, for interpolation method BILIN(4), using 3034 matched pairs.
DEBUG 3: Number of matched pairs   = 3034
DEBUG 3: Observations processed    = 130427
DEBUG 3: Rejected: station id      = 0
DEBUG 3: Rejected: obs var name    = 0
DEBUG 3: Rejected: valid time      = 125448
DEBUG 3: Rejected: bad obs value   = 1945
DEBUG 3: Rejected: off the grid    = 0
DEBUG 3: Rejected: topography      = 0
DEBUG 3: Rejected: level mismatch  = 0
DEBUG 3: Rejected: quality marker  = 0
DEBUG 3: Rejected: message type    = 0
DEBUG 3: Rejected: masking region  = 0
DEBUG 3: Rejected: bad fcst value  = 0
DEBUG 3: Rejected: bad climo mean  = 0
DEBUG 3: Rejected: bad climo stdev = 0
DEBUG 3: Rejected: mpr filter      = 0
DEBUG 3: Rejected: duplicates      = 0
DEBUG 2: Computing Categorical Statistics.
DEBUG 2: Computing Scalar Partial Sums and Continuous Statistics.
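The counts in this log are internally consistent: 130427 observations processed, minus 125448 rejected for valid time, minus 1945 rejected for bad obs value, leaves the 3034 matched pairs. Most rejections are for valid time, which suggests the file holds many observations outside the matching time window. A quick way to run the same check on any PointStat log is sketched below; the regular expression assumes the "DEBUG 3: label = count" layout shown here, and the function name is hypothetical.

```python
import re

# Non-zero lines copied from the log above.
LOG = """\
DEBUG 3: Number of matched pairs   = 3034
DEBUG 3: Observations processed    = 130427
DEBUG 3: Rejected: valid time      = 125448
DEBUG 3: Rejected: bad obs value   = 1945
"""


def parse_counts(log_text: str) -> dict:
    """Collect 'label = count' values from DEBUG 3 lines."""
    counts = {}
    for line in log_text.splitlines():
        match = re.match(r"DEBUG 3: (.+?)\s*=\s*(\d+)$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_counts(LOG)
rejected = sum(value for key, value in counts.items()
               if key.startswith("Rejected:"))
matched = counts["Observations processed"] - rejected
print(matched == counts["Number of matched pairs"])
```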

I'll let you decide how to modify your script, but ultimately it boils down to:

  1. Making sure you have 11 columns with N rows (1 row per observation) rather than 1 column per observation with 11 rows
  2. Very carefully controlling the type of each piece of data in each list before passing it to MET
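Both points can be folded into one helper. The sketch below is a more defensive variant of the one-liner above, not the code in the branch: it takes the 11 per-column lists, checks they have equal length, transposes them into one row per observation, and casts the same six numeric columns to float. The function name and the sample values (station IDs, coordinates, time strings) are made up for illustration.

```python
# Positions of lat, lon, elev, level, hgt, obs_val in each 11-item row.
NUMERIC_COLUMNS = (3, 4, 5, 7, 8, 10)


def build_point_data(msg_type, stn_id, vld_time, lat, lon, elev,
                     var_name, level, hgt, qc_string, obs_val):
    """Turn 11 column lists into N rows of 11 items each."""
    columns = [msg_type, stn_id, vld_time, lat, lon, elev,
               var_name, level, hgt, qc_string, obs_val]
    if len({len(column) for column in columns}) != 1:
        raise ValueError("all 11 column lists must have the same length")

    rows = []
    for values in zip(*columns):
        row = [str(value) for value in values]  # text columns stay strings
        for index in NUMERIC_COLUMNS:
            row[index] = float(values[index])   # numeric columns become float
        rows.append(row)
    return rows


# Two made-up observations; elev is deliberately given as int.
point_data = build_point_data(
    msg_type=["ADPSFC", "ADPSFC"],
    stn_id=["STN1", "STN2"],
    vld_time=["20240307_120000", "20240307_120000"],
    lat=[39.85, 40.04],
    lon=[-104.66, -105.23],
    elev=[1656, 1612],
    var_name=["low_cloud_area_fraction", "low_cloud_area_fraction"],
    level=[0, 0],
    hgt=[0, 0],
    qc_string=["NA", "NA"],
    obs_val=[75.0, 50.0],
)
print(len(point_data), len(point_data[0]))
```

Casting every numeric column explicitly also covers NumPy integer and float scalars, since `float()` accepts them.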

@hertneky
Contributor

@DanielAdriaansen Ah hah - I hadn't even thought about the fact that I had the array list flipped. Thanks. The numeric complaint is probably about the one that's type int; the others are already float. I wasn't sure about that, but didn't have a way to test until I got past the original issue.

@DanielAdriaansen
Contributor Author

@hertneky can you move your use case from this location:
met_tool_wrapper/PointStat/PointStat_python_embedding_fcstGFS_obsASOS_clouds

to this location:
model_applications/clouds/PointStat_fcstGFS_obsASOS_cloudFraction_cloudBaseHeight

when you have a chance? Thanks!

@hertneky
Contributor

Sure thing! @DanielAdriaansen

Projects
Status: 🏗 In progress

3 participants