[Bug]: CDAT migration: various KeyError for longitude bounds #884

Open
chengzhuzhang opened this issue Oct 31, 2024 · 4 comments
Labels
bug Bug fix (will increment patch version)

Comments

@chengzhuzhang
Contributor

chengzhuzhang commented Oct 31, 2024

What happened?

When testing a run script that specifies one year for EAMxx data, I ran into multiple KeyErrors for different variables. Examples:
KeyError: 'lon_bnds'
KeyError: 'bounds_lon'
KeyError: 'longitude_bnds'

It is Spooky!

What did you expect to happen? Are there any possible answers you came across?

No response

Minimal Complete Verifiable Example (MVCE)

https://portal.nersc.gov/cfs/e3sm/zhang40/tests/eamxx/eamxx_decadal_1996_1031_edv3/prov/

python run_e3sm_diags_1996.py -d lat_lon_model_vs_obs_1996.cfg

EXAMPLE:

2024-10-31 14:26:18,207 [INFO]: lat_lon_driver.py(run_diag:69) >> Variable: PRECT
2024-10-31 14:26:34,122 [INFO]: dataset_xr.py(_get_land_sea_mask:1470) >> Variable 'LANDFRAC' was not in the file '/global/cfs/cdirs/e3sm/chengzhu/eamxx/post/data/rgr/eamxx_decadal_ANN_199601_199612_climo.nc', nor was it defined in the derived variables dictionary.. Using default land sea mask located at `/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/share/e3sm_diags/acme_ne30_ocean_land_mask.nc`.
2024-10-31 14:26:34,180 [INFO]: regrid.py(subset_and_align_datasets:70) >> Selected region: global
2024-10-31 14:26:41,216 [ERROR]: core_parameter.py(_run_diag:343) >> Error in e3sm_diags.driver.lat_lon_driver
Traceback (most recent call last):
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/dataset.py", line 1447, in _construct_dataarray
    variable = self._variables[name]
KeyError: 'lon_bnds'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/parameter/core_parameter.py", line 340, in _run_diag
    single_result = module.run_diag(self)
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/driver/lat_lon_driver.py", line 110, in run_diag
    _run_diags_2d(
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/driver/lat_lon_driver.py", line 339, in _run_diags_2d
    metrics_dict = _create_metrics_dict(
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/driver/lat_lon_driver.py", line 593, in _create_metrics_dict
    "mean": spatial_avg(ds_ref, var_key),  # type: ignore
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/metrics/metrics.py", line 56, in spatial_avg
    weights = _get_weights(ds, var_key, axis)
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/metrics/metrics.py", line 222, in _get_weights
    spatial_wts = ds.spatial.get_weights(spatial_axis, data_var=var_key)
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xcdat/spatial.py", line 274, in get_weights
    d_bounds = self._dataset.bounds.get_bounds(axis=key, var_key=data_var)
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xcdat/bounds.py", line 255, in get_bounds
    bounds: Union[xr.Dataset, xr.DataArray] = self._dataset[
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/dataset.py", line 1545, in __getitem__
    return self._construct_dataarray(key)
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/dataset.py", line 1449, in _construct_dataarray
    _, name, variable = _get_virtual_variable(self._variables, name, self.dims)
  File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/dataset.py", line 214, in _get_virtual_variable
    raise KeyError(key)
KeyError: 'lon_bnds'

Relevant log output

https://portal.nersc.gov/cfs/e3sm/zhang40/tests/eamxx/eamxx_decadal_1996_1031_edv3/prov/e3sm_diags_run.log

Anything else we need to know?

No response

Environment

latest cdat-migration-fy24 branch

@chengzhuzhang chengzhuzhang added the bug Bug fix (will increment patch version) label Oct 31, 2024
@chengzhuzhang
Contributor Author

chengzhuzhang commented Oct 31, 2024

@tomvothecoder This problem only occurs in the model (climo) vs. obs (specified years) case. I ran another case, model (climo) vs. obs (climo), without error. It is possible that the longitude bounds of some time-series observational data are unconventional and xcdat spatial averaging won't operate on them, but in that case the model data should still produce a figure. I need to troubleshoot more.

@chengzhuzhang
Contributor Author

I have xcdat 0.6.1 in my env. I'm updating to the latest version, 0.7.2, and will retry.

@chengzhuzhang
Contributor Author

chengzhuzhang commented Oct 31, 2024

Updating to xcdat 0.7.2 didn't work. I think there are two things to fix:

  1. For datasets that don't have lat/lon bounds that xcdat/xarray can detect, we should use xcdat to add the missing bounds. Maybe cdms2 does this by default when reading in the dataset?
  2. If the reference data fails to create metrics, the plot for the test data isn't generated. I think the expected behavior should be to create the model-data-only figure anyway.
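
A minimal sketch of the behavior proposed in point 2, using only the standard library. The helper `metrics_or_none`, the `failing_ref_metrics` stand-in, and the panel names are hypothetical and are not part of e3sm_diags; the point is only that a KeyError on the reference side does not have to block the test-data figure.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def metrics_or_none(compute: Callable[[], dict], label: str) -> Optional[dict]:
    """Run a metrics computation; return None instead of raising on KeyError."""
    try:
        return compute()
    except KeyError as err:
        logger.warning("Skipping %s metrics, missing variable: %s", label, err)
        return None


def failing_ref_metrics() -> dict:
    # Hypothetical stand-in for a reference dataset without longitude bounds.
    raise KeyError("lon_bnds")


test_metrics = metrics_or_none(lambda: {"mean": 2.5}, "test")
ref_metrics = metrics_or_none(failing_ref_metrics, "reference")

# Always plot the test panel; add reference and diff panels only when the
# reference metrics were actually computed.
panels = ["test"] + (["reference", "diff"] if ref_metrics is not None else [])
```

With the failing reference above, `panels` contains only the test panel, so a model-only figure could still be produced.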

@tomvothecoder tomvothecoder mentioned this issue Nov 4, 2024
9 tasks
@tomvothecoder
Collaborator

I think I found the root cause. In the code block below, the climatology is being calculated from reference time series datasets.

if self.is_time_series:
    ds = self.get_time_series_dataset(var)
    ds_climo = climo(ds, self.var, season).to_dataset()
    return ds_climo

The climo() function returns the climatology DataArray, which is converted to an xr.Dataset (ds_climo). There are no bounds generated for ds_climo, resulting in downstream issues such as missing bounds in spatial_avg().

This is now fixed here.
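
The root cause described above can be reproduced with plain xarray. This is a sketch on a toy dataset, not the actual e3sm_diags fix: the variable names and values are illustrative, and a time mean stands in for `climo()`. It shows that the `lon` coordinate keeps its `bounds` attribute after `to_dataset()` while the bounds variable itself is gone, and that copying the bounds from the source dataset is one possible repair.

```python
import numpy as np
import xarray as xr

# Toy time-series dataset with CF-style longitude bounds (names illustrative).
lon = np.array([0.0, 90.0, 180.0, 270.0])
lon_bnds = np.stack([lon - 45.0, lon + 45.0], axis=1)

ds = xr.Dataset(
    data_vars={
        "PRECT": (("time", "lon"), np.ones((3, lon.size))),
        "lon_bnds": (("lon", "bnds"), lon_bnds),
    },
    coords={
        "time": np.arange(3),
        "lon": ("lon", lon, {"axis": "X", "bounds": "lon_bnds"}),
    },
)

# Stand-in for climo(): reduce over time, then wrap the DataArray in a Dataset.
ds_climo = ds["PRECT"].mean(dim="time").to_dataset()

# The coordinate still advertises its bounds variable...
bounds_key = ds_climo["lon"].attrs["bounds"]
# ...but that variable did not survive the DataArray round trip.
has_bounds = bounds_key in ds_climo.variables

try:
    ds_climo[bounds_key]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# One possible repair: copy the bounds back from the source dataset.
ds_fixed = ds_climo.assign({bounds_key: ds[bounds_key]})
```

When the source dataset has no usable bounds to copy, generating them with xcdat's `ds.bounds.add_missing_bounds()` may be an alternative, as suggested earlier in the thread.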
