Updating CMORization episode #313
Changes from 5 commits
b1da7a1
1a14aba
5be74d5
1bc58f6
d4c9e77
37aa227
0ef9a3a
e6f6eec
3c9c652
d6ffa26
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,7 +2,7 @@ | |
title: "CMORization: adding new datasets to ESMValTool" | ||
teaching: 15 | ||
exercises: 45 | ||
compatibility: ESMValTool v2.6.0 | ||
compatibility: ESMValTool v2.10.0 | ||
|
||
questions: | ||
- "CMORization: what is it and why do we need it?" | ||
|
@@ -123,6 +123,12 @@ run the CMORizer scripts: | |
esmvaltool data format --config_file <path to config-user.yml> <dataset-name> | ||
``` | ||
|
||
The options `--start` and `--end` can be added to the command above to restrict the | ||
formatting of raw data to a time range. They will be ignored if a specific | ||
dataset does not support (i.e. because it is provided as a single file). | ||
Valid formats are `YYYY`, `YYYYMM`, `YYYYMMDD`. The same options can also be used with | ||
`esmvaltool data download`. | ||
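
As a hedged illustration of the time-range options described above, the sketch below checks that a value has one of the three accepted shapes and then only prints the resulting command line; it does not run ESMValTool. The helper names `is_valid_date` and `build_format_cmd`, the example values, and the placement of `--start`/`--end` before the dataset name are assumptions and are not taken from this PR.

```bash
#!/bin/sh
# Sketch only: validate --start/--end values and print the command.
# Helper names and example values are hypothetical.

# Succeeds if $1 has one of the accepted shapes: YYYY, YYYYMM, YYYYMMDD.
# Only the number of digits is checked, not whether the month/day exist.
is_valid_date() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4}([0-9]{2}([0-9]{2})?)?$'
}

# Prints (does not execute) the formatting command.
# $1 = path to config-user.yml, $2 = dataset name, $3 = start, $4 = end
build_format_cmd() {
  is_valid_date "$3" || { echo "invalid --start: $3" >&2; return 1; }
  is_valid_date "$4" || { echo "invalid --end: $4" >&2; return 1; }
  printf 'esmvaltool data format --config_file %s --start %s --end %s %s\n' \
    "$1" "$3" "$4" "$2"
}

build_format_cmd config-user.yml FLUXCOM 2000 200512
```

Printing the command instead of executing it keeps the sketch safe to try without any raw data in place.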
|
||
The ``config-user.yml`` is the file in which we define the different data | ||
paths, see the episode on [Configuration]({{ page.root }}{% link _episodes/03-configuration.md %}). | ||
In the ``rootpath`` of your ``config-user.yml``, make sure to add the right | ||
|
@@ -141,38 +147,52 @@ name that was created to store the raw observation data files, i.e. | |
If everything is okay, the output should look something like this: | ||
|
||
~~~ | ||
... | ||
... Starting the CMORization Tool at time: 2022-07-26 14:02:16 UTC | ||
... Writing program log files to: | ||
/scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/run/main_log.txt | ||
Can all references to b309059 be changed to |
||
/scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/run/main_log_debug.txt | ||
... Starting the CMORization Tool at time: 2024-05-27 13:24:48 UTC | ||
... ---------------------------------------------------------------------- | ||
... input_dir = /home/peter/data/RAWOBS | ||
... output_dir = /home/peter/esmvaltool_output/data_formatting_20220726_140216 | ||
... input_dir = /work/bd0854/DATA/ESMValTool2/RAWOBS | ||
... output_dir = /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448 | ||
It is not clear if b309059 is a user name. I suggest changing this to username and mentioning that this will be substituted by the individual's username (maybe at the start of the episode). |
||
... ---------------------------------------------------------------------- | ||
... Running the CMORization scripts. | ||
... Processing datasets ['FLUXCOM'] | ||
... Input data from: /home/peter/data/RAWOBS/Tier3/FLUXCOM | ||
... Output will be written to: /home/peter/esmvaltool_output/ | ||
data_formatting_20220726_140216/Tier3/FLUXCOM | ||
... Reformat script: /home/peter/mambaforge/envs/esmvaltool/lib/python3.9/ | ||
site-packages/esmvaltool/cmorizers/data/formatters/datasets/fluxcom | ||
... CMORizing dataset FLUXCOM using Python script /home/peter/mambaforge/envs/ | ||
esmvaltool/lib/python3.9/site-packages/esmvaltool/cmorizers/data/formatters/ | ||
datasets/fluxcom.py | ||
... Found input file '/home/peter/data/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.*.nc' | ||
... Input data from: /work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM | ||
... Output will be written to: /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448 | ||
/Tier3/FLUXCOM | ||
... Reformat script: /home/b/b309059/ESMValTool/ESMValTool/esmvaltool/cmorizers/data/formatters/ | ||
datasets/fluxcom | ||
... CMORizing dataset FLUXCOM using Python script /home/b/b309059/ESMValTool/ESMValTool/esmvaltool/ | ||
cmorizers/data/formatters/datasets/fluxcom.py | ||
... Found input file '/work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly. | ||
*.nc' | ||
... CMORizing variable 'gpp' | ||
... Lmon | ||
... Var is gpp | ||
... ... UserWarning: Ignoring netCDF variable 'GPP' invalid units 'gC m-2 day-1' | ||
... WARNING /work/bd0854/b309059/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/ | ||
iris/fileformats/_nc_load_rules/helpers.py:913: _WarnComboIgnoringCfLoad: Ignoring invalid | ||
units 'gC m-2 day-1' on netCDF variable 'GPP'. | ||
warnings.warn( | ||
|
||
... Fixing time... | ||
... Fixing latitude... | ||
... Fixing longitude... | ||
... Flipping dimensional coordinate latitude... | ||
... Saving file | ||
... Saving: /home/peter/esmvaltool_output/data_formatting_20220726_140216/Tier3/ | ||
FLUXCOM/OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_200001-200012.nc | ||
... Saving: /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/Tier3/FLUXCOM/ | ||
OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_198001-198012.nc | ||
... Cube has lazy data [lazy is preferred] | ||
... WARNING /work/bd0854/b309059/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/ | ||
iris/fileformats/netcdf/saver.py:2670: IrisDeprecation: Saving to netcdf with legacy-style | ||
attribute handling for backwards compatibility. | ||
This mode is deprecated since Iris 3.8, and will eventually be removed. | ||
Please consider enabling the new split-attributes handling mode, by setting 'iris.FUTURE. | ||
save_split_attrs = True'. | ||
warn_deprecated(message) | ||
|
||
... CMORization of dataset FLUXCOM finished! | ||
... Formatting successful for dataset FLUXCOM | ||
|
||
~~~ | ||
{: .output} | ||
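
The name of the saved file in the log above consists of underscore-separated fields. A minimal sketch of splitting it, assuming the field order is project, dataset, type, version, MIP table, variable, time range (the shell variable names are illustrative, and the split would not work for a dataset name that itself contains an underscore):

```bash
#!/bin/sh
# File name copied from the "Saving:" line of the log output above.
fname="OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_198001-198012.nc"

# Strip the extension, then split the remainder on underscores.
base="${fname%.nc}"
old_ifs=$IFS
IFS=_
set -- $base
IFS=$old_ifs

# Assumed field order; variable names are hypothetical.
project=$1
dataset=$2
data_type=$3
version=$4
mip=$5
short_name=$6
period=$7

echo "dataset=$dataset mip=$mip variable=$short_name period=$period"
```

A quick check like this can help confirm that the CMORizer wrote the variable and time range you expected before building a test recipe.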
|
||
|
@@ -193,6 +213,12 @@ You can also see the path where ESMValTool stores the reformatting script: | |
have a look at this file if you want. The script also uses a configuration file: | ||
`~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml`. | ||
|
||
To get help on CMORizer commands, run the tool with: | ||
|
||
```bash | ||
esmvaltool data --help | ||
``` | ||
|
||
## Make a test recipe | ||
|
||
To verify that the data is correctly CMORized, we will make a simple test | ||
|
@@ -617,17 +643,23 @@ If we now run the test recipe on our newly 'CMORized' data, | |
esmvaltool run recipe_check_fluxcom.yml --config_file <path to config-user.yml> --log_level debug | ||
``` | ||
|
||
it should be able to find the correct file, but it does not succeed yet. The first | ||
thing that the ESMValTool CMOR checker brings up is: | ||
it should be able to find the correct file, but it does not succeed yet. The ESMValTool CMOR checker | ||
brings up: | ||
|
||
~~~ | ||
iris.exceptions.UnitConversionError: Cannot convert from unknown units. The | ||
"units" attribute may be set directly. | ||
esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP: | ||
GPP: units should be kg m-2 s-1, not unknown | ||
lon: standard_name should be longitude, not None | ||
lat: standard_name should be latitude, not None | ||
lon: units should be degrees_east, not unknown | ||
lon: has values < valid_min = 0.0 | ||
lat: units should be degrees_north, not unknown | ||
GPP: does not match coordinate rank | ||
~~~ | ||
{: .error} | ||
|
||
If you look closely at the error messages, you can see that this error concerns | ||
the units of the coordinates. ESMValTool tries to fix them automatically, | ||
If you look closely at the error messages, you can see that these error concern | ||
Suggested change: that these error concern -> the reasons for these errors. |
@LisaBock - I am not sure why example_output.txt was touched with this PR as it is not used in the CMORization episode. This is an older version of the file I think. Can you update this from the main repository before pushing? |
||
e.g. the units of the coordinates. ESMValTool tries to fix them automatically, | ||
but since no units are defined on the coordinates, this fails. | ||
|
||
The cmorizer utilities also include a function called `fix_coords`, but before | ||
|
@@ -684,7 +716,7 @@ The next error is: | |
|
||
~~~ | ||
esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP: | ||
Variable GPP units unknown can not be converted to kg m-2 s-1 in cube: | ||
GPP: units should be kg m-2 s-1, not unknown | ||
~~~ | ||
{: .error} | ||
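
The error asks for units of kg m-2 s-1, while the log earlier in this diff shows the raw file declaring 'gC m-2 day-1'. As a rough sketch of the arithmetic only, assuming grams of carbon are treated as plain mass in grams, the scale factor between the two is 0.001 / 86400. How the CMORizer actually applies the conversion is not shown in this diff.

```bash
#!/bin/sh
# Scale factor from g m-2 day-1 to kg m-2 s-1 (assumption: gC treated as g).
# 1 g = 0.001 kg and 1 day = 86400 s.
factor=$(awk 'BEGIN { printf "%.6e", 0.001 / 86400 }')
echo "1 g m-2 day-1 = $factor kg m-2 s-1"
```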
|
||
|
Suggested change: dataset does not support -> dataset does not support this option (i.e because all the data is provided as a single file).