From b1da7a19f1be1685663f4322f7867a3ffa8bc8d7 Mon Sep 17 00:00:00 2001 From: Gulcin G Date: Fri, 26 Apr 2024 11:55:28 +0200 Subject: [PATCH 1/7] updated CMORization episode --- _episodes/09-cmorization.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/_episodes/09-cmorization.md b/_episodes/09-cmorization.md index c131c96e..6b58674e 100644 --- a/_episodes/09-cmorization.md +++ b/_episodes/09-cmorization.md @@ -123,6 +123,12 @@ run the CMORizer scripts: esmvaltool data format --config_file ``` +The options `--start` and `--end` can be added to command above to restrict the +formatting of raw data to a time range. They will be ignored if a specific +dataset does not support (i.e. because it is provided as a single file). +Valid formats are `YYYY`, `YYYYMM`, `YYYYMMDD`. The same way is also applicable for +the option `esmvaltool data download`. + The ``config-user.yml`` is the file in which we define the different data paths, see the episode on [Configuration]({{ page.root }}{% link _episodes/03-configuration.md %}). In the ``rootpath`` of your ``config-user.yml``, make sure to add the right @@ -193,6 +199,12 @@ You can also see the path where ESMValTool stores the reformatting script: have a look at this file if you want. The script also uses a configuration file: `~/ESMValTool/esmvaltool/cmorizers/data/cmor_config/FLUXCOM.yml`. 
+To get help on CMORizer commands, run the tool with: + +```bash +esmvaltool data --help +``` + ## Make a test recipe To verify that the data is correctly CMORized, we will make a simple test From 5be74d56d2b1e004477c6b42fb1c3315ef2bc298 Mon Sep 17 00:00:00 2001 From: Lisa Bock Date: Tue, 28 May 2024 14:50:54 +0200 Subject: [PATCH 2/7] small updates --- _episodes/09-cmorization.md | 58 ++++++++++++++++++++++--------------- 1 file changed, 35 insertions(+), 23 deletions(-) diff --git a/_episodes/09-cmorization.md b/_episodes/09-cmorization.md index b2e93a6b..76e33ae9 100644 --- a/_episodes/09-cmorization.md +++ b/_episodes/09-cmorization.md @@ -147,38 +147,44 @@ name that was created to store the raw observation data files, i.e. If everything is okay, the output should look something like this: ~~~ -... -... Starting the CMORization Tool at time: 2022-07-26 14:02:16 UTC +... Writing program log files to: +/scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/run/main_log.txt +/scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/run/main_log_debug.txt +... Starting the CMORization Tool at time: 2024-05-27 13:24:48 UTC ... ---------------------------------------------------------------------- -... input_dir = /home/peter/data/RAWOBS -... output_dir = /home/peter/esmvaltool_output/data_formatting_20220726_140216 +... input_dir = /work/bd0854/DATA/ESMValTool2/RAWOBS +... output_dir = /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448 ... ---------------------------------------------------------------------- ... Running the CMORization scripts. ... Processing datasets ['FLUXCOM'] -... Input data from: /home/peter/data/RAWOBS/Tier3/FLUXCOM -... Output will be written to: /home/peter/esmvaltool_output/ - data_formatting_20220726_140216/Tier3/FLUXCOM -... Reformat script: /home/peter/mambaforge/envs/esmvaltool/lib/python3.9/ - site-packages/esmvaltool/cmorizers/data/formatters/datasets/fluxcom -... 
CMORizing dataset FLUXCOM using Python script /home/peter/mambaforge/envs/ - esmvaltool/lib/python3.9/site-packages/esmvaltool/cmorizers/data/formatters/ - datasets/fluxcom.py -... Found input file '/home/peter/data/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.*.nc' +... Input data from: /work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM +... Output will be written to: /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/Tier3/FLUXCOM +... Reformat script: /home/b/b309059/ESMValTool/ESMValTool/esmvaltool/cmorizers/data/formatters/datasets/fluxcom +... CMORizing dataset FLUXCOM using Python script /home/b/b309059/ESMValTool/ESMValTool/esmvaltool/cmorizers/data/formatters/datasets/fluxcom.py +... Found input file '/work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.*.nc' ... CMORizing variable 'gpp' ... Lmon ... Var is gpp -... ... UserWarning: Ignoring netCDF variable 'GPP' invalid units 'gC m-2 day-1' +... WARNING /work/bd0854/b309059/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/iris/fileformats/_nc_load_rules/helpers.py:913: _WarnComboIgnoringCfLoad: Ignoring invalid u +nits 'gC m-2 day-1' on netCDF variable 'GPP'. + warnings.warn( ... Fixing time... ... Fixing latitude... ... Fixing longitude... ... Flipping dimensional coordinate latitude... ... Saving file -... Saving: /home/peter/esmvaltool_output/data_formatting_20220726_140216/Tier3/ - FLUXCOM/OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_200001-200012.nc +... Saving: /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/Tier3/FLUXCOM/OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_198001-198012.nc ... Cube has lazy data [lazy is preferred] +... WARNING /work/bd0854/b309059/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/iris/fileformats/netcdf/saver.py:2670: IrisDeprecation: Saving to netcdf with legacy-style a +ttribute handling for backwards compatibility. +This mode is deprecated since Iris 3.8, and will eventually be removed. 
+Please consider enabling the new split-attributes handling mode, by setting 'iris.FUTURE.save_split_attrs = True'. + warn_deprecated(message) + ... CMORization of dataset FLUXCOM finished! ... Formatting successful for dataset FLUXCOM + ~~~ {: .output} @@ -629,17 +635,23 @@ If we now run the test recipe on our newly 'CMORized' data, esmvaltool run recipe_check_fluxcom.yml --config_file --log_level debug ``` -it should be able to find the correct file, but it does not succeed yet. The first -thing that the ESMValTool CMOR checker brings up is: +it should be able to find the correct file, but it does not succeed yet. The ESMValTool CMOR checker +brings up: ~~~ -iris.exceptions.UnitConversionError: Cannot convert from unknown units. The -"units" attribute may be set directly. +esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP: + GPP: units should be kg m-2 s-1, not unknown + lon: standard_name should be longitude, not None + lat: standard_name should be latitude, not None + lon: units should be degrees_east, not unknown + lon: has values < valid_min = 0.0 + lat: units should be degrees_north, not unknown + GPP: does not match coordinate rank ~~~ {: .error} -If you look closely at the error messages, you can see that this error concerns -the units of the coordinates. ESMValTool tries to fix them automatically, +If you look closely at the error messages, you can see that these error concern +e.g. the units of the coordinates. ESMValTool tries to fix them automatically, but since no units are defined on the coordinates, this fails. 
The cmorizer utilities also include a function called `fix_coords`, but before @@ -696,7 +708,7 @@ The next error is: ~~~ esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP: -Variable GPP units unknown can not be converted to kg m-2 s-1 in cube: + GPP: units should be kg m-2 s-1, not unknown ~~~ {: .error} From 1bc58f6ac33a80406536234706f47d1086c2b9dd Mon Sep 17 00:00:00 2001 From: Lisa Bock Date: Tue, 28 May 2024 14:53:04 +0200 Subject: [PATCH 3/7] update version --- _episodes/09-cmorization.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_episodes/09-cmorization.md b/_episodes/09-cmorization.md index 76e33ae9..cd2b8a1e 100644 --- a/_episodes/09-cmorization.md +++ b/_episodes/09-cmorization.md @@ -2,7 +2,7 @@ title: "CMORization: adding new datasets to ESMValTool" teaching: 15 exercises: 45 -compatibility: ESMValTool v2.6.0 +compatibility: ESMValTool v2.10.0 questions: - "CMORization: what is it and why do we need it?" From d4c9e77cd51feeacf3106ce739a6622ddf44cf8b Mon Sep 17 00:00:00 2001 From: Lisa Bock Date: Tue, 28 May 2024 15:00:11 +0200 Subject: [PATCH 4/7] fix line length --- _episodes/09-cmorization.md | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/_episodes/09-cmorization.md b/_episodes/09-cmorization.md index cd2b8a1e..d1fcf5ab 100644 --- a/_episodes/09-cmorization.md +++ b/_episodes/09-cmorization.md @@ -158,14 +158,19 @@ If everything is okay, the output should look something like this: ... Running the CMORization scripts. ... Processing datasets ['FLUXCOM'] ... Input data from: /work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM -... Output will be written to: /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/Tier3/FLUXCOM -... Reformat script: /home/b/b309059/ESMValTool/ESMValTool/esmvaltool/cmorizers/data/formatters/datasets/fluxcom -... 
CMORizing dataset FLUXCOM using Python script /home/b/b309059/ESMValTool/ESMValTool/esmvaltool/cmorizers/data/formatters/datasets/fluxcom.py -... Found input file '/work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly.*.nc' +... Output will be written to: /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448 + /Tier3/FLUXCOM +... Reformat script: /home/b/b309059/ESMValTool/ESMValTool/esmvaltool/cmorizers/data/formatters/ + datasets/fluxcom +... CMORizing dataset FLUXCOM using Python script /home/b/b309059/ESMValTool/ESMValTool/esmvaltool/ + cmorizers/data/formatters/datasets/fluxcom.py +... Found input file '/work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly. + *.nc' ... CMORizing variable 'gpp' ... Lmon ... Var is gpp -... WARNING /work/bd0854/b309059/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/iris/fileformats/_nc_load_rules/helpers.py:913: _WarnComboIgnoringCfLoad: Ignoring invalid u +... WARNING /work/bd0854/b309059/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/ + iris/fileformats/_nc_load_rules/helpers.py:913: _WarnComboIgnoringCfLoad: Ignoring invalid u nits 'gC m-2 day-1' on netCDF variable 'GPP'. warnings.warn( @@ -174,12 +179,15 @@ nits 'gC m-2 day-1' on netCDF variable 'GPP'. ... Fixing longitude... ... Flipping dimensional coordinate latitude... ... Saving file -... Saving: /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/Tier3/FLUXCOM/OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_198001-198012.nc +... Saving: /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/Tier3/FLUXCOM/ + OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_198001-198012.nc ... Cube has lazy data [lazy is preferred] -... WARNING /work/bd0854/b309059/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/iris/fileformats/netcdf/saver.py:2670: IrisDeprecation: Saving to netcdf with legacy-style a +... 
WARNING /work/bd0854/b309059/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/ + iris/fileformats/netcdf/saver.py:2670: IrisDeprecation: Saving to netcdf with legacy-style a ttribute handling for backwards compatibility. This mode is deprecated since Iris 3.8, and will eventually be removed. -Please consider enabling the new split-attributes handling mode, by setting 'iris.FUTURE.save_split_attrs = True'. +Please consider enabling the new split-attributes handling mode, by setting 'iris.FUTURE. +save_split_attrs = True'. warn_deprecated(message) ... CMORization of dataset FLUXCOM finished! From 0ef9a3a7213a721e05023f9a3282e36b93f91a07 Mon Sep 17 00:00:00 2001 From: Lisa Bock Date: Tue, 19 Nov 2024 17:09:30 +0100 Subject: [PATCH 5/7] changes regarding the review --- _episodes/09-cmorization.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/_episodes/09-cmorization.md b/_episodes/09-cmorization.md index d1fcf5ab..60b49af2 100644 --- a/_episodes/09-cmorization.md +++ b/_episodes/09-cmorization.md @@ -125,7 +125,7 @@ esmvaltool data format --config_file The options `--start` and `--end` can be added to command above to restrict the formatting of raw data to a time range. They will be ignored if a specific -dataset does not support (i.e. because it is provided as a single file). +dataset does not support this option (e.g. because all the data is provided as a single file). Valid formats are `YYYY`, `YYYYMM`, `YYYYMMDD`. The same way is also applicable for the option `esmvaltool data download`. @@ -153,7 +153,7 @@ If everything is okay, the output should look something like this: ... Starting the CMORization Tool at time: 2024-05-27 13:24:48 UTC ... ---------------------------------------------------------------------- ... input_dir = /work/bd0854/DATA/ESMValTool2/RAWOBS -... output_dir = /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448 +... 
output_dir = /scratch/b/username/esmvaltool_output/data_formatting_20240527_132448 ... ---------------------------------------------------------------------- ... Running the CMORization scripts. ... Processing datasets ['FLUXCOM'] @@ -658,7 +658,7 @@ esmvalcore.cmor.check.CMORCheckError: There were errors in variable GPP: ~~~ {: .error} -If you look closely at the error messages, you can see that these error concern +If you look closely at the error messages, you can see the reasons for these errors, e.g. the units of the coordinates. ESMValTool tries to fix them automatically, but since no units are defined on the coordinates, this fails. From e6f6eec8c630c789695b7823537db68b815bac40 Mon Sep 17 00:00:00 2001 From: Lisa Bock Date: Tue, 19 Nov 2024 17:17:23 +0100 Subject: [PATCH 6/7] change compatibility --- _episodes/09-cmorization.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_episodes/09-cmorization.md b/_episodes/09-cmorization.md index 60b49af2..e3eddc5b 100644 --- a/_episodes/09-cmorization.md +++ b/_episodes/09-cmorization.md @@ -2,7 +2,7 @@ title: "CMORization: adding new datasets to ESMValTool" teaching: 15 exercises: 45 -compatibility: ESMValTool v2.10.0 +compatibility: ESMValTool v2.11.0 questions: - "CMORization: what is it and why do we need it?" From d6ffa26c321fd840f88bdf22b69a1def48db1a56 Mon Sep 17 00:00:00 2001 From: Lisa Bock Date: Thu, 21 Nov 2024 09:24:49 +0100 Subject: [PATCH 7/7] insert 'username' --- _episodes/09-cmorization.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/_episodes/09-cmorization.md b/_episodes/09-cmorization.md index e3eddc5b..2a10c92d 100644 --- a/_episodes/09-cmorization.md +++ b/_episodes/09-cmorization.md @@ -148,8 +148,8 @@ If everything is okay, the output should look something like this: ~~~ ... 
Writing program log files to: -/scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/run/main_log.txt -/scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/run/main_log_debug.txt +/scratch/b/username/esmvaltool_output/data_formatting_20240527_132448/run/main_log.txt +/scratch/b/username/esmvaltool_output/data_formatting_20240527_132448/run/main_log_debug.txt ... Starting the CMORization Tool at time: 2024-05-27 13:24:48 UTC ... ---------------------------------------------------------------------- ... input_dir = /work/bd0854/DATA/ESMValTool2/RAWOBS @@ -158,18 +158,18 @@ If everything is okay, the output should look something like this: ... Running the CMORization scripts. ... Processing datasets ['FLUXCOM'] ... Input data from: /work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM -... Output will be written to: /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448 +... Output will be written to: /scratch/b/username/esmvaltool_output/data_formatting_20240527_132448 /Tier3/FLUXCOM -... Reformat script: /home/b/b309059/ESMValTool/ESMValTool/esmvaltool/cmorizers/data/formatters/ +... Reformat script: /home/b/username/ESMValTool/ESMValTool/esmvaltool/cmorizers/data/formatters/ datasets/fluxcom -... CMORizing dataset FLUXCOM using Python script /home/b/b309059/ESMValTool/ESMValTool/esmvaltool/ +... CMORizing dataset FLUXCOM using Python script /home/b/username/ESMValTool/ESMValTool/esmvaltool/ cmorizers/data/formatters/datasets/fluxcom.py ... Found input file '/work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/FLUXCOM/GPP.ANN.CRUNCEPv6.monthly. *.nc' ... CMORizing variable 'gpp' ... Lmon ... Var is gpp -... WARNING /work/bd0854/b309059/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/ +... WARNING /work/bd0854/username/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/ iris/fileformats/_nc_load_rules/helpers.py:913: _WarnComboIgnoringCfLoad: Ignoring invalid u nits 'gC m-2 day-1' on netCDF variable 'GPP'. 
warnings.warn( @@ -179,10 +179,10 @@ nits 'gC m-2 day-1' on netCDF variable 'GPP'. ... Fixing longitude... ... Flipping dimensional coordinate latitude... ... Saving file -... Saving: /scratch/b/b309059/esmvaltool_output/data_formatting_20240527_132448/Tier3/FLUXCOM/ +... Saving: /scratch/b/username/esmvaltool_output/data_formatting_20240527_132448/Tier3/FLUXCOM/ OBS_FLUXCOM_reanaly_ANN-v1_Lmon_gpp_198001-198012.nc ... Cube has lazy data [lazy is preferred] -... WARNING /work/bd0854/b309059/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/ +... WARNING /work/bd0854/username/utils/mambaforge/envs/esmvaltool/lib/python3.11/site-packages/ iris/fileformats/netcdf/saver.py:2670: IrisDeprecation: Saving to netcdf with legacy-style a ttribute handling for backwards compatibility. This mode is deprecated since Iris 3.8, and will eventually be removed.