From ca068d062db763988751259fdfa6916a656dd8df Mon Sep 17 00:00:00 2001
From: Ian
Date: Thu, 25 Jul 2024 14:18:44 -0400
Subject: [PATCH 01/15] Update and rename README.md to README_using_fre-cli.md

move current README to be a fre-cli-flavored README
---
 README.md => README_using_fre-cli.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
 rename README.md => README_using_fre-cli.md (96%)

diff --git a/README.md b/README_using_fre-cli.md
similarity index 96%
rename from README.md
rename to README_using_fre-cli.md
index 79d8928..0867772 100644
--- a/README.md
+++ b/README_using_fre-cli.md
@@ -1,4 +1,6 @@
-# Instructions to postprocess FMS history output on PP/AN or gaea
+note these instructions will be/are from https://github.com/NOAA-GFDL/fre-cli/tree/main/fre/pp
+
+# Instructions to postprocess FMS history output on PP/AN or gaea with fre-cli
 1. Checkout postprocessing workflow template
 This will clone the postprocessing repository into `/home/$USER/cylc-src/EXPNAME__PLATFORM__TARGET`.

From c689d2260823d042f885ff2d41d2d6f7769ee536 Mon Sep 17 00:00:00 2001
From: Ian
Date: Thu, 25 Jul 2024 14:19:07 -0400
Subject: [PATCH 02/15] Create README.md

add old instructions with rose and stuff... etc.
---
 README.md | 240 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 240 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..89455cb
--- /dev/null
+++ b/README.md
@@ -0,0 +1,240 @@
+# Instructions to postprocess FMS history output on PP/AN
+
+1. Clone postprocessing template repository
+
+```
+git clone --recursive https://gitlab.gfdl.noaa.gov/fre2/workflows/postprocessing.git
+cd postprocessing
+```
+- [+ Do not clone to a temporary directory - the directory in question needs to be available for slum to read from all nodes, and local /vftmp is not. /home, /work, and /xtmp are. +]
+
+2. Load Cylc, the backend workflow engine used by Canopy
+
+```
+module load cylc
+```
+
+3. Create new configuration from empty template, where EXPNAME is the name of your new configuration/experiment
+
+```
+cp opt/TEMPLATE.conf opt/rose-suite-EXPNAME.conf
+```
+
+4. Add required configuration items, led by schema prompting
+
+```
+rose macro --validate
+```
+
+5. Add configuration items to rose-suite.conf or opt/rose-suite-EXPNAME.conf.
+
+```
+vi rose-suite.conf # Configuration for all experiments
+vi opt/rose-suite-EXPNAME.conf # Configuration for EXPNAME; can override default settings
+```
+
+Continue to add/uncomment required configuration items, until there are no schema violations.
+Use double-quotes in the values!
+
+Key values include:
+- HISTORY_DIR: directory path to your raw model output
+- HISTORY_SEGMENT: duration of each history segment (ISO8601)
+- PP_CHUNK_A: duration of your desired timeseries (and timeaverages, optionally)
+- PP_COMPONENTS: string-separated list of user-defined components
+- PP_START: start of the desired postprocssing (ISO8601)
+- PP_STOP: end of the desired postprocessing (ISO8601)
+
+Other currently required values include:
+- DEFAULT_XY_INTERP: e.g. "288,180". This is the default regridded grid.
+- FRE_ANALYSIS_HOME: For locating shared analysis scripts. (Should not be required unless DO_ANALYSIS, however)
+- PP_GRID_SPEC: filepath to FMS grid definition tarfile
+- SITE: set to "ppan" to submit jobs to PP/AN cluster
+
+6. Configure your postprocessing components
+
+A postprocessed component, originally defined in FRE Bronx and used in Canopy, is a user-defined label that has two main qualities:
+- a single target horizontal grid: i.e. native atmosphere; native ocean; or regridded spherical (lat/lon)
+- history files that should be included in the component
+
+FMS history files are limited to a single time dimension, so commonly, multiple history files are mapped to
+a single postprocess component. In the following examples, we will create an `atmos` component as a 1x1-degree regridded grid composed of the history files `atmos_month` and `atmos_daily`; a `land` component is regridded to a reduced 2-degree grid, and should contain `land_month`, `land_daily`, and `land_static`. Finally, a `atmos_scalar` component should be left on the native grid, and should contain `atmos_scalar` and `atmos_global_cmip`.
+
+The steps for postprocess component configuration are:
+1. Set the PP components you wish to process in the PP_COMPONENTS in the rose-suite file(s) described above.
+2. Define the history file mapping in `app/remap-pp-components/rose-app.conf` and the regridding details in
+`app/regrid-xy/rose-app.conf`.
+3. Use `rose macro --validate` throughout for configuration validation.
+
+For example, to postprocess the 3 components "atmos", "land", and "atmos_scalar", set in your opt/rose-suite-LABEL.conf file:
+
+```
+PP_COMPONENTS="atmos land atmos_scalar"
+```
+
+Then, let `rose macro --validate` advise your edits. When the validation errors go away, your configuration is valid and consistent.
+After setting the `PP_COMPONENTS` above, the configuration validation will ensure configuration consistency and completeness.
+
+```
+rose macro --validate
+
+[V] components.ComponentChecker: issues: 3
+    (opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
+        Requested component 'atmos' is not defined in /nbhome/c2b/git/fre2/workflows/postprocessing/meta/lib/python/macros/../../../../app/remap-pp-components/rose-app.conf
+    (opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
+        Requested component 'land' is not defined in /nbhome/c2b/git/fre2/workflows/postprocessing/meta/lib/python/macros/../../../../app/remap-pp-components/rose-app.conf
+    (opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
+        Requested component 'atmos_scalar' is not defined in /nbhome/c2b/git/fre2/workflows/postprocessing/meta/lib/python/macros/../../../../app/remap-pp-components/rose-app.conf
+```
+
+To create your desired history file to postprocess component remapping, it's helpful (i.e. until history file manifests exist) to list the contents of the history tarfile in order to create your postprocessing configuration.
+
+```
+tar -tf /path/to/history/YYYYMMDD.nc.tar | grep -v "tile[2-6]" | sort
+```
+
+Each history file reported above may be included in your postprocess components. For example, here is a remap-pp-components/rose-app.conf file:
+
+```
+[atmos]
+sources=atmos_month
+        atmos_daily
+grid=regrid-xy/default
+
+[land]
+sources=land_month_cmip
+        land_daily_cmip
+grid=regrid-xy/2deg
+
+[land.static]
+sources=land_static
+grid=regrid-xy/default
+freq=P0Y
+
+[atmos_scalar]
+sources=atmos_scalar
+        atmos_global_cmip
+grid=native
+```
+
+Explanation / discussion:
+- The Rose configuation file format is described here: https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html
+- The header sections identify the PP components. PP components may not contain periods! Any text after a period is optional, and merely serves to allow another section for remapping. (e.g. "land" and "land.static" identify the "land" component)
+- The "grid" attribute should be either "native", "regrid-xy/default", or "regrid-xy/LABEL". The regrid-xy label (default or user-defined) are defined in the app/regrid-xy/rose-app.conf file, described next.
+- The "freq" attribute has special meaning and is needed for static processing. If "freq" is set to "0PY", only the static
+variables will be remapped. If "freq" is unset, then all temporal frequencies are included.
+
+After adding your entries to `app/remap-pp-components/rose-app.conf`, run `rose macro --validate` again:
+
+```
+rose macro --validate
+
+(opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
+Requested component 'atmos' uses history file 'atmos_daily' with regridding label 'regrid-xy/default', but this was not found in app/regrid-xy/rose-app.conf
+(opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
+Requested component 'atmos' uses history file 'atmos_month' with regridding label 'regrid-xy/default', but this was not found in app/regrid-xy/rose-app.conf
+(opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
+Requested component 'land' uses history file 'land_daily_cmip' with regridding label 'regrid-xy/2deg', but this was not found in app/regrid-xy/rose-app.conf
+(opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
+Requested component 'land' uses history file 'land_month_cmip' with regridding label 'regrid-xy/2deg', but this was not found in app/regrid-xy/rose-app.conf
+(opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
+Requested component 'land' uses history file 'land_static' with regridding label 'regrid-xy/2deg', but this was not found in app/regrid-xy/rose-app.conf
+```
+
+Now add corresponding regridding instructions for the regridding labels. This can be added to `app/regrid-xy/rose-app.conf`:
+
+```
+[atmos]
+inputGrid=cubedsphere
+inputRealm=atmos
+interpMethod=conserve_order2
+outputGridType=default
+sources=atmos_month
+        atmos_daily
+
+[land]
+inputGrid=cubedsphere
+inputRealm=land
+interpMethod=conserve_order1
+outputGridLon=144
+outputGridLat=90
+outputGridType=2deg
+sources=land_month_cmip
+        land_daily_cmip
+        land_static
+```
+
+Explanation / discussion:
+- The header sections identify the regridding instructions for a list of history files, and do not have meaning other than being unique.
+- The inputGrid attribute should be "cubedsphere" or "tripolar".
+- The inputRealm attribute is used for identifying the land or atmos grid mosaic file: should be "atmos", "land", or "ocean".
+- The interpMethod should be "conserve_order1", "conserve_order2", or "bilinear."
+- OutputGridType is the grid label referenced in the `app/remap-pp-components/rose-app.conf` file.
+- If OutputGridType is "default", then the DEFAULT_XY_INTERP setting is used. Otherwise, OutputGridLat and OutputGridLon identify the target grid.
+
+7. Optionally, report on history files that may be missing
+
+Generate a "history manifest" file by listing the contents of a history tarfile to a file called 'history-manifest'.
+
+```
+tar -tf /path/to/history/YYYYMMDD.nc.tar | grep -v "tile[2-6]" | sort > history-manifest
+```
+
+If `history-manifest` exists, `rose macro --validate` will report on history files referenced but not present.
+
+Probably, you should remove components that specify non-existent history files, reconfigure the component definition,
+or trust that the missing history files will be created by a refineDiag script.
+
+
+8. Validate the configuration
+
+`rose macro --validate` should report no errors.
+
+Then, validate the Cylc configuration:
+
+`bin/validate-exp EXPNAME`
+
+Please complain (to a Canopy developer) or take a note if if the Cylc validation fails but the Rose validation passes,
+as this may expose some internal problems or quoting issues.
+
+9. Install the workflow
+
+```
+bin/install-exp EXPNAME
+```
+
+This installs the workflow run directory in `~/cylc-run//runN`, where N is an incrementing number (like FRE --unique). The various `cylc` commands act on the most recent `runN` by default.
+
+10. Run the workflow
+
+```
+cylc play EXPNAME
+```
+
+The workflow runs a daemon on the `workflow1` server (via `ssh`, so you see the login banner).
+
+11. Inspect workflow progress
+
+The workflow will run and shutdown when all tasks are complete. If tasks fail, the workflow may stall, in which case
+it will shutdown in error after a period of time.
+
+Cylc has two workflow viewing interfaces (full GUI and text UI), and a variety of CLI commands that can expose workflow
+and task information. The text-based GUI can be launched via:
+
+```
+cylc tui EXPNAME
+```
+
+The full GUI can be launched on jhan or jhanbigmem (an107 or an201).
+```
+cylc gui --ip=`hostname -f` --port=`jhp 1` --no-browser
+```
+
+Then, navigate to one of the two links printed to screen in your web browser
+
+Various other cylc commands are useful for inspecting a running workflow. Try "cylc help".
+
+- cylc scan: Lists running workflows
+- cylc workflow-state EXPNAME: Lists all task states
+- cylc cat-log EXPNAME: Show the scheduler log
+- cylc list EXPNAME: Lists all tasks
+- cylc report-timings EXPNAME

From 01245a0fd5e40ba287aef90719d98fe61f0d7514 Mon Sep 17 00:00:00 2001
From: Ian
Date: Thu, 25 Jul 2024 14:23:52 -0400
Subject: [PATCH 03/15] Update README.md

---
 README.md | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 89455cb..538734a 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,13 @@
+
+
 # Instructions to postprocess FMS history output on PP/AN
 1. Clone postprocessing template repository
 ```
-git clone --recursive https://gitlab.gfdl.noaa.gov/fre2/workflows/postprocessing.git
+git clone https://github.com/NOAA-GFDL/fre-workflows.git
 cd postprocessing
 ```
 - [+ Do not clone to a temporary directory - the directory in question needs to be available for slum to read from all nodes, and local /vftmp is not. /home, /work, and /xtmp are. +]
@@ -15,19 +19,19 @@ module load cylc
 ```
 3. Create new configuration from empty template, where EXPNAME is the name of your new configuration/experiment
-
+this step should be updated i think
 ```
 cp opt/TEMPLATE.conf opt/rose-suite-EXPNAME.conf
 ```
 4. Add required configuration items, led by schema prompting
-
+while we're still slightly dependent on rose
 ```
 rose macro --validate
 ```
 5. Add configuration items to rose-suite.conf or opt/rose-suite-EXPNAME.conf.
-
+this step should be updated i think
 ```
 vi rose-suite.conf # Configuration for all experiments
 vi opt/rose-suite-EXPNAME.conf # Configuration for EXPNAME; can override default settings

From b0eb0c5928e4345c555f0378274b0c71e0a5400d Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 31 Jul 2024 08:31:32 -0400
Subject: [PATCH 04/15] Delete .gitmodules

no longer using a submodule approach
---
 .gitmodules | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 delete mode 100644 .gitmodules

diff --git a/.gitmodules b/.gitmodules
deleted file mode 100644
index e69de29..0000000

From 367884570e6ded0cfb6b7bd2f1b6b22914b97d6a Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 31 Jul 2024 09:56:17 -0400
Subject: [PATCH 05/15] Update README.md

formatting and edits, clean up.
---
 README.md | 46 ++++++++++++++++++++++++----------------------
 1 file changed, 24 insertions(+), 22 deletions(-)

diff --git a/README.md b/README.md
index 538734a..fc4cb4d 100644
--- a/README.md
+++ b/README.md
@@ -2,41 +2,46 @@
 based on: https://gitlab.gfdl.noaa.gov/fre2/workflows/postprocessing/-/raw/d51df76e537222a3c78405b5749fe59306e6d2bd/README.md
 -->
-# Instructions to postprocess FMS history output on PP/AN
-1. Clone postprocessing template repository
+# Instructions to postprocess FMS history files on GFDL's PP/AN
+### 1. Clone `fre-workflows` repository
 ```
 git clone https://github.com/NOAA-GFDL/fre-workflows.git
-cd postprocessing
+cd fre-workflows
 ```
-- [+ Do not clone to a temporary directory - the directory in question needs to be available for slum to read from all nodes, and local /vftmp is not. /home, /work, and /xtmp are. +]
+**Do not clone to a temporary directory** - the directory in question needs to be available for slurm to read from all nodes,
+and local /vftmp is not. /home, /work, and /xtmp are.**
-2. Load Cylc, the backend workflow engine used by Canopy
+### 2. Load Cylc, the backend workflow engine used by Canopy
 ```
 module load cylc
 ```
+`cylc` lets us parse workflow template files (`*.cylc`) and their configurations into modular, interdependent batch jobs. Tools
+used by those jobs (e.g. `fre-nctools` or `xarray`) should be loaded by those jobs as part of their requirements and do not
+need to be loaded at this time unless desired by the user.
-3. Create new configuration from empty template, where EXPNAME is the name of your new configuration/experiment
+
+### 3. UPDATEME Create new configuration from empty template, where EXPNAME is the name of your new configuration/experiment
 this step should be updated i think
 ```
 cp opt/TEMPLATE.conf opt/rose-suite-EXPNAME.conf
 ```
-4. Add required configuration items, led by schema prompting
+
+### 4. UPDATEMEAdd required configuration items, led by schema prompting
 while we're still slightly dependent on rose
 ```
 rose macro --validate
 ```
-5. Add configuration items to rose-suite.conf or opt/rose-suite-EXPNAME.conf.
-this step should be updated i think
+
+### 5. UPDATEME Add configuration items to rose-suite.conf or opt/rose-suite-EXPNAME.conf.
 ```
 vi rose-suite.conf # Configuration for all experiments
 vi opt/rose-suite-EXPNAME.conf # Configuration for EXPNAME; can override default settings
 ```
-
 Continue to add/uncomment required configuration items, until there are no schema violations.
 Use double-quotes in the values!
@@ -54,7 +59,8 @@ Other currently required values include:
 - PP_GRID_SPEC: filepath to FMS grid definition tarfile
 - SITE: set to "ppan" to submit jobs to PP/AN cluster
-6. Configure your postprocessing components
+
+### 6. UPDATEME Configure your postprocessing components
 A postprocessed component, originally defined in FRE Bronx and used in Canopy, is a user-defined label that has two main qualities:
 - a single target horizontal grid: i.e. native atmosphere; native ocean; or regridded spherical (lat/lon)
@@ -175,10 +181,9 @@ Explanation / discussion:
 - OutputGridType is the grid label referenced in the `app/remap-pp-components/rose-app.conf` file.
 - If OutputGridType is "default", then the DEFAULT_XY_INTERP setting is used. Otherwise, OutputGridLat and OutputGridLon identify the target grid.
-7. Optionally, report on history files that may be missing
+### 7. UPDATEME Optionally, report on history files that may be missing
 Generate a "history manifest" file by listing the contents of a history tarfile to a file called 'history-manifest'.
-
 ```
 tar -tf /path/to/history/YYYYMMDD.nc.tar | grep -v "tile[2-6]" | sort > history-manifest
 ```
@@ -189,41 +194,36 @@ Probably, you should remove components that specify non-existent history files,
 or trust that the missing history files will be created by a refineDiag script.
-8. Validate the configuration
-
+### 8. UPDATEME Validate the configuration
 `rose macro --validate` should report no errors.
 Then, validate the Cylc configuration:
-
 `bin/validate-exp EXPNAME`
 Please complain (to a Canopy developer) or take a note if if the Cylc validation fails but the Rose validation passes,
 as this may expose some internal problems or quoting issues.
-9. Install the workflow
+### 9. UPDATEME Install the workflow
 ```
 bin/install-exp EXPNAME
 ```
-
 This installs the workflow run directory in `~/cylc-run//runN`, where N is an incrementing number (like FRE --unique). The various `cylc` commands act on the most recent `runN` by default.
-10. Run the workflow
+### 10. UPDATEME Run the workflow
 ```
 cylc play EXPNAME
 ```
-
 The workflow runs a daemon on the `workflow1` server (via `ssh`, so you see the login banner).
-11. Inspect workflow progress
+### 11. UPDATEME Inspect workflow progress with an interface (GUI or TUI)
 The workflow will run and shutdown when all tasks are complete. If tasks fail, the workflow may stall, in which case
 it will shutdown in error after a period of time.
 Cylc has two workflow viewing interfaces (full GUI and text UI), and a variety of CLI commands that can expose workflow
 and task information. The text-based GUI can be launched via:
-
 ```
 cylc tui EXPNAME
 ```
@@ -235,6 +235,8 @@ cylc gui --ip=`hostname -f` --port=`jhp 1` --no-browser
 Then, navigate to one of the two links printed to screen in your web browser
+
+### 12. UPDATEME Inspect workflow progress with a terminal CLI
 Various other cylc commands are useful for inspecting a running workflow. Try "cylc help".
 - cylc scan: Lists running workflows

From 3b5f2077c91bcfd727ec5737099fe9902223fc1c Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 31 Jul 2024 10:51:38 -0400
Subject: [PATCH 06/15] Update README.md

another pass- more formatting, backticks for code objects or variables or field values etc. break up some long lines.
---
 README.md | 90 +++++++++++++++++++++++++++++--------------------------
 1 file changed, 48 insertions(+), 42 deletions(-)

diff --git a/README.md b/README.md
index fc4cb4d..0d994f1 100644
--- a/README.md
+++ b/README.md
@@ -11,16 +11,16 @@ git clone https://github.com/NOAA-GFDL/fre-workflows.git
 cd fre-workflows
 ```
 **Do not clone to a temporary directory** - the directory in question needs to be available for slurm to read from all nodes,
-and local /vftmp is not. /home, /work, and /xtmp are.**
+and local `/vftmp` is not. `/home`, `/work`, and `/xtmp` are.
 ### 2. Load Cylc, the backend workflow engine used by Canopy
 ```
 module load cylc
 ```
-`cylc` lets us parse workflow template files (`*.cylc`) and their configurations into modular, interdependent batch jobs. Tools
-used by those jobs (e.g. `fre-nctools` or `xarray`) should be loaded by those jobs as part of their requirements and do not
-need to be loaded at this time unless desired by the user.
+[`cylc`](https://cylc.github.io/cylc-doc/stable/html/) lets us parse workflow template files (`*.cylc`) and their
+configurations into modular, interdependent batch jobs. Tools used by those jobs (e.g. `fre-nctools` or `xarray`) should be
+loaded by those jobs as part of their requirements and do not need to be loaded at this time unless desired.
 ### 3. UPDATEME Create new configuration from empty template, where EXPNAME is the name of your new configuration/experiment
@@ -46,43 +46,48 @@ Continue to add/uncomment required configuration items, until there are no schem
 Use double-quotes in the values!
 Key values include:
-- HISTORY_DIR: directory path to your raw model output
-- HISTORY_SEGMENT: duration of each history segment (ISO8601)
-- PP_CHUNK_A: duration of your desired timeseries (and timeaverages, optionally)
-- PP_COMPONENTS: string-separated list of user-defined components
-- PP_START: start of the desired postprocssing (ISO8601)
-- PP_STOP: end of the desired postprocessing (ISO8601)
+- `HISTORY_DIR` directory path to your raw model output
+- `HISTORY_SEGMENT` duration of each history segment (ISO8601)
+- `PP_CHUNK_A` duration of your desired timeseries (and timeaverages, optionally)
+- `PP_COMPONENTS` string-separated list of user-defined components
+- `PP_START` start of the desired postprocssing (ISO8601)
+- `PP_STOP` end of the desired postprocessing (ISO8601)
 Other currently required values include:
-- DEFAULT_XY_INTERP: e.g. "288,180". This is the default regridded grid.
-- FRE_ANALYSIS_HOME: For locating shared analysis scripts. (Should not be required unless DO_ANALYSIS, however)
-- PP_GRID_SPEC: filepath to FMS grid definition tarfile
-- SITE: set to "ppan" to submit jobs to PP/AN cluster
+- `DEFAULT_XY_INTERP` e.g. `"288,180"`. This is the default regridded grid.
+- `FRE_ANALYSIS_HOME` For locating shared analysis scripts. (Should not be required unless `DO_ANALYSIS`, however)
+- `PP_GRID_SPEC` filepath to FMS grid definition tarfile
+- `SITE` set to "ppan" to submit jobs to PP/AN cluster
 ### 6. UPDATEME Configure your postprocessing components
-A postprocessed component, originally defined in FRE Bronx and used in Canopy, is a user-defined label that has two main qualities:
+A postprocessed component, originally defined in FRE Bronx and used in Canopy, is a user-defined label that has two main
+qualities:
 - a single target horizontal grid: i.e. native atmosphere; native ocean; or regridded spherical (lat/lon)
 - history files that should be included in the component
 FMS history files are limited to a single time dimension, so commonly, multiple history files are mapped to
-a single postprocess component. In the following examples, we will create an `atmos` component as a 1x1-degree regridded grid composed of the history files `atmos_month` and `atmos_daily`; a `land` component is regridded to a reduced 2-degree grid, and should contain `land_month`, `land_daily`, and `land_static`. Finally, a `atmos_scalar` component should be left on the native grid, and should contain `atmos_scalar` and `atmos_global_cmip`.
+a single postprocess component. In the following examples, we will create an `atmos` component as a 1x1-degree regridded
+grid composed of the history files `atmos_month` and `atmos_daily`; a `land` component is regridded to a reduced 2-degree
+grid, and should contain `land_month`, `land_daily`, and `land_static`. Finally, a `atmos_scalar` component should be left
+on the native grid, and should contain `atmos_scalar` and `atmos_global_cmip`.
 The steps for postprocess component configuration are:
-1. Set the PP components you wish to process in the PP_COMPONENTS in the rose-suite file(s) described above.
+1. Set the PP components you wish to process in the `PP_COMPONENTS` in the rose-suite file(s) described above.
 2. Define the history file mapping in `app/remap-pp-components/rose-app.conf` and the regridding details in
 `app/regrid-xy/rose-app.conf`.
 3. Use `rose macro --validate` throughout for configuration validation.
-For example, to postprocess the 3 components "atmos", "land", and "atmos_scalar", set in your opt/rose-suite-LABEL.conf file:
+For example, to postprocess the 3 components `atmos`, `land`, and `atmos_scalar`, set in your `opt/rose-suite-LABEL.conf` file:
 ```
 PP_COMPONENTS="atmos land atmos_scalar"
 ```
-Then, let `rose macro --validate` advise your edits. When the validation errors go away, your configuration is valid and consistent.
-After setting the `PP_COMPONENTS` above, the configuration validation will ensure configuration consistency and completeness.
+Then, let `rose macro --validate` advise your edits. When the validation errors go away, your configuration is valid and
+consistent. After setting the `PP_COMPONENTS` above, the configuration validation will ensure configuration consistency and
+completeness.
 ```
 rose macro --validate
@@ -96,14 +101,15 @@ rose macro --validate
         Requested component 'atmos_scalar' is not defined in /nbhome/c2b/git/fre2/workflows/postprocessing/meta/lib/python/macros/../../../../app/remap-pp-components/rose-app.conf
 ```
-To create your desired history file to postprocess component remapping, it's helpful (i.e. until history file manifests exist) to list the contents of the history tarfile in order to create your postprocessing configuration.
+To create your desired history file to postprocess component remapping, it's helpful (i.e. until history file manifests exist) to
+list the contents of the history tarfile in order to create your postprocessing configuration.
 ```
 tar -tf /path/to/history/YYYYMMDD.nc.tar | grep -v "tile[2-6]" | sort
 ```
-Each history file reported above may be included in your postprocess components. For example, here is a remap-pp-components/rose-app.conf file:
-
+Each history file reported above may be included in your postprocess components. For example, here is an
+`app/remap-pp-components/rose-app.conf` file:
 ```
 [atmos]
 sources=atmos_month
@@ -127,11 +133,11 @@ grid=native
 ```
 Explanation / discussion:
-- The Rose configuation file format is described here: https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html
-- The header sections identify the PP components. PP components may not contain periods! Any text after a period is optional, and merely serves to allow another section for remapping. (e.g. "land" and "land.static" identify the "land" component)
-- The "grid" attribute should be either "native", "regrid-xy/default", or "regrid-xy/LABEL". The regrid-xy label (default or user-defined) are defined in the app/regrid-xy/rose-app.conf file, described next.
-- The "freq" attribute has special meaning and is needed for static processing. If "freq" is set to "0PY", only the static
-variables will be remapped. If "freq" is unset, then all temporal frequencies are included.
+- The Rose configuation file format is described (here)[https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html]
+- The header sections identify the PP components. PP components may not contain periods! Any text after a period is optional, and merely serves to allow another section for remapping. (e.g. `land` and `land.static` identify the `land` component)
+- - The `grid` attribute should be either "native", "regrid-xy/default", or "regrid-xy/LABEL". The regrid-xy label (default or user-defined) are defined in the `app/regrid-xy/rose-app.conf` file, described next.
+- The `freq` attribute has special meaning and is needed for static processing. If `freq` is set to `0PY`, only the static
+variables will be remapped. If `freq` is unset, then all temporal frequencies are included.
 After adding your entries to `app/remap-pp-components/rose-app.conf`, run `rose macro --validate` again:
@@ -175,11 +181,11 @@ sources=land_month_cmip
 Explanation / discussion:
 - The header sections identify the regridding instructions for a list of history files, and do not have meaning other than being unique.
-- The inputGrid attribute should be "cubedsphere" or "tripolar".
-- The inputRealm attribute is used for identifying the land or atmos grid mosaic file: should be "atmos", "land", or "ocean".
-- The interpMethod should be "conserve_order1", "conserve_order2", or "bilinear."
-- OutputGridType is the grid label referenced in the `app/remap-pp-components/rose-app.conf` file.
-- If OutputGridType is "default", then the DEFAULT_XY_INTERP setting is used. Otherwise, OutputGridLat and OutputGridLon identify the target grid.
+- The `inputGrid` attribute should be `cubedsphere` or `tripolar`.
+- The `inputRealm` attribute is used for identifying the land or atmos grid mosaic file: should be `atmos`, `land`, or `ocean`.
+- The `interpMethod` should be `conserve_order1`, `conserve_order2`, or `bilinear`.
+- `OutputGridType` is the grid label referenced in the `app/remap-pp-components/rose-app.conf` file.
+- If `OutputGridType` is `default`, then the `DEFAULT_XY_INTERP` setting is used. Otherwise, `OutputGridLat` and `OutputGridLon` identify the target grid.
 ### 7. UPDATEME Optionally, report on history files that may be missing
@@ -208,7 +214,8 @@ as this may expose some internal problems or quoting issues.
 ```
 bin/install-exp EXPNAME
 ```
-This installs the workflow run directory in `~/cylc-run//runN`, where N is an incrementing number (like FRE --unique). The various `cylc` commands act on the most recent `runN` by default.
+This installs the workflow run directory in `~/cylc-run//runN`, where `N` is an incrementing number (like `FRE --unique`).
+The various `cylc` commands act on the most recent `runN` by default.
 ### 10. UPDATEME Run the workflow
@@ -222,7 +229,7 @@ The workflow runs a daemon on the `workflow1` server (via `ssh`, so you see the
 The workflow will run and shutdown when all tasks are complete. If tasks fail, the workflow may stall, in which case
 it will shutdown in error after a period of time.
-Cylc has two workflow viewing interfaces (full GUI and text UI), and a variety of CLI commands that can expose workflow
+`cylc` has two workflow viewing interfaces (full GUI and text UI), and a variety of CLI commands that can expose workflow
 and task information. The text-based GUI can be launched via:
 ```
 cylc tui EXPNAME
@@ -232,15 +239,14 @@ The full GUI can be launched on jhan or jhanbigmem (an107 or an201).
 ```
 cylc gui --ip=`hostname -f` --port=`jhp 1` --no-browser
 ```
-
 Then, navigate to one of the two links printed to screen in your web browser
 ### 12. UPDATEME Inspect workflow progress with a terminal CLI
-Various other cylc commands are useful for inspecting a running workflow. Try "cylc help".
+Various other `cylc` commands are useful for inspecting a running workflow. Try `cylc help`.
-- cylc scan: Lists running workflows
-- cylc workflow-state EXPNAME: Lists all task states
-- cylc cat-log EXPNAME: Show the scheduler log
-- cylc list EXPNAME: Lists all tasks
-- cylc report-timings EXPNAME
+- `cylc scan` Lists running workflows
+- `cylc workflow-state EXPNAME` Lists all task states
+- `cylc cat-log EXPNAME` Show the scheduler log
+- `cylc list EXPNAME` Lists all tasks
+- `cylc report-timings EXPNAME`

From 0ee0dc24cce66e468b1e7e868d73b3196ac5d55a Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 31 Jul 2024 14:36:36 -0400
Subject: [PATCH 07/15] Update README.md

intermediate commit
---
 README.md | 54 +++++++++++++++++++++++++++---------------------------
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/README.md b/README.md
index 0d994f1..bca9e95 100644
--- a/README.md
+++ b/README.md
@@ -2,48 +2,30 @@
 based on: https://gitlab.gfdl.noaa.gov/fre2/workflows/postprocessing/-/raw/d51df76e537222a3c78405b5749fe59306e6d2bd/README.md
 -->
-
 # Instructions to postprocess FMS history files on GFDL's PP/AN
+These instructions are targeted at workflow developers. If you are a user simply looking to run a specific workflow, see
+(`fre-cli`)[https://github.com/NOAA-GFDL/fre-cli]
 ### 1. Clone `fre-workflows` repository
 ```
 git clone https://github.com/NOAA-GFDL/fre-workflows.git
 cd fre-workflows
 ```
-**Do not clone to a temporary directory** - the directory in question needs to be available for slurm to read from all nodes,
-and local `/vftmp` is not. `/home`, `/work`, and `/xtmp` are.
+**Do not clone to a temporary directory** - the directory must be readable by slurm from all nodes. Directories on local
+`\vftmp` are not, while those on `/home`, `/work`, and `/xtmp` are.
-### 2. Load Cylc, the backend workflow engine used by Canopy
+### 2. Load Cylc, the backend workflow engine used by FRE2
 ```
 module load cylc
 ```
 [`cylc`](https://cylc.github.io/cylc-doc/stable/html/) lets us parse workflow template files (`*.cylc`) and their
-configurations into modular, interdependent batch jobs. Tools used by those jobs (e.g. `fre-nctools` or `xarray`) should be
-loaded by those jobs as part of their requirements and do not need to be loaded at this time unless desired.
-
+configurations into modular, interdependent batch jobs. Tools used by those jobs (e.g. `fre-nctools` or `xarray`) should
+be loaded by those jobs as part of their requirements and do not need to be loaded at this time unless desired.
-### 3. UPDATEME Create new configuration from empty template, where EXPNAME is the name of your new configuration/experiment
-this step should be updated i think
-```
-cp opt/TEMPLATE.conf opt/rose-suite-EXPNAME.conf
-```
-
-### 4. UPDATEMEAdd required configuration items, led by schema prompting
-while we're still slightly dependent on rose
-```
-rose macro --validate
-```
-
-
-### 5. UPDATEME Add configuration items to rose-suite.conf or opt/rose-suite-EXPNAME.conf.
-```
-vi rose-suite.conf # Configuration for all experiments
-vi opt/rose-suite-EXPNAME.conf # Configuration for EXPNAME; can override default settings
-```
-Continue to add/uncomment required configuration items, until there are no schema violations.
-Use double-quotes in the values!
+### 3. Configure your workflow using available fields
+With your favorite text editor, open up `rose-suite.conf` and set variables to desired values.
 Key values include:
 - `HISTORY_DIR` directory path to your raw model output
@@ -60,6 +42,24 @@ Other currently required values include:
 - `SITE` set to "ppan" to submit jobs to PP/AN cluster
+### 4. Validate your workflow configuration
+When you are ready, you can have rose validate your configuration to catch common problems:
+```
+rose macro --validate
+```
+If there are any errors, try to address them. Common errors include non-existent directories and time intervals that
+do not follow ISO8601 specifications. Iterate between editing your configuration and validating with rose until
+all complaints are addressed.
+
+
+
+
+
+
+
+
+
+
 ### 6. UPDATEME Configure your postprocessing components
 A postprocessed component, originally defined in FRE Bronx and used in Canopy, is a user-defined label that has two main

From 522dc3f806a45883a011ec029130c5f9bb5c57ef Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 31 Jul 2024 16:13:22 -0400
Subject: [PATCH 08/15] Update README.md

another intermediate commit for sanity
---
 README.md | 223 +++++++++++++++++++++++++++++------------------------
 1 file changed, 121 insertions(+), 102 deletions(-)

diff --git a/README.md b/README.md
index bca9e95..ec8366c 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +19 @@
+This repository holds code for defining tasks, applications, tools, workflows, and other aspects of the FRE2 postprocessing
+ecosystem.
+
+
+
 # Instructions to postprocess FMS history files on GFDL's PP/AN
 These instructions are targeted at workflow developers. If you are a user simply looking to run a specific workflow, see
-(`fre-cli`)[https://github.com/NOAA-GFDL/fre-cli]
+[`fre-cli`](https://github.com/NOAA-GFDL/fre-cli).
+
+
+
 ### 1. Clone `fre-workflows` repository
 ```
 git clone https://github.com/NOAA-GFDL/fre-workflows.git
@@ -15,6 +23,8 @@ cd fre-workflows
 `\vftmp` are not, while those on `/home`, `/work`, and `/xtmp` are.
+
+
 ### 2. Load Cylc, the backend workflow engine used by FRE2
 ```
 module load cylc
@@ -24,98 +34,61 @@ configurations into modular, interdependent batch jobs. Tools used by those jobs
 be loaded by those jobs as part of their requirements and do not need to be loaded at this time unless desired.
-### 3. Configure your workflow using available fields
-With your favorite text editor, open up `rose-suite.conf` and set variables to desired values.
+
+
+
+### 3. Fill in rose-suite configuration fields
+With your favorite text editor, open up `rose-suite.conf` and set variables to desired values. These values will be passed to
+task definitions within `flow.cylc` and taken as configuration settings to instruct task execution.
 Key values include:
+- `SITE` set to "ppan" to submit jobs to PP/AN cluster
 - `HISTORY_DIR` directory path to your raw model output
 - `HISTORY_SEGMENT` duration of each history segment (ISO8601)
 - `PP_CHUNK_A` duration of your desired timeseries (and timeaverages, optionally)
-- `PP_COMPONENTS` string-separated list of user-defined components
 - `PP_START` start of the desired postprocssing (ISO8601)
 - `PP_STOP` end of the desired postprocessing (ISO8601)
-
-Other currently required values include:
-- `DEFAULT_XY_INTERP` e.g. `"288,180"`. This is the default regridded grid.
-- `FRE_ANALYSIS_HOME` For locating shared analysis scripts. (Should not be required unless `DO_ANALYSIS`, however)
+- `PP_COMPONENTS` space-separated list of user-defined components
 - `PP_GRID_SPEC` filepath to FMS grid definition tarfile
-- `SITE` set to "ppan" to submit jobs to PP/AN cluster
-
-
-### 4. Validate your workflow configuration
-When you are ready, you can have rose validate your configuration to catch common problems:
-```
-rose macro --validate
-```
-If there are any errors, try to address them. Common errors include non-existent directories and time intervals that
-do not follow ISO8601 specifications. Iterate between editing your configuration and validating with rose until
-all complaints are addressed.
-
-
-
-
-
+- `DEFAULT_XY_INTERP` e.g. `"288,180"`, default target resolution for regridded data.
+- `FRE_ANALYSIS_HOME` For locating shared analysis scripts (only define if `DO_ANALYSIS=True`)
+It is common to not know exactly what to set `PP_COMPONENTS` to when configuring a new workflow from scratch. Later
+steps in this guide can help inform how to adjust these settings.
+The Rose configuation file format is described (here)[https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html]
-
-
-### 6. UPDATEME Configure your postprocessing components
-
-A postprocessed component, originally defined in FRE Bronx and used in Canopy, is a user-defined label that has two main
-qualities:
-- a single target horizontal grid: i.e. native atmosphere; native ocean; or regridded spherical (lat/lon)
-- history files that should be included in the component
-
-FMS history files are limited to a single time dimension, so commonly, multiple history files are mapped to
-a single postprocess component. In the following examples, we will create an `atmos` component as a 1x1-degree regridded
-grid composed of the history files `atmos_month` and `atmos_daily`; a `land` component is regridded to a reduced 2-degree
-grid, and should contain `land_month`, `land_daily`, and `land_static`. Finally, a `atmos_scalar` component should be left
-on the native grid, and should contain `atmos_scalar` and `atmos_global_cmip`.
-
-The steps for postprocess component configuration are:
-1. Set the PP components you wish to process in the `PP_COMPONENTS` in the rose-suite file(s) described above.
-2. Define the history file mapping in `app/remap-pp-components/rose-app.conf` and the regridding details in
-`app/regrid-xy/rose-app.conf`.
-3. Use `rose macro --validate` throughout for configuration validation.
-
-For example, to postprocess the 3 components `atmos`, `land`, and `atmos_scalar`, set in your `opt/rose-suite-LABEL.conf` file:
-
+
+### 4. Create history file manifest (optional but highly recommended)
+For more complete validation of workflow settings, we create a manifest for our history file archives with
 ```
-PP_COMPONENTS="atmos land atmos_scalar"
+tar -tf /path/to/history/YYYYMMDD.nc.tar | grep -v "tile[2-6]" | sort > history-manifest
 ```
-Then, let `rose macro --validate` advise your edits. When the validation errors go away, your configuration is valid and
-consistent. After setting the `PP_COMPONENTS` above, the configuration validation will ensure configuration consistency and
-completeness.
+The `history-manifest` contains a list of source files contained within the targeted history files. This can be
+helpful for validating settings on a component-by-component basis in the next step.
-```
-rose macro --validate
-[V] components.ComponentChecker: issues: 3
- (opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
- Requested component 'atmos' is not defined in /nbhome/c2b/git/fre2/workflows/postprocessing/meta/lib/python/macros/../../../../app/remap-pp-components/rose-app.conf
- (opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
- Requested component 'land' is not defined in /nbhome/c2b/git/fre2/workflows/postprocessing/meta/lib/python/macros/../../../../app/remap-pp-components/rose-app.conf
- (opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
- Requested component 'atmos_scalar' is not defined in /nbhome/c2b/git/fre2/workflows/postprocessing/meta/lib/python/macros/../../../../app/remap-pp-components/rose-app.conf
-```
-To create your desired history file to postprocess component remapping, it's helpful (i.e. until history file manifests exist) to
-list the contents of the history tarfile in order to create your postprocessing configuration.
+
+### 5. Define your desired postprocessing components for `remap-pp-components`
+Users define their own postprocessing components for their workflow, which represent a group of source files (listed in your
+`history-manifest`) to be post-processed together. This grouping is typically united by a common gridding, which may be the
+current "native" gridding of the source files, or a desired target gridding to achieve via regridding.
-```
-tar -tf /path/to/history/YYYYMMDD.nc.tar | grep -v "tile[2-6]" | sort
-```
-
-Each history file reported above may be included in your postprocess components. For example, here is an
-`app/remap-pp-components/rose-app.conf` file:
+User-defined components are configured within `app/remap-pp-components/rose-app.conf`. A possible set of components for example,
+could be:
 ```
 [atmos]
 sources=atmos_month
  atmos_daily
 grid=regrid-xy/default
+[atmos_scalar]
+sources=atmos_scalar
+ atmos_global_cmip
+grid=native
+
 [land]
 sources=land_month_cmip
  land_daily_cmip
@@ -126,38 +99,29 @@ sources=land_static
 grid=regrid-xy/default
 freq=P0Y
-[atmos_scalar]
-sources=atmos_scalar
- atmos_global_cmip
-grid=native
+
 ```
-Explanation / discussion:
-- The Rose configuation file format is described (here)[https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html]
-- The header sections identify the PP components. PP components may not contain periods! Any text after a period is optional, and merely serves to allow another section for remapping. (e.g. `land` and `land.static` identify the `land` component)
-- - The `grid` attribute should be either "native", "regrid-xy/default", or "regrid-xy/LABEL". The regrid-xy label (default or user-defined) are defined in the `app/regrid-xy/rose-app.conf` file, described next.
-- The `freq` attribute has special meaning and is needed for static processing. If `freq` is set to `0PY`, only the static
-variables will be remapped. If `freq` is unset, then all temporal frequencies are included.
+Above, we've defined the `atmos` component as being a set of two source files, `atmos_month` and `atmos_daily`. The `grid`
+field shows we wish to have these two source files regridded to the default resolution specified in `rose-suite.conf`. By
+contrast, the `atmos_scalar` component specifies a `native` grid, indicating that `atmos_scalar` and `atmos_global_cmip`
+source files will not be regridded when processing them for the `atmos_scalar` component.
-After adding your entries to `app/remap-pp-components/rose-app.conf`, run `rose macro --validate` again:
+Note- it is not uncommon for a specific component to be named after a source file contained in it's `sources` field,
+but it does not imply anything special about the relationship between the source file and the component.
-```
-rose macro --validate
+The `land.static` component defined above gets interpreted as part of the `land` component, as any text after a `.` is
+ignored in component names. Here, we want `land` to contain `land_month`, `land_daily`, and `land_static`, and we want all
+all of them to be regridded to a 2-degree resolution gridding. The `freq=P0Y` field implies that `land_static` as a source
+file is time-independent, and will only be processed if `DO_STATICS=True` in `rose-suite.conf`.
-(opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
-Requested component 'atmos' uses history file 'atmos_daily' with regridding label 'regrid-xy/default', but this was not found in app/regrid-xy/rose-app.conf
-(opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
-Requested component 'atmos' uses history file 'atmos_month' with regridding label 'regrid-xy/default', but this was not found in app/regrid-xy/rose-app.conf
-(opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
-Requested component 'land' uses history file 'land_daily_cmip' with regridding label 'regrid-xy/2deg', but this was not found in app/regrid-xy/rose-app.conf
-(opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
-Requested component 'land' uses history file 'land_month_cmip' with regridding label 'regrid-xy/2deg', but this was not found in app/regrid-xy/rose-app.conf
-(opts=TEMPLATE)template variables=PP_COMPONENTS=atmos land atmos_scalar
-Requested component 'land' uses history file 'land_static' with regridding label 'regrid-xy/2deg', but this was not found in app/regrid-xy/rose-app.conf
-```
+The setting for `PP_COMPONENTS` should reflect information in `app/remap-pp-components/rose-app.conf`. From our example, a
+good list that passes validation would be `PP_COMPONENTS=atmos atmos_scalar land`.
-Now add corresponding regridding instructions for the regridding labels. This can be added to `app/regrid-xy/rose-app.conf`:
+
+### 6. Provide more specifics for `regrid-xy`
+Add more-specific regridding instructions to `app/regrid-xy/rose-app.conf`:
 ```
 [atmos]
 inputGrid=cubedsphere
@@ -179,8 +143,10 @@ sources=land_month_cmip
  land_static
 ```
-Explanation / discussion:
-- The header sections identify the regridding instructions for a list of history files, and do not have meaning other than being unique.
+Note that the `atmos_scalar` component does not have an entry here, as we requested a `native` regridding for source files in
+that component.
+
+
 - The `inputGrid` attribute should be `cubedsphere` or `tripolar`.
 - The `inputRealm` attribute is used for identifying the land or atmos grid mosaic file: should be `atmos`, `land`, or `ocean`.
 - The `interpMethod` should be `conserve_order1`, `conserve_order2`, or `bilinear`.
@@ -188,6 +154,7 @@ Explanation / discussion:
 - If `OutputGridType` is `default`, then the `DEFAULT_XY_INTERP` setting is used. Otherwise, `OutputGridLat` and `OutputGridLon` identify the target grid.
+
 ### 7. UPDATEME Optionally, report on history files that may be missing
 Generate a "history manifest" file by listing the contents of a history tarfile to a file called 'history-manifest'.
 ```
@@ -200,31 +167,70 @@ Probably, you should remove components that specify non-existent history files,
 or trust that the missing history files will be created by a refineDiag script.
-### 8. UPDATEME Validate the configuration
-`rose macro --validate` should report no errors.
-Then, validate the Cylc configuration:
-`bin/validate-exp EXPNAME`
+
+
+### 4. Validate your workflow configuration
+When you are ready, you can have rose validate your configuration to catch common problems:
+```
+rose macro --validate
+```
+If there are any errors, try to address them. Common errors include non-existent directories and time intervals that
+do not follow ISO8601 specifications. Iterate between editing your configuration and validating with rose until
+all complaints are addressed.
+
+
+
+
+
+
+
+
+
+
+
+
+### 8. UPDATEME Validate/Install/Run the configured workflow templates with `cylc`
+Validate the workflow configuation with `cylc validate .`
 Please complain (to a Canopy developer) or take a note if if the Cylc validation fails but the Rose validation passes,
 as this may expose some internal problems or quoting issues.
+
+
 ### 9. UPDATEME Install the workflow
 ```
-bin/install-exp EXPNAME
+cylc install .
 ```
 This installs the workflow run directory in `~/cylc-run//runN`, where `N` is an incrementing number (like `FRE --unique`).
 The various `cylc` commands act on the most recent `runN` by default.
+
 ### 10. UPDATEME Run the workflow
 ```
-cylc play EXPNAME
+cylc play .
 ```
 The workflow runs a daemon on the `workflow1` server (via `ssh`, so you see the login banner).
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
 ### 11. UPDATEME Inspect workflow progress with an interface (GUI or TUI)
 The workflow will run and shutdown when all tasks are complete. If tasks fail, the workflow may stall, in which case it will shutdown in error after a period of time.
@@ -242,6 +248,7 @@ cylc gui --ip=`hostname -f` --port=`jhp 1` --no-browser
 Then, navigate to one of the two links printed to screen in your web browser
+
 ### 12. UPDATEME Inspect workflow progress with a terminal CLI
 Various other `cylc` commands are useful for inspecting a running workflow. Try `cylc help`.
@@ -250,3 +257,15 @@ Various other `cylc` commands are useful for inspecting a running workflow. Try
 - `cylc cat-log EXPNAME` Show the scheduler log
 - `cylc list EXPNAME` Lists all tasks
 - `cylc report-timings EXPNAME`
+
+
+
+
+
+
+
+If `history-manifest` exists, `rose macro --validate` will check for source files referenced within `PP_COMPONENTS` but
+not present in the history files.
+
+It's recommended to remove components that specify non-existent history files, reconfigure the component definition,
+or trust that the missing history files will be created by a refineDiag script.

From 7af7e60a1093ffd51427ef089ac7e1948f8e75fd Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 31 Jul 2024 16:31:52 -0400
Subject: [PATCH 09/15] Update README.md

getting there... should be good for trying to use this tomorrow and conduct tweaks/fix issues with my wording/descriptions etc.
---
 README.md | 86 +++++++++++++++++++++---------------------------------
 1 file changed, 33 insertions(+), 53 deletions(-)

diff --git a/README.md b/README.md
index ec8366c..9301bd9 100644
--- a/README.md
+++ b/README.md
@@ -14,7 +14,7 @@ These instructions are targeted at workflow developers. If you are a user simply
-### 1. Clone `fre-workflows` repository
+## 1. Clone `fre-workflows` repository
 ```
 git clone https://github.com/NOAA-GFDL/fre-workflows.git
 cd fre-workflows
@@ -25,7 +25,7 @@ cd fre-workflows
-### 2. Load Cylc, the backend workflow engine used by FRE2
+## 2. Load Cylc, the backend workflow engine used by FRE2
 ```
 module load cylc
 ```
@@ -35,9 +35,8 @@ be loaded by those jobs as part of their requirements and do not need to be load
-
-### 3. Fill in rose-suite configuration fields
+## 3. Fill in rose-suite configuration fields
 With your favorite text editor, open up `rose-suite.conf` and set variables to desired values. These values will be passed to
 task definitions within `flow.cylc` and taken as configuration settings to instruct task execution.
@@ -56,7 +55,9 @@ Key values include:
 It is common to not know exactly what to set `PP_COMPONENTS` to when configuring a new workflow from scratch. Later
 steps in this guide can help inform how to adjust these settings.
-The Rose configuation file format is described (here)[https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html]
+The Rose configuation file format is described [here](https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html)
+
+
 ### 4. Create history file manifest (optional but highly recommended)
@@ -71,7 +72,7 @@ helpful for validating settings on a component-by-component basis in the next st
-### 5. Define your desired postprocessing components for `remap-pp-components`
+## 5. Define your desired postprocessing components for `remap-pp-components`
 Users define their own postprocessing components for their workflow, which represent a group of source files (listed in your
 `history-manifest`) to be post-processed together. This grouping is typically united by a common gridding, which may be the
 current "native" gridding of the source files, or a desired target gridding to achieve via regridding.
@@ -98,8 +99,6 @@ grid=regrid-xy/2deg
 sources=land_static
 grid=regrid-xy/default
 freq=P0Y
-
-
 ```
 Above, we've defined the `atmos` component as being a set of two source files, `atmos_month` and `atmos_daily`. The `grid`
@@ -120,7 +119,7 @@ good list that passes validation would be `PP_COMPONENTS=atmos atmos_scalar land
-### 6. Provide more specifics for `regrid-xy`
+## 6. Provide more specifics for `regrid-xy`
 Add more-specific regridding instructions to `app/regrid-xy/rose-app.conf`:
 ```
 [atmos]
@@ -146,40 +145,31 @@ sources=land_month_cmip
 Note that the `atmos_scalar` component does not have an entry here, as we requested a `native` regridding for source files in
 that component.
-
+Full documentation on the available input configuration fields is available in the [`app/regrid-xy` directory](https://github.com/NOAA-GFDL/fre-workflows/tree/update.README/app/regrid-xy)
+, but some things worth noting bove
 - The `inputGrid` attribute should be `cubedsphere` or `tripolar`.
-- The `inputRealm` attribute is used for identifying the land or atmos grid mosaic file: should be `atmos`, `land`, or `ocean`.
+- The `inputRealm` attribute is used for identifying the `land`, `atmos`, or `ocean` grid mosaic file.
 - The `interpMethod` should be `conserve_order1`, `conserve_order2`, or `bilinear`.
 - `OutputGridType` is the grid label referenced in the `app/remap-pp-components/rose-app.conf` file.
-- If `OutputGridType` is `default`, then the `DEFAULT_XY_INTERP` setting is used. Otherwise, `OutputGridLat` and `OutputGridLon` identify the target grid.
-
-
-
-### 7. UPDATEME Optionally, report on history files that may be missing
-Generate a "history manifest" file by listing the contents of a history tarfile to a file called 'history-manifest'.
-```
-tar -tf /path/to/history/YYYYMMDD.nc.tar | grep -v "tile[2-6]" | sort > history-manifest
-```
-
-If `history-manifest` exists, `rose macro --validate` will report on history files referenced but not present.
-
-Probably, you should remove components that specify non-existent history files, reconfigure the component definition,
-or trust that the missing history files will be created by a refineDiag script.
+- if `OutputGridType` `default`, then `DEFAULT_XY_INTERP` from `rose-suite.conf` is used.
+- `OutputGridLat` and `OutputGridLon` identify the target grid if `OutputGridType` is not specified
-### 4. Validate your workflow configuration
+## 7. Validate your workflow configuration
 When you are ready, you can have rose validate your configuration to catch common problems:
 ```
 rose macro --validate
 ```
-If there are any errors, try to address them. Common errors include non-existent directories and time intervals that
-do not follow ISO8601 specifications. Iterate between editing your configuration and validating with rose until
-all complaints are addressed.
-
+Common errors include non-existent directories and time intervals that do not follow ISO8601 specifications. One can wait until
+this step to bother validating, or it can be a back/forth iteration between editing and validating until all complaints are
+addressed. If `history-manifest` exists, `rose macro --validate` will report on source files referenced by components that are
+not present in the history tar file archives.
+If a component specifies a non-existent source file, reconfigure the component definition or omit the component from
+post-processing all together. It's also OK to remove the source file specified within that component, but
@@ -190,29 +180,25 @@ all complaints are addressed.
-### 8. UPDATEME Validate/Install/Run the configured workflow templates with `cylc`
-Validate the workflow configuation with `cylc validate .`
-
-Please complain (to a Canopy developer) or take a note if if the Cylc validation fails but the Rose validation passes,
-as this may expose some internal problems or quoting issues.
-
-
+## 8. UPDATEME Validate/Install/Run the configured workflow templates with `cylc`
+Validate the workflow with
+```
+cylc validate .
+```
-
-### 9. UPDATEME Install the workflow
+If the Cylc validation fails but the Rose validation passes, please raise an issue on this repository, as it is better to
+catch configuration issues at the `rose macro validate` step. One then installs the workflow with:
 ```
 cylc install .
 ```
-This installs the workflow run directory in `~/cylc-run//runN`, where `N` is an incrementing number (like `FRE --unique`).
-The various `cylc` commands act on the most recent `runN` by default.
-
-
-### 10. UPDATEME Run the workflow
+This creates a workflow run directory in `~/cylc-run//runN`, where `N` is an integer incremented with each call to
+`install`. Various `cylc` commands act on the most recent `runN` by default if a run is not specified. After successful
+installation, the workflow is launched with:
 ```
 cylc play .
 ```
-The workflow runs a daemon on the `workflow1` server (via `ssh`, so you see the login banner).
+If on PP/AN, cylc launches a daemon on the `workflow1` server, via `ssh`, triggering the login banner to be printed.
@@ -231,7 +217,7 @@ The workflow runs a daemon on the `workflow1` server (via `ssh`, so you see the
-### 11. UPDATEME Inspect workflow progress with an interface (GUI or TUI)
+## 11. UPDATEME Inspect workflow progress with an interface (GUI or TUI)
 The workflow will run and shutdown when all tasks are complete. If tasks fail, the workflow may stall, in which case it will shutdown in error after a period of time.
@@ -249,7 +235,7 @@ Then, navigate to one of the two links printed to screen in your web browser
-### 12. UPDATEME Inspect workflow progress with a terminal CLI
+## 12. UPDATEME Inspect workflow progress with a terminal CLI
 Various other `cylc` commands are useful for inspecting a running workflow. Try `cylc help`.
 - `cylc scan` Lists running workflows
@@ -263,9 +249,3 @@ Various other `cylc` commands are useful for inspecting a running workflow. Try
-
-If `history-manifest` exists, `rose macro --validate` will check for source files referenced within `PP_COMPONENTS` but
-not present in the history files.
-
-It's recommended to remove components that specify non-existent history files, reconfigure the component definition,
-or trust that the missing history files will be created by a refineDiag script.

From 859bdf9165fea75d8d56b8c5ef09b9cd9ce6d381 Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 31 Jul 2024 17:09:26 -0400
Subject: [PATCH 10/15] Update README.md

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 9301bd9..521d643 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@ based on: https://gitlab.gfdl.noaa.gov/fre2/workflows/postprocessing/-/raw/d51df
 This repository holds code for defining tasks, applications, tools, workflows, and other aspects of the FRE2 postprocessing
 ecosystem.
-
+# UNDER CONSTRUCTION
 # Instructions to postprocess FMS history files on GFDL's PP/AN
@@ -217,7 +217,7 @@ If on PP/AN, cylc launches a daemon on the `workflow1` server, via `ssh`, trigge
-## 11. UPDATEME Inspect workflow progress with an interface (GUI or TUI)
+## 9. UPDATEME Inspect workflow progress with an interface (GUI or TUI)
 The workflow will run and shutdown when all tasks are complete. If tasks fail, the workflow may stall, in which case it will shutdown in error after a period of time.
@@ -235,7 +235,7 @@ Then, navigate to one of the two links printed to screen in your web browser
-## 12. UPDATEME Inspect workflow progress with a terminal CLI
+## 10. UPDATEME Inspect workflow progress with a terminal CLI
 Various other `cylc` commands are useful for inspecting a running workflow. Try `cylc help`.
 - `cylc scan` Lists running workflows

From 0f56c3262efa810b76405a57f40549571eaff140 Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 31 Jul 2024 17:10:29 -0400
Subject: [PATCH 11/15] Update README.md

fix subsection
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 521d643..6e50248 100644
--- a/README.md
+++ b/README.md
@@ -60,7 +60,7 @@ The Rose configuation file format is described [here](https://metomi.github.io/r
-### 4. Create history file manifest (optional but highly recommended)
+## 4. Create history file manifest (optional but highly recommended)
 For more complete validation of workflow settings, we create a manifest for our history file archives with
 ```
 tar -tf /path/to/history/YYYYMMDD.nc.tar | grep -v "tile[2-6]" | sort > history-manifest

From 363ff88b460a6e366311a6335263aebb1d4f77b8 Mon Sep 17 00:00:00 2001
From: Ian
Date: Thu, 1 Aug 2024 12:35:19 -0400
Subject: [PATCH 12/15] Update README.md

another pass
---
 README.md | 106 +++++++++++++++++++++++++++--------------------------
 1 file changed, 53 insertions(+), 53 deletions(-)

diff --git a/README.md b/README.md
index 6e50248..8195739 100644
--- a/README.md
+++ b/README.md
@@ -43,19 +43,19 @@ task definitions within `flow.cylc` and taken as configuration settings to instr
 Key values include:
 - `SITE` set to "ppan" to submit jobs to PP/AN cluster
 - `HISTORY_DIR` directory path to your raw model output
-- `HISTORY_SEGMENT` duration of each history segment (ISO8601)
-- `PP_CHUNK_A` duration of your desired timeseries (and timeaverages, optionally)
-- `PP_START` start of the desired postprocssing (ISO8601)
-- `PP_STOP` end of the desired postprocessing (ISO8601)
-- `PP_COMPONENTS` space-separated list of user-defined components
-- `PP_GRID_SPEC` filepath to FMS grid definition tarfile
-- `DEFAULT_XY_INTERP` e.g. `"288,180"`, default target resolution for regridded data.
+- `HISTORY_SEGMENT` amount of time covered by a single history file (ISO8601 datetime)
+- `PP_CHUNK_A` amount of time covered by a single postprocessed file (ISO8601 datetime)
+- `PP_START` start of the desired postprocessing (ISO8601 datetime)
+- `PP_STOP` end of the desired postprocessing (ISO8601 datetime)
+- `PP_COMPONENTS` space-separated list of user-defined components, discussed in more detail below
+- `PP_GRID_SPEC` path to FMS grid definition tarfile
+- `PP_DEFAULT_XYINTERP` default target resolution for regridding, if the resolution is not specified elsewhere
 - `FRE_ANALYSIS_HOME` For locating shared analysis scripts (only define if `DO_ANALYSIS=True`)
 It is common to not know exactly what to set `PP_COMPONENTS` to when configuring a new workflow from scratch. Later
 steps in this guide can help inform how to adjust these settings.
-The Rose configuation file format is described [here](https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html)
+The Rose configuation file format is described in full [elsewhere](https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html)
@@ -67,19 +67,22 @@ tar -tf /path/to/history/YYYYMMDD.nc.tar | grep -v "tile[2-6]" | sort > history-
 ```
 The `history-manifest` contains a list of source files contained within the targeted history files. This can be
-helpful for validating settings on a component-by-component basis in the next step.
+helpful for validating settings on a component-by-component basis in the next step(s).
 ## 5. Define your desired postprocessing components for `remap-pp-components`
-Users define their own postprocessing components for their workflow, which represent a group of source files (listed in your
-`history-manifest`) to be post-processed together. This grouping is typically united by a common gridding, which may be the
-current "native" gridding of the source files, or a desired target gridding to achieve via regridding.
+Users define their own postprocessing components for their workflow, which represent a group of source files to be postprocessed
+together. This grouping is typically united by a common gridding, which may be the current "native" gridding of the source files,
+or a desired target gridding to achieve via regridding. If `history-manifest` was created, it will be used to check that the
+source files specified in the components are actually present in the history files.
-User-defined components are configured within `app/remap-pp-components/rose-app.conf`. A possible set of components for example,
-could be:
+User-defined components are configured within `app/remap-pp-components/rose-app.conf`. An example set of components could be:
 ```
+[command]
+default=remap-pp-components
+
 [atmos]
 sources=atmos_month
  atmos_daily
@@ -101,7 +104,7 @@ grid=regrid-xy/default
 freq=P0Y
 ```
-Above, we've defined the `atmos` component as being a set of two source files, `atmos_month` and `atmos_daily`. The `grid`
+Here we've defined the `atmos` component as being a set of two source files, `atmos_month` and `atmos_daily`. The `grid`
 field shows we wish to have these two source files regridded to the default resolution specified in `rose-suite.conf`. By
 contrast, the `atmos_scalar` component specifies a `native` grid, indicating that `atmos_scalar` and `atmos_global_cmip`
 source files will not be regridded when processing them for the `atmos_scalar` component.
@@ -110,7 +113,7 @@ Note- it is not uncommon for a specific component to be named after a source fil
 but it does not imply anything special about the relationship between the source file and the component.
 The `land.static` component defined above gets interpreted as part of the `land` component, as any text after a `.` is
-ignored in component names. Here, we want `land` to contain `land_month`, `land_daily`, and `land_static`, and we want all
+ignored in component names. Thus, the `land` component contains `land_month`, `land_daily`, and `land_static`, and we want
 all of them to be regridded to a 2-degree resolution gridding. The `freq=P0Y` field implies that `land_static` as a source
 file is time-independent, and will only be processed if `DO_STATICS=True` in `rose-suite.conf`.
@@ -120,8 +123,16 @@ good list that passes validation would be `PP_COMPONENTS=atmos atmos_scalar land
 ## 6. Provide more specifics for `regrid-xy`
-Add more-specific regridding instructions to `app/regrid-xy/rose-app.conf`:
+Any component specified in `app/remap-pp-components/rose-app.conf` requesting regridding requires a corresponding entry in
+`app/regrid-xy/rose-app.conf`, providing further information. A full set of options one can specify in this configuration
+can be found in `app/regrid-xy/README.md`.
+
+Following up on our example in the previous step, we would not have an extry in `app/regrid-xy/rose-app.conf` for
+`atmos_scalar`, but we will for `atmos` and `land` components:
 ```
+[command]
+default=regrid-xy
+
 [atmos]
 inputGrid=cubedsphere
 inputRealm=atmos
@@ -141,17 +152,14 @@ sources=land_month_cmip
  land_daily_cmip
  land_static
 ```
-
 Note that the `atmos_scalar` component does not have an entry here, as we requested a `native` regridding for source files in
-that component.
-
-Full documentation on the available input configuration fields is available in the [`app/regrid-xy` directory](https://github.com/NOAA-GFDL/fre-workflows/tree/update.README/app/regrid-xy)
-, but some things worth noting bove
-- The `inputGrid` attribute should be `cubedsphere` or `tripolar`.
-- The `inputRealm` attribute is used for identifying the `land`, `atmos`, or `ocean` grid mosaic file.
+that component. Full documentation on the available input configuration fields is available in the
+[`app/regrid-xy` directory](https://github.com/NOAA-GFDL/fre-workflows/tree/update.README/app/regrid-xy), but some things worth
+noting above:
+- `inputGrid` can be `cubedsphere` or `tripolar`.
+- `inputRealm` attribute is used for identifying the `land`, `atmos`, or `ocean` grid mosaic file.
 - The `interpMethod` should be `conserve_order1`, `conserve_order2`, or `bilinear`.
 - `OutputGridType` is the grid label referenced in the `app/remap-pp-components/rose-app.conf` file.
-- if `OutputGridType` `default`, then `DEFAULT_XY_INTERP` from `rose-suite.conf` is used.
 - `OutputGridLat` and `OutputGridLon` identify the target grid if `OutputGridType` is not specified
@@ -159,18 +167,19 @@ Full documentation on the available input configuration fields is available in t
 ## 7. Validate your workflow configuration
-When you are ready, you can have rose validate your configuration to catch common problems:
+Rose can validate the configuration by checking the field values against a list of rules defined by the devlopers of this
+repository. It's crucial to note that while this list of rules is determined by the requirements of th
+
+One can wait until this step in this guide, or validate as they
+go along at any point in the previous instructions
 ```
 rose macro --validate
 ```
-Common errors include non-existent directories and time intervals that do not follow ISO8601 specifications. One can wait until
-this step to bother validating, or it can be a back/forth iteration between editing and validating until all complaints are
-addressed. If `history-manifest` exists, `rose macro --validate` will report on source files referenced by components that are
-not present in the history tar file archives.
-
-If a component specifies a non-existent source file, reconfigure the component definition or omit the component from
-post-processing all together.
It's also OK to remove the source file specified within that component, but - +Common errors include non-existent directories and time intervals that are not ISO8601 datetimes. It is recommended to address +any/all complaints. If `history-manifest` exists, `rose macro --validate` will report on source files referenced by components +that are not present in the history tar file archives. Whether a missing file is a show-stopper or a toothless complaint is +at the complete discretion of the user. If a source file is missing, consider reconfiguring the component definition(s), remove +the source file from the component, or simply removing the component altogether. @@ -180,36 +189,27 @@ post-processing all together. It's also OK to remove the source file specified w -## 8. UPDATEME Validate/Install/Run the configured workflow templates with `cylc` -Validate the workflow with +## 8. Validate/Install/Run the configured workflow templates with `cylc` +Validate the workflow with `cylc` by entering: ``` cylc validate . ``` - If the Cylc validation fails but the Rose validation passes, please raise an issue on this repository, as it is better to -catch configuration issues at the `rose macro validate` step. One then installs the workflow with: +catch configuration issues at the `rose macro validate` step, and the validation rules can be updated to match the task +definition requirements. + +We install the workflow with: ``` cylc install . ``` +This creates a workflow directory in `~/cylc-run`. -This creates a workflow run directory in `~/cylc-run//runN`, where `N` is an integer incremented with each call to -`install`. Various `cylc` commands act on the most recent `runN` by default if a run is not specified. After successful -installation, the workflow is launched with: +After successful installation, the workflow is launched with: ``` cylc play . ``` -If on PP/AN, cylc launches a daemon on the `workflow1` server, via `ssh`, triggering the login banner to be printed. 
-
-
-
-
-
-
-
-
-
-
-
+If on PP/AN, cylc launches a scheduler daemon on the `workflow1` server, via `ssh`, triggering the login banner to be printed.
+This daemon submits and runs jobs based on the task dependencies defined in `flow.cylc`.



From 08df08204fd6046ae110a38c7db24cf1d6f25248 Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 7 Aug 2024 13:26:33 -0400
Subject: [PATCH 13/15] Update README.md

another iteration. adding in a few settings... static component maybe should be taken out, the logic seems wrong for the task dependence and i dont want to show that here.
---
 README.md | 134 +++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 92 insertions(+), 42 deletions(-)

diff --git a/README.md b/README.md
index 8195739..2778e65 100644
--- a/README.md
+++ b/README.md
@@ -45,6 +45,7 @@ Key values include:
 - `HISTORY_DIR` directory path to your raw model output
 - `HISTORY_SEGMENT` amount of time covered by a single history file (ISO8601 datetime)
 - `PP_CHUNK_A` amount of time covered by a single postprocessed file (ISO8601 datetime)
+- `PP_CHUNK_B` secondary chunk size for postprocessed files, if desired (ISO8601 datetime)
 - `PP_START` start of the desired postprocessing (ISO8601 datetime)
 - `PP_STOP` end of the desired postprocessing (ISO8601 datetime)
 - `PP_COMPONENTS` space-separated list of user-defined components, discussed in more detail below
@@ -53,9 +54,50 @@ Key values include:
 - `FRE_ANALYSIS_HOME` For locating shared analysis scripts (only define if `DO_ANALYSIS=True`)

 It is common to not know exactly what to set `PP_COMPONENTS` to when configuring a new workflow from scratch. Later
-steps in this guide can help inform how to adjust these settings.
+steps in this guide can help inform how to adjust these settings. The Rose configuation file format is described in full
+[elsewhere](https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html)

-The Rose configuation file format is described in full [elsewhere](https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html)
+If one is looking to hit the ground running at GFDL's PP/AN, copy/paste the code block here into your `rose-suite.conf`:
+```
+SITE="ppan"
+EXPERIMENT='FOO'
+PLATFORM='BAR'
+TARGET='BAZ'
+
+DO_STATICS=True
+DO_TIMEAVGS=True
+DO_ATMOS_PLEVEL_MASKING=True
+DO_MDTF=False
+
+DO_REFINEDIAG=False
+REFINEDIAG_SCRIPTS="\$CYLC_WORKFLOW_RUN_DIR/etc/refineDiag/refineDiag_atmos_cmip6.csh"
+
+DO_PREANALYSIS=False
+PREANALYSIS_SCRIPT="\$CYLC_WORKFLOW_RUN_DIR/etc/refineDiag/refineDiag_data_stager_globalAve.csh"
+
+DO_ANALYSIS=False
+DO_ANALYSIS_ONLY=False
+FRE_ANALYSIS_HOME="/home/fms/local/opt/fre-analysis/test"
+ANALYSIS_DIR='/nbhome/YOUR.USERNAME/fre/FMS2023.04_om5_20240410/ESM4.2JpiC_om5b04r1'
+
+CLEAN_WORK=True
+
+PTMP_DIR='/xtmp/$USER/ptmp'
+
+HISTORY_DIR='/archive/Niki.Zadeh/fre/FMS2023.04_om5_20240410/ESM4.2JpiC_om5b04r1/gfdl.ncrc5-intel23-prod-openmp/history'
+HISTORY_DIR='/archive/Ian.Laflotte/fre/FMS2023.04_om5_20240410/ESM4.2JpiC_om5b04r1/gfdl.ncrc5-intel23-prod-openmp/history'
+HISTORY_SEGMENT='P1Y'
+
+PP_DIR='/archive/YOUR.USERNAME/fre/FMS2023.04_om5_20240410/ESM4.2JpiC_om5b04r1/gfdl.ncrc5-intel23-prod-openmp/pp'
+PP_CHUNK_A='P2Y'
+PP_COMPONENTS='atmos atmos_scalar land land_static'
+PP_START="00010101"
+PP_STOP="00040101"
+PP_DEFAULT_XYINTERP="360,180"
+PP_GRID_SPEC='/work/Niki.Zadeh/mosaic_generation/exchange_grid_toolset/workdir/mosaic_c96om5b04v20240410.20240423.an105/mosaic_c96om5b04v20240410.20240423.an105.tar'
+PP_GRID_SPEC='/work/Ian.Laflotte/mosaic_generation/exchange_grid_toolset/workdir/mosaic_c96om5b04v20240410.20240423.an105/mosaic_c96om5b04v20240410.20240423.an105.tar'
+
+```



@@ -85,40 +127,37 @@ default=remap-pp-components

 [atmos]
 sources=atmos_month
-        atmos_daily
 grid=regrid-xy/default

 [atmos_scalar]
-sources=atmos_scalar
-        atmos_global_cmip
+sources=atmos_scalar atmos_global_cmip
 grid=native

 [land]
 sources=land_month_cmip
-        land_daily_cmip
-grid=regrid-xy/2deg
+grid=regrid-xy/288_180.conserve_order1

-[land.static]
+[land_static]
 sources=land_static
-grid=regrid-xy/default
+grid=regrid-xy/288_180.conserve_order1
 freq=P0Y
 ```

-Here we've defined the `atmos` component as being a set of two source files, `atmos_month` and `atmos_daily`. The `grid`
-field shows we wish to have these two source files regridded to the default resolution specified in `rose-suite.conf`. By
-contrast, the `atmos_scalar` component specifies a `native` grid, indicating that `atmos_scalar` and `atmos_global_cmip`
+Here we've defined the `atmos` component as being a set of one source file, `atmos_month`. The `grid` field shows we wish to
+have these two source files regridded to the default resolution specified in `rose-suite.conf`. By contrast, the `atmos_scalar`
+component contains two source files, and specifies a `native` grid. This indicates that `atmos_scalar` and `atmos_global_cmip`
 source files will not be regridded when processing them for the `atmos_scalar` component.

-Note- it is not uncommon for a specific component to be named after a source file contained in it's `sources` field,
-but it does not imply anything special about the relationship between the source file and the component.
+Note- it is not uncommon for a specific component to be named after a source file contained in it's `sources` field, but it
+does not imply anything special about the relationship between the source file and the component.

-The `land.static` component defined above gets interpreted as part of the `land` component, as any text after a `.` is
-ignored in component names. Thus, the `land` component contains `land_month`, `land_daily`, and `land_static`, and we want
-all of them to be regridded to a 2-degree resolution gridding. The `freq=P0Y` field implies that `land_static` as a source
-file is time-independent, and will only be processed if `DO_STATICS=True` in `rose-suite.conf`.
+The third component is `land`, and will be regridded to a resolution corresponding to a 180x288 lat/lon grid, using an
+interpolation scheme which is conservative to a first-order approximation. The last is the `land_static` component, and will
+be similarly handled to `land`. Since `land_static` is time-independent, it will require `freq=P0Y`, as the name is not used
+to determine if a component involves static data. Statics will only be processed if `DO_STATICS=True` in `rose-suite.conf`.

 The setting for `PP_COMPONENTS` should reflect information in `app/remap-pp-components/rose-app.conf`. From our example, a
-good list that passes validation would be `PP_COMPONENTS=atmos atmos_scalar land`.
+good list that passes validation would be `PP_COMPONENTS=atmos atmos_scalar land land_static`.



@@ -128,7 +167,7 @@ Any component specified in `app/remap-pp-components/rose-app.conf` requesting re
 can be found in `app/regrid-xy/README.md`.
 Following up on our example in the previous step, we would not have an extry in `app/regrid-xy/rose-app.conf` for
-`atmos_scalar`, but we will for `atmos` and `land` components:
+`atmos_scalar`, but we will for `atmos`, `land` and `land_static` components:
 ```
 [command]
 default=regrid-xy
@@ -137,20 +176,28 @@ default=regrid-xy
 inputGrid=cubedsphere
 inputRealm=atmos
 interpMethod=conserve_order2
+outputGridLat=180
+outputGridLon=288
 outputGridType=default
 sources=atmos_month
-        atmos_daily

 [land]
 inputGrid=cubedsphere
 inputRealm=land
 interpMethod=conserve_order1
-outputGridLon=144
-outputGridLat=90
-outputGridType=2deg
+outputGridLat=180
+outputGridLon=288
+outputGridType=288_180.conserve_order1
 sources=land_month_cmip
-        land_daily_cmip
-        land_static
+
+[land_static]
+inputGrid=cubedsphere
+inputRealm=land
+interpMethod=conserve_order1
+outputGridLat=180
+outputGridLon=288
+outputGridType=288_180.conserve_order1
+sources=land_static
 ```
 Note that the `atmos_scalar` component does not have an entry here, as we requested a `native` regridding for source files in
 that component. Full documentation on the available input configuration fields is available in the
@@ -170,16 +217,15 @@ noting above:
 Rose can validate the configuration by checking the field values against a list of rules defined by the devlopers of this
 repository. It's crucial to note that while this list of rules is determined by the requirements of th

-One can wait until this step in this guide, or validate as they
-go along at any point in the previous instructions
+One can wait until this step in this guide, or validate as they go along at any point in the previous instructions
 ```
 rose macro --validate
 ```
 Common errors include non-existent directories and time intervals that are not ISO8601 datetimes. It is recommended to address
 any/all complaints. If `history-manifest` exists, `rose macro --validate` will report on source files referenced by components
 that are not present in the history tar file archives. Whether a missing file is a show-stopper or a toothless complaint is
-at the complete discretion of the user. If a source file is missing, consider reconfiguring the component definition(s), remove
-the source file from the component, or simply removing the component altogether.
+at the discretion of the user. If a source file is missing, consider reconfiguring the component definition(s), remove the
+source file from the component, or simply removing the component altogether.



@@ -195,8 +241,8 @@ Validate the workflow with `cylc` by entering:
 cylc validate .
 ```
 If the Cylc validation fails but the Rose validation passes, please raise an issue on this repository, as it is better to
-catch configuration issues at the `rose macro validate` step, and the validation rules can be updated to match the task
-definition requirements.
+catch configuration issues at the `rose macro --validate` step, and the validation rules can be updated to match the task
+definition requirements.

 We install the workflow with:
 ```
@@ -206,9 +252,9 @@ This creates a workflow directory in `~/cylc-run`.

 After successful installation, the workflow is launched with:
 ```
-cylc play .
+cylc play fre-workflows/run1
 ```
-If on PP/AN, cylc launches a scheduler daemon on the `workflow1` server, via `ssh`, triggering the login banner to be printed.
+If on PP/AN, cylc launches a scheduler daemon on a `workflow1` server, via `ssh`, triggering the login banner to be printed.
 This daemon submits and runs jobs based on the task dependencies defined in `flow.cylc`.


@@ -224,25 +270,29 @@ it will shutdown in error after a period of time.
 `cylc` has two workflow viewing interfaces (full GUI and text UI), and a variety of CLI commands that can expose workflow
 and task information. The text-based GUI can be launched via:
 ```
-cylc tui EXPNAME
+cylc tui fre-workflows/run1
 ```

 The full GUI can be launched on jhan or jhanbigmem (an107 or an201).
 ```
 cylc gui --ip=`hostname -f` --port=`jhp 1` --no-browser
 ```

-Then, navigate to one of the two links printed to screen in your web browser
-
+Then, navigate to one of the two links printed to screen in your web browser. If one just wants a quick look at the state of
+their workflow, the user-interfaces can be completely avoided by using the `workflow-state` command, two examples of which are:
+```
+cylc workflow-state -v fre-workflows/run1 # show all jobs
+cylc workflow-state -v fre-workflows/run1 | grep failed # show only failed ones
+```


 ## 10. UPDATEME Inspect workflow progress with a terminal CLI
-Various other `cylc` commands are useful for inspecting a running workflow. Try `cylc help`.
+Various other `cylc` commands are useful for inspecting a running workflow. Try `cylc help`, and `cylc --help` for
+more information on how to use these tools to your advantage!

 - `cylc scan` Lists running workflows
-- `cylc workflow-state EXPNAME` Lists all task states
-- `cylc cat-log EXPNAME` Show the scheduler log
-- `cylc list EXPNAME` Lists all tasks
-- `cylc report-timings EXPNAME`
+- `cylc cat-log fre-workflows/run1` Show the scheduler log
+- `cylc list` Lists all tasks
+- `cylc report-timings`


From 33df7b18f962e7970e1531fd20af97d5ebc491ab Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 7 Aug 2024 13:50:19 -0400
Subject: [PATCH 14/15] Update README.md

one more tweak, up next, one more test start-to-end
---
 README.md | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 2778e65..ef8ba03 100644
--- a/README.md
+++ b/README.md
@@ -57,7 +57,8 @@ It is common to not know exactly what to set `PP_COMPONENTS` to when configuring
 steps in this guide can help inform how to adjust these settings. The Rose configuation file format is described in full
 [elsewhere](https://metomi.github.io/rose/doc/html/api/configuration/rose-configuration-format.html)

-If one is looking to hit the ground running at GFDL's PP/AN, copy/paste the code block here into your `rose-suite.conf`:
+If one is looking to hit the ground running at GFDL's PP/AN, copy/paste the code block here into your `rose-suite.conf`, replacing
+`YOUR.USERNAME` with your own where applicable:
 ```
 SITE="ppan"
 EXPERIMENT='FOO'
@@ -84,7 +85,6 @@ CLEAN_WORK=True

 PTMP_DIR='/xtmp/$USER/ptmp'

-HISTORY_DIR='/archive/Niki.Zadeh/fre/FMS2023.04_om5_20240410/ESM4.2JpiC_om5b04r1/gfdl.ncrc5-intel23-prod-openmp/history'
 HISTORY_DIR='/archive/Ian.Laflotte/fre/FMS2023.04_om5_20240410/ESM4.2JpiC_om5b04r1/gfdl.ncrc5-intel23-prod-openmp/history'
 HISTORY_SEGMENT='P1Y'

@@ -92,11 +92,9 @@ PP_DIR='/archive/YOUR.USERNAME/fre/FMS2023.04_om5_20240410/ESM4.2JpiC_om5b04r1/g
 PP_CHUNK_A='P2Y'
 PP_COMPONENTS='atmos atmos_scalar land land_static'
 PP_START="00010101"
-PP_STOP="00040101"
+PP_STOP="00020101"
 PP_DEFAULT_XYINTERP="360,180"
-PP_GRID_SPEC='/work/Niki.Zadeh/mosaic_generation/exchange_grid_toolset/workdir/mosaic_c96om5b04v20240410.20240423.an105/mosaic_c96om5b04v20240410.20240423.an105.tar'
 PP_GRID_SPEC='/work/Ian.Laflotte/mosaic_generation/exchange_grid_toolset/workdir/mosaic_c96om5b04v20240410.20240423.an105/mosaic_c96om5b04v20240410.20240423.an105.tar'
-
 ```


From e8a18104e21528615920a5647780a6a0b685613d Mon Sep 17 00:00:00 2001
From: Ian
Date: Wed, 7 Aug 2024 14:02:57 -0400
Subject: [PATCH 15/15] Update README.md

---
 README.md | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index ef8ba03..8644d3c 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,6 @@ based on: https://gitlab.gfdl.noaa.gov/fre2/workflows/postprocessing/-/raw/d51df
 This repository holds code for defining tasks, applications, tools, workflows, and other aspects of the FRE2 postprocessing ecosystem.
-# UNDER CONSTRUCTION #

 # Instructions to postprocess FMS history files on GFDL's PP/AN

@@ -26,13 +25,15 @@ cd fre-workflows


 ## 2. Load Cylc, the backend workflow engine used by FRE2
-```
-module load cylc
-```
 [`cylc`](https://cylc.github.io/cylc-doc/stable/html/) lets us parse workflow template files (`*.cylc`) and their configurations into modular, interdependent batch jobs. Tools used by those jobs (e.g. `fre-nctools` or `xarray`) should be loaded by those jobs as part of their requirements and do not need to be loaded at this time unless desired.

+To run this repository's code, we need `cylc` accessible and somewhere in out `PATH` environment variable. One way is to activate
+a conda environment with `cylc-flow`, `cylc-rose`, and `metomi-rose` installed. At GFDL's PP/AN, it is sufficient to do
+```
+module load cylc
+```



@@ -261,7 +262,7 @@ This daemon submits and runs jobs based on the task dependencies defined in `flo



-## 9. UPDATEME Inspect workflow progress with an interface (GUI or TUI)
+## 9. Inspect workflow progress with an interface (GUI or TUI)
 The workflow will run and shutdown when all tasks are complete. If tasks fail, the workflow may stall, in which case
 it will shutdown in error after a period of time.

@@ -282,8 +283,10 @@ cylc workflow-state -v fre-workflows/run1 # show all jobs
 cylc workflow-state -v fre-workflows/run1 | grep failed # show only failed ones
 ```

+
+

-## 10. UPDATEME Inspect workflow progress with a terminal CLI
+## 10. Inspect workflow progress with a terminal CLI
 Various other `cylc` commands are useful for inspecting a running workflow. Try `cylc help`, and `cylc --help` for
 more information on how to use these tools to your advantage!
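
The README text in these patches states two consistency rules: every component whose `grid` requests `regrid-xy/...` needs a matching section in `app/regrid-xy/rose-app.conf`, and `PP_COMPONENTS` should reflect the sections defined in `app/remap-pp-components/rose-app.conf`. The following is a minimal Python sketch of those two checks, not part of the repository and not a replacement for `rose macro --validate`. The helper names (`read_components`, `missing_regrid_entries`, `undefined_components`) are hypothetical, the regrid-xy entries are abbreviated to the fields needed here, and the sketch assumes Python's `configparser` is close enough to the Rose configuration format for files this simple.

```python
import configparser

# Component definitions as shown in the README's remap-pp-components example.
REMAP_CONF = """\
[command]
default=remap-pp-components

[atmos]
sources=atmos_month
grid=regrid-xy/default

[atmos_scalar]
sources=atmos_scalar atmos_global_cmip
grid=native

[land]
sources=land_month_cmip
grid=regrid-xy/288_180.conserve_order1

[land_static]
sources=land_static
grid=regrid-xy/288_180.conserve_order1
freq=P0Y
"""

# Abbreviated regrid-xy entries: only the section names matter for this check.
REGRID_CONF = """\
[command]
default=regrid-xy

[atmos]
outputGridType=default
sources=atmos_month

[land]
outputGridType=288_180.conserve_order1
sources=land_month_cmip

[land_static]
outputGridType=288_180.conserve_order1
sources=land_static
"""


def read_components(text):
    """Return {section: {key: value}} for every section except [command]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep camelCase keys such as outputGridType
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections() if name != "command"}


def missing_regrid_entries(remap_text, regrid_text):
    """List components that request regridding but have no regrid-xy section."""
    remap = read_components(remap_text)
    regrid = read_components(regrid_text)
    return sorted(
        name
        for name, options in remap.items()
        if options.get("grid", "").startswith("regrid-xy/") and name not in regrid
    )


def undefined_components(pp_components, remap_text):
    """List PP_COMPONENTS entries that have no remap-pp-components section."""
    defined = read_components(remap_text)
    return [name for name in pp_components.split() if name not in defined]


if __name__ == "__main__":
    print(missing_regrid_entries(REMAP_CONF, REGRID_CONF))
    print(undefined_components("atmos atmos_scalar land land_static", REMAP_CONF))
```

With the example configuration from the README both checks come back empty. `atmos_scalar` is skipped by the first check because it requests a `native` grid, matching the note that it has no entry in `app/regrid-xy/rose-app.conf`.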