This repository holds code for defining tasks, applications, tools, workflows, and other aspects of the FRE2 postprocessing ecosystem.
These instructions are targeted at workflow developers. If you are a user simply looking to run a specific workflow, see
fre-cli
.
git clone https://github.com/NOAA-GFDL/fre-workflows.git
cd fre-workflows
Do not clone to a temporary directory - the directory must be readable by slurm from all nodes. Directories on local
\vftmp
are not, while those on /home
, /work
, and /xtmp
are.
cylc
lets us parse workflow template files (*.cylc
) and their
configurations into modular, interdependent batch jobs. Tools used by those jobs (e.g. fre-nctools
or xarray
) should
be loaded by those jobs as part of their requirements and do not need to be loaded at this time unless desired.
To run this repository's code, we need cylc
accessible and somewhere in out PATH
environment variable. One way is to activate
a conda environment with cylc-flow
, cylc-rose
, and metomi-rose
installed. At GFDL's PP/AN, it is sufficient to do
module load cylc
With your favorite text editor, open up rose-suite.conf
and set variables to desired values. These values will be passed to
task definitions within flow.cylc
and taken as configuration settings to instruct task execution.
Key values include:
SITE
set to "ppan" to submit jobs to PP/AN clusterHISTORY_DIR
directory path to your raw model outputHISTORY_SEGMENT
amount of time covered by a single history file (ISO8601 datetime)PP_CHUNK_A
amount of time covered by a single postprocessed file (ISO8601 datetime)PP_CHUNK_B
secondary chunk size for postprocessed files, if desired (ISO8601 datetime)PP_START
start of the desired postprocessing (ISO8601 datetime)PP_STOP
end of the desired postprocessing (ISO8601 datetime)PP_COMPONENTS
space-separated list of user-defined components, discussed in more detail belowPP_GRID_SPEC
path to FMS grid definition tarfilePP_DEFAULT_XYINTERP
default target resolution for regridding, if the resolution is not specified elsewhereFRE_ANALYSIS_HOME
For locating shared analysis scripts (only define ifDO_ANALYSIS=True
)
It is common to not know exactly what to set PP_COMPONENTS
to when configuring a new workflow from scratch. Later
steps in this guide can help inform how to adjust these settings. The Rose configuation file format is described in full
elsewhere
If one is looking to hit the ground running at GFDL's PP/AN, copy/paste the code block here into your rose-suite.conf
, replacing
YOUR.USERNAME
with your own where applicable:
SITE="ppan"
EXPERIMENT='FOO'
PLATFORM='BAR'
TARGET='BAZ'
DO_STATICS=True
DO_TIMEAVGS=True
DO_ATMOS_PLEVEL_MASKING=True
DO_MDTF=False
DO_REFINEDIAG=False
REFINEDIAG_SCRIPTS="\$CYLC_WORKFLOW_RUN_DIR/etc/refineDiag/refineDiag_atmos_cmip6.csh"
DO_PREANALYSIS=False
PREANALYSIS_SCRIPT="\$CYLC_WORKFLOW_RUN_DIR/etc/refineDiag/refineDiag_data_stager_globalAve.csh"
DO_ANALYSIS=False
DO_ANALYSIS_ONLY=False
FRE_ANALYSIS_HOME="/home/fms/local/opt/fre-analysis/test"
ANALYSIS_DIR='/nbhome/YOUR.USERNAME/fre/FMS2023.04_om5_20240410/ESM4.2JpiC_om5b04r1'
CLEAN_WORK=True
PTMP_DIR='/xtmp/$USER/ptmp'
HISTORY_DIR='/archive/Ian.Laflotte/fre/FMS2023.04_om5_20240410/ESM4.2JpiC_om5b04r1/gfdl.ncrc5-intel23-prod-openmp/history'
HISTORY_SEGMENT='P1Y'
PP_DIR='/archive/YOUR.USERNAME/fre/FMS2023.04_om5_20240410/ESM4.2JpiC_om5b04r1/gfdl.ncrc5-intel23-prod-openmp/pp'
PP_CHUNK_A='P2Y'
PP_COMPONENTS='atmos atmos_scalar land land_static'
PP_START="00010101"
PP_STOP="00020101"
PP_DEFAULT_XYINTERP="360,180"
PP_GRID_SPEC='/work/Ian.Laflotte/mosaic_generation/exchange_grid_toolset/workdir/mosaic_c96om5b04v20240410.20240423.an105/mosaic_c96om5b04v20240410.20240423.an105.tar'
For more complete validation of workflow settings, we create a manifest for our history file archives with
tar -tf /path/to/history/YYYYMMDD.nc.tar | grep -v "tile[2-6]" | sort > history-manifest
The history-manifest
contains a list of source files contained within the targeted history files. This can be
helpful for validating settings on a component-by-component basis in the next step(s).
Users define their own postprocessing components for their workflow, which represent a group of source files to be postprocessed
together. This grouping is typically united by a common gridding, which may be the current "native" gridding of the source files,
or a desired target gridding to achieve via regridding. If history-manifest
was created, it will be used to check that the
source files specified in the components are actually present in the history files.
User-defined components are configured within app/remap-pp-components/rose-app.conf
. An example set of components could be:
[command]
default=remap-pp-components
[atmos]
sources=atmos_month
grid=regrid-xy/default
[atmos_scalar]
sources=atmos_scalar atmos_global_cmip
grid=native
[land]
sources=land_month_cmip
grid=regrid-xy/288_180.conserve_order1
[land_static]
sources=land_static
grid=regrid-xy/288_180.conserve_order1
freq=P0Y
Here we've defined the atmos
component as being a set of one source file, atmos_month
. The grid
field shows we wish to
have these two source files regridded to the default resolution specified in rose-suite.conf
. By contrast, the atmos_scalar
component contains two source files, and specifies a native
grid. This indicates that atmos_scalar
and atmos_global_cmip
source files will not be regridded when processing them for the atmos_scalar
component.
Note- it is not uncommon for a specific component to be named after a source file contained in it's sources
field, but it
does not imply anything special about the relationship between the source file and the component.
The third component is land
, and will be regridded to a resolution corresponding to a 180x288 lat/lon grid, using an
interpolation scheme which is conservative to a first-order approximation. The last is the land_static
component, and will
be similarly handled to land
. Since land_static
is time-independent, it will require freq=P0Y
, as the name is not used
to determine if a component involves static data. Statics will only be processed if DO_STATICS=True
in rose-suite.conf
.
The setting for PP_COMPONENTS
should reflect information in app/remap-pp-components/rose-app.conf
. From our example, a
good list that passes validation would be PP_COMPONENTS=atmos atmos_scalar land land_static
.
Any component specified in app/remap-pp-components/rose-app.conf
requesting regridding requires a corresponding entry in
app/regrid-xy/rose-app.conf
, providing further information. A full set of options one can specify in this configuration
can be found in app/regrid-xy/README.md
.
Following up on our example in the previous step, we would not have an extry in app/regrid-xy/rose-app.conf
for
atmos_scalar
, but we will for atmos
, land
and land_static
components:
[command]
default=regrid-xy
[atmos]
inputGrid=cubedsphere
inputRealm=atmos
interpMethod=conserve_order2
outputGridLat=180
outputGridLon=288
outputGridType=default
sources=atmos_month
[land]
inputGrid=cubedsphere
inputRealm=land
interpMethod=conserve_order1
outputGridLat=180
outputGridLon=288
outputGridType=288_180.conserve_order1
sources=land_month_cmip
[land_static]
inputGrid=cubedsphere
inputRealm=land
interpMethod=conserve_order1
outputGridLat=180
outputGridLon=288
outputGridType=288_180.conserve_order1
sources=land_static
Note that the atmos_scalar
component does not have an entry here, as we requested a native
regridding for source files in
that component. Full documentation on the available input configuration fields is available in the
app/regrid-xy
directory, but some things worth
noting above:
inputGrid
can becubedsphere
ortripolar
.inputRealm
attribute is used for identifying theland
,atmos
, orocean
grid mosaic file.- The
interpMethod
should beconserve_order1
,conserve_order2
, orbilinear
. OutputGridType
is the grid label referenced in theapp/remap-pp-components/rose-app.conf
file.OutputGridLat
andOutputGridLon
identify the target grid ifOutputGridType
is not specified
Rose can validate the configuration by checking the field values against a list of rules defined by the devlopers of this repository. It's crucial to note that while this list of rules is determined by the requirements of th
One can wait until this step in this guide, or validate as they go along at any point in the previous instructions
rose macro --validate
Common errors include non-existent directories and time intervals that are not ISO8601 datetimes. It is recommended to address
any/all complaints. If history-manifest
exists, rose macro --validate
will report on source files referenced by components
that are not present in the history tar file archives. Whether a missing file is a show-stopper or a toothless complaint is
at the discretion of the user. If a source file is missing, consider reconfiguring the component definition(s), remove the
source file from the component, or simply removing the component altogether.
Validate the workflow with cylc
by entering:
cylc validate .
If the Cylc validation fails but the Rose validation passes, please raise an issue on this repository, as it is better to
catch configuration issues at the rose macro --validate
step, and the validation rules can be updated to match the task
definition requirements.
We install the workflow with:
cylc install .
This creates a workflow directory in ~/cylc-run
.
After successful installation, the workflow is launched with:
cylc play fre-workflows/run1
If on PP/AN, cylc launches a scheduler daemon on a workflow1
server, via ssh
, triggering the login banner to be printed.
This daemon submits and runs jobs based on the task dependencies defined in flow.cylc
.
The workflow will run and shutdown when all tasks are complete. If tasks fail, the workflow may stall, in which case it will shutdown in error after a period of time.
cylc
has two workflow viewing interfaces (full GUI and text UI), and a variety of CLI commands that can expose workflow
and task information. The text-based GUI can be launched via:
cylc tui fre-workflows/run1
The full GUI can be launched on jhan or jhanbigmem (an107 or an201).
cylc gui --ip=`hostname -f` --port=`jhp 1` --no-browser
Then, navigate to one of the two links printed to screen in your web browser. If one just wants a quick look at the state of
their workflow, the user-interfaces can be completely avoided by using the workflow-state
command, two examples of which are:
cylc workflow-state -v fre-workflows/run1 # show all jobs
cylc workflow-state -v fre-workflows/run1 | grep failed # show only failed ones
Various other cylc
commands are useful for inspecting a running workflow. Try cylc help
, and cylc <command> --help
for
more information on how to use these tools to your advantage!
cylc scan
Lists running workflowscylc cat-log fre-workflows/run1
Show the scheduler logcylc list
Lists all taskscylc report-timings