Home
If you need additional help after reading this document, please contact the SI Team at siteam at gmao.gsfc.nasa.gov
This document provides an overview of the GEOS GCM by describing the basic structure of the code, presenting the supported computing environments, and explaining how to obtain, compile, and run it.
This page will point users to documentation designed to help them use the GEOS GCM.
The GEOS GCM is made up of a variety of gridded components linked together by an infrastructure layer called MAPL, which is based on ESMF. The model is characterized by a "fixture", which is a base repository containing the necessary CMake and control files to build the model.
The GEOS GCM uses a Python utility called mepo to manage multiple git repositories instead of using other technologies like Git submodules. mepo uses a YAML file that provides a list of components (and their versions) that are required for a particular configuration of GEOS GCM.
To learn more about mepo, consult the following links:
- mepo command reference
- Suggested workflow for Feature development
- Suggested workflow for Science development
- Presentation about mepo
We HIGHLY recommend using blobless clones with mepo, especially on discover, but it is generally good advice everywhere. To do this run:
mepo config set clone.partial blobless
which will add an entry to ~/.mepoconfig that sets mepo up to use blobless clones.
A "fixture" is what we call the "base" repository of the GEOS GCM (GEOSgcm).
If you clone it with git, you'll see that it is very light and there is no real source code.
Instead, it is mainly just CMake code as well as an important file: components.yaml. This YAML file is what controls how our model is laid out and what it consists of.
The main components of GEOS are other repositories in the GEOS-ESM GitHub organization. Examples include:
These components are laid out in the source tree by the components.yaml file in the main fixture. In this file you'll see entries like:
GEOSgcm_GridComp:
  local: ./src/Components/@GEOSgcm_GridComp
  remote: ../GEOSgcm_GridComp.git
  tag: v1.17.3
  sparse: ./config/GEOSgcm_GridComp.sparse
  develop: develop
FVdycoreCubed_GridComp:
  local: ./src/Components/@GEOSgcm_GridComp/GEOSagcm_GridComp/GEOSsuperdyn_GridComp/@FVdycoreCubed_GridComp
  remote: ../FVdycoreCubed_GridComp.git
  tag: v1.12.1
  develop: develop
This file tells mepo to clone version v1.17.3 (tag) of GEOSgcm_GridComp from the ../GEOSgcm_GridComp.git repository (remote, where ../ means its URL is relative to the fixture's URL) and put it on disk at ./src/Components/@GEOSgcm_GridComp (local). For more information about mepo, see the above links or contact the GMAO SI Team.
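As a rough illustration of how a relative remote can be read, the sketch below resolves ../GEOSgcm_GridComp.git against the fixture's own URL. The function name resolve_remote is hypothetical, and this is not mepo's actual implementation; it only mimics the "relative to the fixture" rule described above.

```bash
# Hypothetical helper: resolve a components.yaml "remote" entry
# against the URL the fixture itself was cloned from.
resolve_remote() (
    base="${1%/}"      # fixture URL, trailing slash removed
    remote="$2"        # e.g. ../GEOSgcm_GridComp.git
    # Each leading "../" strips one path component from the base URL.
    while :; do
        case "$remote" in
            ../*) base="${base%/*}"; remote="${remote#../}" ;;
            *)    break ;;
        esac
    done
    printf '%s/%s\n' "$base" "$remote"
)

resolve_remote "https://github.com/GEOS-ESM/GEOSgcm.git" "../GEOSgcm_GridComp.git"
```

The same logic applies to an SSH-style fixture URL, since only the last path component is replaced.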
Most users of GEOS will build and run GEOS on NASA Supercomputing resources. This is mainly due to both the availability of the libraries needed to build and run GEOS as well as the boundary conditions, emissions, etc. needed to run the model.
In general, GEOS is supported at the NASA Center for Climate Simulation (NCCS) and the NASA Advanced Supercomputing (NAS). The following links contain documentation on how to use the resources at the two centers:
GEOS does currently build Docker containers with each release, but these are not yet fully supported and remain in a testing/experimental mode. If you need information about this, please contact the GMAO SI Team.
The instructions to obtain, build, and run the GEOS GCM can be found in the main README for the fixture, which is often the canonical place to find them, but we also provide them here.
We recommend that users configure their shell startup files as shown below.
Recommended .bashrc for NCCS
umask 0022
ulimit -s unlimited
# Look for the OS version and set the module path accordingly
OS_VERSION=$(grep VERSION_ID /etc/os-release | cut -d= -f2 | cut -d. -f1 | sed 's/"//g')
# Run things in this if-block only if we're in an interactive shell
if [[ $- == *i* ]]
then
    # Only put module use or other module commands here
    # and in the correct OS version
    if [[ "$OS_VERSION" == "15" ]]
    then
        export LMOD_SYSTEM_NAME=SLES15
        module purge
        module unuse -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
        module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES15
        module load GEOSenv
    else
        export LMOD_SYSTEM_NAME=SLES12
        module purge
        module unuse -a /discover/swdev/gmao_SIteam/modulefiles-SLES15
        module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
        module load GEOSenv
    fi
    # Add any other things you want with interactive shells here
fi
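To see what the OS_VERSION pipeline above extracts, the sketch below runs the same cut and sed steps on a sample line instead of the real /etc/os-release. The sample value VERSION_ID="15.4" and the helper name major_version are assumptions chosen for illustration.

```bash
# Hypothetical helper: reduce a VERSION_ID line to its major version
# using the same steps as the .bashrc above.
major_version() {
    printf '%s\n' "$1" | cut -d= -f2 | cut -d. -f1 | sed 's/"//g'
}

# Sample input (an assumption, not read from a real system)
os_version=$(major_version 'VERSION_ID="15.4"')
echo "$os_version"
```

The first cut keeps the value after the equals sign, the second keeps the part before the first dot, and sed strips the remaining quote characters.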
Recommended .bashrc for NAS
umask 0022
ulimit -s unlimited
# Run things in this if-block only if we're in an interactive shell
if [[ $- == *i* ]]
then
    # Only put module use or other module commands here
    module use -a /nobackup/gmao_SIteam/modulefiles
    module load GEOSenv
    # Add any other things you want with interactive shells here
fi
Recommended .bashrc for GMAO Linux Desktops (Calculon, etc.)
umask 0022
ulimit -s unlimited
# Run things in this if-block only if we're in an interactive shell
if [[ $- == *i* ]]
then
    # Only put module use or other module commands here
    module use -a /ford1/share/gmao_SIteam/modulefiles
    module load GEOSenv
    # Add any other things you want with interactive shells here
fi
Recommended .tcshrc for NCCS
umask 0022
limit stacksize unlimited
# Look for the OS version and set the module path accordingly
set OS_VERSION=`grep VERSION_ID /etc/os-release | cut -d= -f2 | cut -d. -f1 | sed 's/"//g'`
# Run things in this if-block only if we are in an interactive shell
if ($?prompt) then
    # Only put module use or other module commands here
    # and in the correct OS version
    if ($OS_VERSION == 15) then
        setenv LMOD_SYSTEM_NAME SLES15
        module purge
        module unuse -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
        module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES15
        module load GEOSenv
    else
        setenv LMOD_SYSTEM_NAME SLES12
        module purge
        module unuse -a /discover/swdev/gmao_SIteam/modulefiles-SLES15
        module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
        module load GEOSenv
    endif
    # Add any other things you want with interactive shells here
endif
Recommended .tcshrc for NAS
umask 0022
limit stacksize unlimited
# Run things in this if-block only if we are in an interactive shell
if ($?prompt) then
    # Only put module use or other module commands here
    module use -a /nobackup/gmao_SIteam/modulefiles
    module load GEOSenv
    # Add any other things you want with interactive shells here
endif
Recommended .tcshrc for GMAO Linux Desktops (Calculon, etc.)
umask 0022
limit stacksize unlimited
# Run things in this if-block only if we are in an interactive shell
if ($?prompt) then
    # Only put module use or other module commands here
    module use -a /ford1/share/gmao_SIteam/modulefiles
    module load GEOSenv
    # Add any other things you want with interactive shells here
endif
In the above snippets, we run umask 0022 and the limit/ulimit calls unconditionally; these are safe in any shell. umask 0022 tells the shell to create new files and directories that are readable by everyone (and writable only by you) by default. The limit/ulimit calls set the stack size to unlimited, which is needed by GEOS.
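The effect of umask 0022 can be checked with a small experiment. The sketch below creates a file and a directory in a throwaway temporary directory under that umask and prints the resulting permission strings; the variable names are illustrative only.

```bash
# Create a scratch directory so nothing in your real tree is touched
tmpdir=$(mktemp -d)

# Apply the umask in a subshell so the calling shell is unaffected
(
    umask 0022
    touch "$tmpdir/file"
    mkdir "$tmpdir/dir"
)

# Keep only the 10-character permission string from ls
file_mode=$(ls -ld "$tmpdir/file" | cut -c1-10)
dir_mode=$(ls -ld "$tmpdir/dir" | cut -c1-10)
echo "file: $file_mode"
echo "dir:  $dir_mode"

rm -rf "$tmpdir"
```

With this umask, new files come out as -rw-r--r-- and new directories as drwxr-xr-x: readable by group and others, writable only by the owner.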
Finally, you'll see an if-block. This block is for things you'd like in bash or tcsh but that should only be done in an interactive shell. You never want to run any module commands in a non-interactive shell, as they can have bad side effects on scripts, other commands, etc. We've added some lines to this block that load our SI Team maintained modulefiles and a GEOSenv metamodule.
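The bash test used above, [[ $- == *i* ]], relies on the shell's option flags in $- containing the letter i when the shell is interactive. A minimal sketch of the same check, written as a hypothetical helper is_interactive that takes the flag string as an argument so it can be tried on sample values:

```bash
# Hypothetical helper: succeed if the given option-flag string
# (normally "$-") marks an interactive shell.
is_interactive() {
    case "$1" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# When this runs as a script, $- has no "i", so this reports non-interactive
if is_interactive "$-"; then
    echo "interactive: module commands are fine here"
else
    echo "non-interactive: skip module commands"
fi
```

Typing the same lines at a login prompt should take the first branch, which is the only place the module use and module load calls belong.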
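In your .bashrc or .tcshrc or other rc file, add the line for your system in the interactive-only section (as shown above). Of the lines below, the first two are for NCCS (SLES12 and SLES15, respectively) and the third is for NAS: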
module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES15
module use -a /nobackup/gmao_SIteam/modulefiles
On the GMAO desktops, the SI Team modulefiles should automatically be part of running module avail, but if not, they are in:
module use -a /ford1/share/gmao_SIteam/modulefiles
You can also run this in any interactive window you have, or just re-source your rc file. This gives you access to the modulefiles needed to correctly check out and build the model.
At NASA centers, we maintain a module, GEOSenv, which provides more recent git and CMake modules as well as access to mepo. You can get this by loading the GEOSenv module:
module load GEOSenv
Again, as above, you should only add this to .bashrc or .tcshrc in the interactive-only block. Running module load commands in shell startup files can have adverse effects otherwise.
NOTE: If you are at NCCS or NAS, you cannot build GEOSgcm in your home directory; GEOS will exceed your quota quite quickly. You must build somewhere like $NOBACKUP, which has a larger disk quota. For more information, see:
- NCCS
- NAS
On GitHub, there are three ways to clone the model: SSH, HTTPS, or GitHub CLI.
The first two are "git protocols" which determine how git communicates with GitHub (either through https or ssh). The latter is a CLI that uses either the ssh or https protocol underneath.
For developers of GEOS, the SSH git protocol is recommended as it can avoid some issues if two-factor authentication (2FA) is enabled on GitHub.
If you do not yet have a GitHub account and wish to develop GEOS, you will need to sign up for one. We recommend a username that "maps" to you if possible. For example, if your NASA AUID is jdoe, try for jdoe at GitHub. Any username will work, though, and we highly recommend you add your full name to your profile so that you can be easily found on GitHub.
Note that GEOS-ESM requires users to have two-factor authentication enabled on GitHub for security reasons.
If you are a NASA employee, you will also need to be added to the GEOS-ESM team so you can have write permissions to the GEOS-ESM repos. To obtain this, please send an email to the SI Team at siteam at gmao.gsfc.nasa.gov with your name and GitHub username.
Note that at the moment non-NASA contributors to GEOS should use forks to contribute.
Before you use git to make changes to the model, you should make sure git itself is set up to attribute your commits to you correctly. Please run:
git config --global user.name "First Last"
git config --global user.email "your_email@example.com"
Note, the email address you set above should be an address your GitHub account knows. You can follow this page to add emails to your GitHub account.
To clone the GEOSgcm code using the SSH URL (starts with git@github.com), issue the command:
git clone -b vX.Y.Z git@github.com:GEOS-ESM/GEOSgcm.git
where vX.Y.Z is a tag from a GEOSgcm release. Note that if you don't use -b, you will get the main branch, which can change from day to day.
If this is your first time using GitHub with any SSH URL, you might get this error:
Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
If you do see this, you need to upload an ssh key to your GitHub account. This needs to be done on any machine that you want to use the SSH URL through.
Issues have been seen when using SSH access to GitHub with the update at NAS to TOSS4. The recommended solution from NAS is to first run:
ssh-add -l
If this returns an error, please run:
eval `ssh-agent -s`
if you use bash or:
eval `ssh-agent -c`
if you use csh. This should return a process ID (PID).
Then in either case, run this:
ssh-add
which should fix the issue.
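The exit status of ssh-add -l tells the two failure cases apart: per the OpenSSH manual, 1 means the command failed (for example, the agent is running but holds no keys) and 2 means no agent could be contacted. A minimal sketch of the decision logic follows; the helper name agent_action is hypothetical, and the status is passed in as an argument so the snippet runs without touching a real agent.

```bash
# Hypothetical helper: map an "ssh-add -l" exit status to the next step.
agent_action() {
    case "$1" in
        0) echo "keys are loaded; nothing to do" ;;
        1) echo "run ssh-add" ;;
        2) echo "start ssh-agent, then run ssh-add" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# In real use you would run:  ssh-add -l >/dev/null 2>&1; agent_action $?
agent_action 2
```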
To clone the model through HTTPS you run:
git clone -b vX.Y.Z https://github.com/GEOS-ESM/GEOSgcm.git
where vX.Y.Z is a tag from a GEOSgcm release. Note that if you don't use -b, you will get the main branch, which can change from day to day.
Note that if you use the HTTPS URL and have 2FA set up on GitHub, you will need to use personal access tokens as a password.
You can also use the GitHub CLI with:
gh repo clone GEOS-ESM/GEOSgcm -- -b vX.Y.Z
where vX.Y.Z is a tag from a GEOSgcm release. Note that if you don't use -b, you will get the main branch, which can change from day to day.
Note that when you first use gh, it will ask what your preferred git protocol is (https or ssh) to use "underneath". The caveats above will apply to whichever you choose.
Regardless of the cloning method you use, you will get the directory GEOSgcm/.
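The three cloning methods differ only in the command prefix and URL form. The sketch below prints the matching command for a given method and release tag; the helper name clone_command and the example tag v1.2.3 are hypothetical placeholders, and nothing is actually cloned.

```bash
# Hypothetical helper: print the clone command for a method and tag.
clone_command() (
    method="$1"
    tag="$2"
    case "$method" in
        ssh)   echo "git clone -b $tag git@github.com:GEOS-ESM/GEOSgcm.git" ;;
        https) echo "git clone -b $tag https://github.com/GEOS-ESM/GEOSgcm.git" ;;
        gh)    echo "gh repo clone GEOS-ESM/GEOSgcm -- -b $tag" ;;
        *)     echo "unknown method: $method" >&2; exit 1 ;;
    esac
)

# v1.2.3 is a placeholder, not a real release tag
clone_command ssh v1.2.3
clone_command https v1.2.3
clone_command gh v1.2.3
```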
To build the model, you need to first go to the GEOSgcm/ directory:
cd GEOSgcm
From the head node, run the parallel_build.csh script:
./parallel_build.csh -mil
As you passed in the -mil option, this will build on the SLES15 Milan nodes at NCCS (which is recommended). This command will check out all the external repositories of the model and build it. When done, the resulting model build will be found in build-SLES15/ and the installation will be found in install-SLES15/, with setup scripts like gcm_setup and fvsetup in install-SLES15/bin.
Note that at NCCS there are currently two OSs, SLES12 and SLES15. Because of this, if you build on SLES12, you must run on SLES12, and the same for SLES15. So, if you want to run on SLES12 compute nodes (i.e., Cascade Lakes), you should pass in -cas to parallel_build.csh:
./parallel_build.csh -cas
You will end up with install-SLES12/ and build-SLES12/ directories.
parallel_build.csh provides a special flag for checking out the development branches of GEOSgcm_GridComp, GEOSgcm_App, GMAO_Shared, and GEOS_Util. If you run:
./parallel_build.csh -mil -develop
then mepo will run:
mepo develop GEOSgcm_GridComp GEOSgcm_App GMAO_Shared GEOS_Util
To obtain a debug version, you can run
./parallel_build.csh -mil -debug
which will build with debugging flags. This will build in build-Debug-SLES15/ and install into install-Debug-SLES15/.
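The directory names above follow a simple pattern. The sketch below derives the build directory name from the flags; the helper name build_dir_name is hypothetical, it is not taken from parallel_build.csh (which is a csh script), and the -cas plus -debug combination is an extrapolation of the pattern rather than something stated above.

```bash
# Hypothetical helper: derive the build directory name from the flags.
build_dir_name() (
    os=""
    debug=""
    for arg in "$@"; do
        case "$arg" in
            -mil)   os="SLES15" ;;   # Milan nodes run SLES15
            -cas)   os="SLES12" ;;   # Cascade Lake nodes run SLES12
            -debug) debug="Debug-" ;;
        esac
    done
    if [ -z "$os" ]; then
        echo "no node type given (-mil or -cas)" >&2
        exit 1
    fi
    echo "build-${debug}${os}"
)

build_dir_name -mil
build_dir_name -mil -debug
build_dir_name -cas
```

The install directory follows the same pattern with install- in place of build-.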
Note that running with parallel_build.csh will create and install a tarfile of the source code at build time. If you wish to avoid this, run parallel_build.csh with the -no-tar option:
./parallel_build.csh -no-tar ...
While parallel_build.csh has many options, it does not cover every CMake option available in GEOSgcm. If you wish to pass additional CMake options to parallel_build.csh, you can do so by using -- followed by the CMake options. Note that anything after the -- will be interpreted as a CMake option, which could lead to build issues if you are not careful.
For example, if you want to build a develop Debug build on the Milan nodes while turning on the StratChem reduced mechanism and the CODATA 2018 options:
./parallel_build.csh -develop -debug -mil -- -DSTRATCHEM_REDUCED_MECHANISM=ON -DUSE_CODATA_2018_CONSTANTS=ON
As noted above, all the "regular" parallel_build.csh options must be listed before the -- flag.
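To make the role of -- concrete, the sketch below splits an argument list at the first -- into script options and CMake options. The helper name split_args is hypothetical and this is not how parallel_build.csh is implemented; it only illustrates why everything after the separator ends up on the CMake side.

```bash
# Hypothetical helper: split arguments at the first "--".
split_args() (
    seen_sep=0
    script_opts=""
    cmake_opts=""
    for arg in "$@"; do
        if [ "$seen_sep" -eq 0 ] && [ "$arg" = "--" ]; then
            seen_sep=1                      # separator itself is dropped
        elif [ "$seen_sep" -eq 0 ]; then
            script_opts="$script_opts $arg" # before "--": script option
        else
            cmake_opts="$cmake_opts $arg"   # after "--": passed to CMake
        fi
    done
    echo "script:$script_opts"
    echo "cmake:$cmake_opts"
)

split_args -develop -debug -mil -- -DSTRATCHEM_REDUCED_MECHANISM=ON -DUSE_CODATA_2018_CONSTANTS=ON
```

A script flag such as -debug placed after the separator would land in the cmake list, which is the kind of mistake the warning above is about.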
The steps detailed below are essentially those that parallel_build.csh performs for you. Either method should yield an identical executable.
NOTE: To build and run on SLES15 at NCCS, you will want to do all these steps either on discover-mil or on an interactive Milan node.
The GEOS GCM is composed of a set of sub-repositories.
These are managed by a tool called mepo.
To clone all the sub-repos, you can run mepo clone inside the fixture:
cd GEOSgcm
mepo clone
This command initializes the multi-repository, then clones and assembles all the sub-repositories according to components.yaml.
To get the development branches of GEOSgcm_GridComp, GEOSgcm_App, GMAO_Shared, and GEOS_Util (a la the -develop flag for parallel_build.csh), one needs to run the equivalent mepo command. As mepo itself knows (via components.yaml) what the development branch of each sub-repository is, the equivalent of -develop for mepo is to check out the development branches of those four repositories:
mepo develop GEOSgcm_GridComp GEOSgcm_App GMAO_Shared GEOS_Util
This must be done after mepo clone, as it runs a git command in each sub-repository.
On tcsh:
source @env/g5_modules
or on bash:
source @env/g5_modules.sh
We currently do not allow in-source builds of GEOSgcm. So we must make a directory:
mkdir build
The advantage of this is that you can build both a Debug and a Release version with the same clone if desired.
CMake generates the Makefiles needed to build the model.
cd build
cmake .. -DBASEDIR=$BASEDIR/Linux -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_INSTALL_PREFIX=../install
This will install to an install/ directory that is parallel to your build/ directory. If you prefer to install elsewhere, change the path in:
-DCMAKE_INSTALL_PREFIX=<path>
and CMake will install there.
To obtain a debug version, you can add:
-DCMAKE_BUILD_TYPE=Debug
which will build with debugging flags.
Note that running with parallel_build.csh will create and install a tarfile of the source code at build time. But if CMake is run by hand, this is not the default action (as many who build with CMake by hand are developers and not often running experiments). In order to enable this at install time, add:
-DINSTALL_SOURCE_TARFILE=ON
to your CMake command.
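The optional flags above simply append to the base cmake command. The sketch below assembles the full command line as text so you can see the result before running it; the helper name cmake_command and its two arguments are hypothetical, and $BASEDIR is deliberately left unexpanded in the printed string.

```bash
# Hypothetical helper: print the cmake command for a build type
# ("Release" or "Debug") and a source-tarfile choice ("yes" or "no").
cmake_command() (
    build_type="$1"
    tarfile="$2"
    # Single quotes keep $BASEDIR literal in the printed command
    cmd='cmake .. -DBASEDIR=$BASEDIR/Linux -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_INSTALL_PREFIX=../install'
    if [ "$build_type" = "Debug" ]; then
        cmd="$cmd -DCMAKE_BUILD_TYPE=Debug"
    fi
    if [ "$tarfile" = "yes" ]; then
        cmd="$cmd -DINSTALL_SOURCE_TARFILE=ON"
    fi
    echo "$cmd"
)

cmake_command Release no
cmake_command Debug yes
```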
make -jN install
where N is the number of parallel processes. On discover head nodes, this should only be as high as 2 due to limits on the head nodes. On a compute node, you can set N as high as you like, though 8-12 is about the limit of parallelism in our model's make system.
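The guidance on N can be written down as a small rule. The sketch below picks a value for make -jN from whether you are on a head node and how many cores are available; the helper name choose_jobs, its arguments, and the choice of 12 as the cap (the top of the 8-12 range above) are assumptions for illustration.

```bash
# Hypothetical helper: choose N for "make -jN install".
#   $1: "yes" if on a head node, anything else otherwise
#   $2: number of cores available
choose_jobs() (
    on_head_node="$1"
    cores="$2"
    if [ "$on_head_node" = "yes" ]; then
        echo 2                 # head nodes: keep it at 2
    elif [ "$cores" -gt 12 ]; then
        echo 12                # little benefit beyond about 8-12
    else
        echo "$cores"
    fi
)

echo "head node:    make -j$(choose_jobs yes 48) install"
echo "compute node: make -j$(choose_jobs no 48) install"
```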
For more information about getting interactive nodes see:
Once the model has built successfully, you will have an install/ directory in your checkout. To run gcm_setup, go to the install/bin/ directory and run it there:
cd install/bin
./gcm_setup
You can find more information to run the model in either atmosphere/data ocean mode (aka AMIP):
or coupled atmosphere/ocean mode: