Matt Thompson edited this page Dec 18, 2024 · 42 revisions

If you need additional help after reading this document, please contact the SI Team at siteam at gmao.gsfc.nasa.gov


This document provides an overview of the GEOS GCM: it describes the basic structure of the code, presents the supported computing environments, and explains how to obtain, compile, and run the model.

This page will point users to documentation designed to help them use the GEOS GCM.

1 Generic Information

1.1 Structure of GEOS

The GEOS GCM is made up of a variety of gridded components linked together by an infrastructure layer called MAPL, which is based on ESMF. The model is characterized by a "fixture", a base repository containing the CMake and control files necessary to build the model.

1.2 mepo

The GEOS GCM uses a Python utility called mepo to manage multiple git repositories instead of using other technologies like Git submodules. mepo uses a YAML file that provides a list of components (and their versions) that are required for a particular configuration of GEOS GCM.

To learn more about mepo, consult the following links:

Recommended mepo settings for GEOSgcm

We HIGHLY recommend using blobless clones with mepo, especially on discover, but it is generally good advice everywhere. To do this run:

mepo config set clone.partial blobless

which adds an entry to ~/.mepoconfig so that mepo uses blobless clones.
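
For context, a blobless clone is git's partial-clone mode in which commits and trees are downloaded up front while file contents (blobs) are fetched on demand. The sketch below shows the plain-git form, `--filter=blob:none`, against a throwaway local repository. The scratch paths and file names are made up for illustration, and that mepo's blobless setting maps onto exactly this filter is an assumption rather than something this page states.

```shell
# Illustration only: a blobless partial clone with plain git.
# All paths below are scratch locations created for the demo.
work=$(mktemp -d)

# Build a tiny source repository to clone from.
git init -q "$work/src"
echo "hello" > "$work/src/file.txt"
git -C "$work/src" add file.txt
git -C "$work/src" -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "initial commit"

# A local repository must opt in to serving filtered (partial) clones.
git -C "$work/src" config uploadpack.allowFilter true
git -C "$work/src" config uploadpack.allowAnySHA1InWant true

# --filter=blob:none requests history without file contents;
# blobs needed for the checkout are fetched on demand.
git clone -q --filter=blob:none "file://$work/src" "$work/blobless"

ls "$work/blobless"
```

The working tree looks the same as with a full clone; the saving comes from not downloading blobs for historical versions of files.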

1.3 Fixtures

A "fixture" is what we call the "base" repository of the GEOS GCM (GEOSgcm).

If you clone it with git, you'll see that it is very light and contains no real source code. Instead, it is mainly CMake code plus one important file: components.yaml. This YAML file controls how the model is laid out and what it consists of.

1.4 Components

The main components of GEOS are other repositories in the GEOS-ESM GitHub organization. Examples include:

These components are laid out in the source tree by the components.yaml file in the main fixture. In this file you'll see entries like:

GEOSgcm_GridComp:
  local: ./src/Components/@GEOSgcm_GridComp
  remote: ../GEOSgcm_GridComp.git
  tag: v1.17.3
  sparse: ./config/GEOSgcm_GridComp.sparse
  develop: develop

FVdycoreCubed_GridComp:
  local: ./src/Components/@GEOSgcm_GridComp/GEOSagcm_GridComp/GEOSsuperdyn_GridComp/@FVdycoreCubed_GridComp
  remote: ../FVdycoreCubed_GridComp.git
  tag: v1.12.1
  develop: develop

The first entry tells mepo to clone version v1.17.3 (tag) of GEOSgcm_GridComp from the ../GEOSgcm_GridComp.git repository (remote, where ../ means the URL is relative to that of the fixture) and put it on disk at ./src/Components/@GEOSgcm_GridComp (local). For more information about mepo, see the above links or contact the GMAO SI Team.
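
To make the mapping concrete, the sketch below pulls individual fields for one component out of a components.yaml-style file and resolves a `../` remote against the fixture URL. It only illustrates how the entries read; mepo does all of this itself. The helper names `component_field` and `resolve_remote` are hypothetical, and the awk parsing handles only the simple two-level layout shown above.

```shell
# Hypothetical helpers for reading a components.yaml-style file.
# Not part of mepo; handles only "Name:" blocks with indented "key: value" lines.

# component_field FILE COMPONENT KEY -> prints the value of KEY for COMPONENT
component_field() {
  awk -v comp="$2" -v key="$3" '
    /^[A-Za-z0-9_.-]+:[ ]*$/ { current = $1; sub(/:$/, "", current); next }
    current == comp && $1 == (key ":") { print $2; exit }
  ' "$1"
}

# resolve_remote FIXTURE_URL RELATIVE_REMOTE -> prints the absolute URL.
# "../X.git" replaces the last path element of the fixture URL with X.git.
resolve_remote() {
  printf '%s/%s\n' "${1%/*}" "${2#../}"
}

# Sample file containing the GEOSgcm_GridComp entry quoted above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
GEOSgcm_GridComp:
  local: ./src/Components/@GEOSgcm_GridComp
  remote: ../GEOSgcm_GridComp.git
  tag: v1.17.3
  sparse: ./config/GEOSgcm_GridComp.sparse
  develop: develop
EOF

component_field "$sample" GEOSgcm_GridComp tag
component_field "$sample" GEOSgcm_GridComp local
resolve_remote "https://github.com/GEOS-ESM/GEOSgcm.git" \
  "$(component_field "$sample" GEOSgcm_GridComp remote)"
```

For anything beyond a quick look, a real YAML parser is a better choice than awk.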

2 Computing Centers

Most users build and run GEOS on NASA supercomputing resources, mainly because these systems provide both the libraries needed to build and run GEOS and the boundary conditions, emissions, etc. needed to run the model.

In general, GEOS is supported at the NASA Center for Climate Simulation (NCCS) and the NASA Advanced Supercomputing (NAS). The following links contain documentation on how to use the resources at the two centers:

Containerized GEOS

GEOS currently builds Docker containers with each release, but they are not fully supported and should be considered experimental. If you need information about this, please contact the GMAO SI Team.

3 Working with the GEOS GCM

The instructions to obtain, build, and run the GEOS GCM can be found in the main README for the fixture, which is often the canonical place to find them, but we also provide them here.

3.1 Preliminary Steps

3.1.1 Shell configuration

We recommend that users configure their shell startup files as shown below.

Recommended .bashrc for NCCS

umask 0022
ulimit -s unlimited

# Look for the OS version and set the module path accordingly
OS_VERSION=$(grep VERSION_ID /etc/os-release | cut -d= -f2 | cut -d. -f1 | sed 's/"//g')

# Run things in this if-block only if we're in an interactive shell
if [[ $- == *i* ]]
then

   # Only put module use or other module commands here
   # and in the correct OS version
   if [[ "$OS_VERSION" == "15" ]]
   then
      export LMOD_SYSTEM_NAME=SLES15
      module purge
      module unuse -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
      module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES15
      module load GEOSenv
   else
      export LMOD_SYSTEM_NAME=SLES12
      module purge
      module unuse -a /discover/swdev/gmao_SIteam/modulefiles-SLES15
      module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
      module load GEOSenv
   fi

   # Add any other things you want with interactive shells here

fi

Recommended .bashrc for NAS

umask 0022
ulimit -s unlimited

# Run things in this if-block only if we're in an interactive shell
if [[ $- == *i* ]]
then

   # Only put module use or other module commands here

   module use -a /nobackup/gmao_SIteam/modulefiles
   module load GEOSenv

   # Add any other things you want with interactive shells here

fi

Recommended .bashrc for GMAO Linux Desktops (Calculon, etc.)

umask 0022
ulimit -s unlimited

# Run things in this if-block only if we're in an interactive shell
if [[ $- == *i* ]]
then

   # Only put module use or other module commands here

   module use -a /ford1/share/gmao_SIteam/modulefiles
   module load GEOSenv

   # Add any other things you want with interactive shells here

fi

Recommended .tcshrc for NCCS

umask 0022
limit stacksize unlimited

# Look for the OS version and set the module path accordingly
set OS_VERSION=`grep VERSION_ID /etc/os-release | cut -d= -f2 | cut -d. -f1 | sed 's/"//g'`

# Run things in this if-block only if we are in an interactive shell
if ($?prompt) then

   # Only put module use or other module commands here
   # and in the correct OS version
   if ($OS_VERSION == 15) then
      setenv LMOD_SYSTEM_NAME SLES15
      module purge
      module unuse -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
      module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES15
      module load GEOSenv
   else
      setenv LMOD_SYSTEM_NAME SLES12
      module purge
      module unuse -a /discover/swdev/gmao_SIteam/modulefiles-SLES15
      module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
      module load GEOSenv
   endif

   # Add any other things you want with interactive shells here

endif

Recommended .tcshrc for NAS

umask 0022
limit stacksize unlimited

# Run things in this if-block only if we are in an interactive shell
if ($?prompt) then

   # Only put module use or other module commands here

   module use -a /nobackup/gmao_SIteam/modulefiles
   module load GEOSenv

   # Add any other things you want with interactive shells here

endif

Recommended .tcshrc for GMAO Linux Desktops (Calculon, etc.)

umask 0022
limit stacksize unlimited

# Run things in this if-block only if we are in an interactive shell
if ($?prompt) then

   # Only put module use or other module commands here

   module use -a /ford1/share/gmao_SIteam/modulefiles
   module load GEOSenv

   # Add any other things you want with interactive shells here

endif

In the code above, the umask 0022 and limit/ulimit calls run unconditionally, which is safe. umask 0022 makes newly created files and directories readable by everyone by default, while keeping them writable only by you. The limit/ulimit calls set the stack size to unlimited, which GEOS needs.
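
As a quick check of what umask 0022 does, the sketch below creates a file and a directory in a scratch location and prints their permission bits. It assumes GNU `stat` (the `-c` option), as found on Linux systems; the scratch paths are made up for the demo.

```shell
# Demonstrates the effect of umask 0022 on newly created files and directories.
# Assumes GNU stat (Linux); the scratch paths exist only for this demo.
umask 0022
scratch=$(mktemp -d)

touch "$scratch/file.txt"
mkdir "$scratch/subdir"

# Files start from mode 666 and directories from 777; the umask clears the
# group/other write bits, leaving 644 and 755.
file_mode=$(stat -c '%a' "$scratch/file.txt")
dir_mode=$(stat -c '%a' "$scratch/subdir")
echo "file: $file_mode  dir: $dir_mode"

# Print the current stack size limit ("unlimited" once ulimit -s unlimited has run).
ulimit -s
```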

Finally, you'll see an if-block. Use this block for anything you'd like in bash or tcsh that should run only in an interactive shell. You should never run module commands in a non-interactive shell, as doing so can have bad side effects on scripts, other commands, etc. In this block we've added lines that load our SI Team maintained modulefiles and the GEOSenv metamodule.
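
The OS detection and the interactive check used in the NCCS startup files can be tried outside of a startup file. In this sketch, `os_major_version` is a hypothetical wrapper around the same grep/cut/sed pipeline, pointed at a sample file rather than /etc/os-release so that the result is predictable; the sample contents are invented.

```shell
# os_major_version FILE -> major version from a VERSION_ID line (hypothetical helper).
os_major_version() {
  grep VERSION_ID "$1" | cut -d= -f2 | cut -d. -f1 | sed 's/"//g'
}

# Invented sample standing in for /etc/os-release on a SLES15 node.
sample=$(mktemp)
printf 'NAME="SLES"\nVERSION_ID="15.4"\n' > "$sample"
os_major_version "$sample"

# $- contains "i" only in an interactive shell, so module commands placed
# in the first branch never run from scripts or batch jobs.
case $- in
  *i*) echo "interactive: safe to run module commands" ;;
  *)   echo "non-interactive: skip module commands" ;;
esac
```

The `case` form is a portable equivalent of the `[[ $- == *i* ]]` test used in the .bashrc examples.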

3.1.2 Load Build Modules (at NASA or other systems)

In your .bashrc or .tcshrc or other rc file add a line in the interactive-only section (as shown above):

NCCS (SLES12)

module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12

NCCS (SLES15)

module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES15

NAS

module use -a /nobackup/gmao_SIteam/modulefiles

GMAO Desktops

On the GMAO desktops, the SI Team modulefiles should automatically be part of running module avail but if not, they are in:

module use -a /ford1/share/gmao_SIteam/modulefiles

You can also run this in any interactive window you have, or just re-source your rc file. This gives you access to the modulefiles needed to correctly check out and build the model.

GEOSenv

At NASA centers, we maintain a module, GEOSenv, which provides more recent git and CMake modules as well as access to mepo. You can get this by loading the GEOSenv module:

module load GEOSenv

Again, as above, you should only add this to .bashrc or .tcshrc in the interactive-only block. Running module load commands in shell startup files can have adverse effects otherwise.
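
After loading GEOSenv, a quick way to confirm that the tools it is meant to provide are on your PATH is a check like the one below. `missing_tools` is a hypothetical helper written for this page, not something shipped with GEOSenv.

```shell
# missing_tools NAME... -> prints each command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools git cmake mepo)
if [ -z "$missing" ]; then
  echo "git, cmake and mepo are all available"
else
  # Unquoted on purpose so the names print on one line.
  echo "not found (is GEOSenv loaded?):" $missing
fi
```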

3.2 Obtaining (Cloning) the Model


NOTE: If you are at NCCS or NAS, you cannot build GEOSgcm in your home directory; GEOS will exceed your quota quite quickly. You must build somewhere like $NOBACKUP, which has a larger disk quota. For more information, see:


On GitHub, there are three ways to clone the model: SSH, HTTPS, or GitHub CLI. The first two are "git protocols", which determine how git communicates with GitHub (either through HTTPS or SSH). The third is a command-line tool that uses either the SSH or the HTTPS protocol underneath.

For developers of GEOS, the SSH git protocol is recommended as it can avoid some issues if two-factor authentication (2FA) is enabled on GitHub.

Obtaining a GitHub Account

If you do not yet have a GitHub account and wish to develop GEOS, you will need to sign up for one. We recommend a username that "maps" to you if possible. For example, if your NASA AUID is jdoe, try for jdoe at GitHub. Any username will work, though, and we highly recommend adding your full name to your profile so that you can be easily found on GitHub.

Note that GEOS-ESM requires users to have two-factor authentication enabled on GitHub for security reasons.

Being added to GEOS-ESM (NASA only)

If you are a NASA employee, you will also need to be added to the GEOS-ESM team so you can have write permissions to the GEOS-ESM repos. To obtain this, please send an email to the SI Team at siteam at gmao.gsfc.nasa.gov with your name and GitHub username.

Note that at the moment non-NASA contributors to GEOS should use forks to contribute.

Git Configuration

Before you use git to make changes to the model, you should make sure git is set up to attribute your commits to you correctly. Please run:

git config --global user.name "First Last"
git config --global user.email "your_email@example.com"

Note, the email address you set above should be an address your GitHub account knows. You can follow this page to add emails to your GitHub account.

SSH

To clone the GEOSgcm code using the SSH URL (starts with git@github.com), issue the command:

git clone -b vX.Y.Z git@github.com:GEOS-ESM/GEOSgcm.git

where vX.Y.Z is a tag from a GEOSgcm release. Note that if you omit -b, you will get the main branch, which can change from day to day.

Permission denied (publickey)

If this is your first time using GitHub with any SSH URL, you might get this error:

Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

If you do see this, you need to upload an SSH key to your GitHub account. This must be done for every machine from which you want to use the SSH URL.

Permission denied (publickey)...but it was working before (NAS)

Issues have been seen with SSH access to GitHub since the update to TOSS4 at NAS. The recommended solution from NAS is to first run:

ssh-add -l

If this returns an error, please run:

eval `ssh-agent -s`

if you use bash or:

eval `ssh-agent -c`

if you use csh. Either command should print a process ID (PID).

Then in either case, run this:

ssh-add

which should fix the issue.

HTTPS

To clone the model through HTTPS you run:

git clone -b vX.Y.Z https://github.com/GEOS-ESM/GEOSgcm.git

where vX.Y.Z is a tag from a GEOSgcm release. Note that if you omit -b, you will get the main branch, which can change from day to day.

Note that if you use the HTTPS URL and have 2FA set up on GitHub, you will need to use personal access tokens as a password.

GitHub CLI

You can also use the GitHub CLI with:

gh repo clone GEOS-ESM/GEOSgcm -- -b vX.Y.Z

where vX.Y.Z is a tag from a GEOSgcm release. Note that if you omit -b, you will get the main branch, which can change from day to day.

Note that when you first use gh, it will ask what your preferred git protocol is (https or ssh) to use "underneath". The caveats above will apply to whichever you choose.

Regardless of the cloning method you use, you will get the directory GEOSgcm/.
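
The three methods differ only in how the repository is addressed, which the sketch below makes explicit by printing the command for a given method and tag. It also includes a guard reflecting the note at the start of this section about not building under your home directory. Both `clone_command` and `is_under_home` are hypothetical helper names, and `v1.2.3` is a placeholder rather than a real GEOSgcm release.

```shell
# clone_command METHOD TAG -> prints the clone command for ssh, https or gh.
clone_command() {
  case "$1" in
    ssh)   echo "git clone -b $2 git@github.com:GEOS-ESM/GEOSgcm.git" ;;
    https) echo "git clone -b $2 https://github.com/GEOS-ESM/GEOSgcm.git" ;;
    gh)    echo "gh repo clone GEOS-ESM/GEOSgcm -- -b $2" ;;
    *)     echo "unknown method: $1" >&2; return 1 ;;
  esac
}

# is_under_home DIR -> succeeds if DIR is $HOME or lies below it.
is_under_home() {
  case "$1" in
    "$HOME"|"$HOME"/*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_under_home "$PWD"; then
  echo "warning: $PWD is under \$HOME; at NCCS or NAS use \$NOBACKUP instead" >&2
fi
clone_command ssh v1.2.3
```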

3.3 Building the Code

To build the model, you need to first go to the GEOSgcm/ directory:

cd GEOSgcm

3.3.1 Single Step Building of the Model

From the head node, run the parallel_build.csh script:

./parallel_build.csh -mil

Because you passed the -mil option, this will build on the SLES15 Milan nodes at NCCS (which is recommended). This command checks out all the external repositories of the model and builds it. When it finishes, the build will be in build-SLES15/ and the installation in install-SLES15/, with setup scripts such as gcm_setup and fvsetup in install-SLES15/bin.

Build for SLES12 (NCCS)

Note that at NCCS there are currently two OSs, SLES12 and SLES15. Because of this, if you build on SLES12, you must run on SLES12, and likewise for SLES15. So, if you want to run on SLES12 compute nodes (i.e., Cascade Lake), you should pass -cas to parallel_build.csh:

./parallel_build.csh -cas

You will end up with install-SLES12/ and build-SLES12/ directories.

Develop Version of GEOS GCM

parallel_build.csh provides a special flag for checking out the development branches of GEOSgcm_GridComp, GEOSgcm_App, GMAO_Shared, and GEOS_Util. If you run:

./parallel_build.csh -mil -develop

then mepo will run:

mepo develop GEOSgcm_GridComp GEOSgcm_App GMAO_Shared GEOS_Util

Debug Version of GEOS GCM

To obtain a debug version, you can run:

./parallel_build.csh -mil -debug

which will build with debugging flags. This will build in build-Debug-SLES15/ and install into install-Debug-SLES15/.
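
The directory names mentioned so far follow a simple pattern, summarized in the sketch below. `build_dirs` is a hypothetical helper that only encodes the combinations stated on this page; the Debug names for SLES12 are an extrapolation of that pattern and should be treated as an assumption.

```shell
# build_dirs NODE_FLAG [-debug] -> prints the expected build and install directories.
build_dirs() {
  case "$1" in
    -mil) os=SLES15 ;;   # Milan nodes run SLES15
    -cas) os=SLES12 ;;   # Cascade Lake nodes run SLES12
    *)    echo "unknown node flag: $1" >&2; return 1 ;;
  esac
  prefix=""
  # Assumption: -debug inserts "Debug-" on both OSs; this page states it only for SLES15.
  if [ "$2" = "-debug" ]; then
    prefix="Debug-"
  fi
  echo "build-${prefix}${os} install-${prefix}${os}"
}

build_dirs -mil
build_dirs -cas
build_dirs -mil -debug
```

The setup scripts such as gcm_setup are then expected under the `bin` subdirectory of the install directory printed second.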

Do not create and install source tarfile with parallel_build

Note that running with parallel_build.csh will create and install a tarfile of the source code at build time. If you wish to avoid this, run parallel_build.csh with the -no-tar option:

./parallel_build.csh -no-tar ...

Passing additional CMake options to parallel_build.csh

While parallel_build.csh has many options, it does not cover every CMake option available in GEOSgcm. To pass additional CMake options to parallel_build.csh, add -- followed by the CMake options. Note that anything after the -- is interpreted as a CMake option, which can lead to build problems if you are not careful.

For example, to build a develop Debug build on the Milan nodes with the StratChem reduced mechanism and the CODATA 2018 constants turned on:

./parallel_build.csh -develop -debug -mil -- -DSTRATCHEM_REDUCED_MECHANISM=ON -DUSE_CODATA_2018_CONSTANTS=ON

As noted above, all the "regular" parallel_build.csh options must be listed before the -- flag.
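
To illustrate why the ordering matters, here is a minimal sketch of how a wrapper script can split its arguments at `--`. This is not the actual parallel_build.csh logic (which is written in csh); `split_at_double_dash` is a hypothetical name, and the sketch only shows that everything after the separator is collected for CMake untouched.

```shell
# split_at_double_dash ARGS... -> sets SCRIPT_OPTS and CMAKE_OPTS (space-separated).
split_at_double_dash() {
  SCRIPT_OPTS=""
  CMAKE_OPTS=""
  seen_separator=no
  for arg in "$@"; do
    if [ "$seen_separator" = no ] && [ "$arg" = "--" ]; then
      seen_separator=yes
    elif [ "$seen_separator" = yes ]; then
      CMAKE_OPTS="${CMAKE_OPTS:+$CMAKE_OPTS }$arg"
    else
      SCRIPT_OPTS="${SCRIPT_OPTS:+$SCRIPT_OPTS }$arg"
    fi
  done
}

split_at_double_dash -develop -debug -mil -- \
  -DSTRATCHEM_REDUCED_MECHANISM=ON -DUSE_CODATA_2018_CONSTANTS=ON
echo "script options: $SCRIPT_OPTS"
echo "CMake options:  $CMAKE_OPTS"

# A script flag placed after "--" is handed to CMake rather than to the script.
split_at_double_dash -mil -- -debug
echo "CMake options:  $CMAKE_OPTS"
```

In the second call, -debug ends up among the CMake options, where it would not be understood as a request for a debug build.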

3.3.2 Manual Build - Multiple Steps for Building the Model

The steps detailed below are essentially those that parallel_build.csh performs for you. Either method should yield an identical executable.

NOTE: To build and run on SLES15 at NCCS, you will want to do all these steps either on discover-mil or on an interactive Milan node.

Mepo

The GEOS GCM is composed of a set of sub-repositories, which are managed by a tool called mepo. To clone all the sub-repos, you can run mepo clone inside the fixture:

cd GEOSgcm
mepo clone

This command initializes the multi-repository, then clones and assembles all the sub-repositories according to components.yaml.

Checking out develop branches of GEOSgcm_GridComp, GEOSgcm_App, GMAO_Shared, and GEOS_Util

To get the development branches of GEOSgcm_GridComp, GEOSgcm_App, GMAO_Shared, and GEOS_Util (a la the -develop flag for parallel_build.csh), run the equivalent mepo command. Because mepo knows (via components.yaml) what the development branch of each sub-repository is, you only need to name the components:

mepo develop GEOSgcm_GridComp GEOSgcm_App GMAO_Shared GEOS_Util

This must be done after mepo clone as it is running a git command in each sub-repository.

Load Compiler, MPI Stack, and Baselibs

On tcsh:

source @env/g5_modules

or on bash:

source @env/g5_modules.sh

Create Build Directory

We currently do not allow in-source builds of GEOSgcm. So we must make a directory:

mkdir build

The advantage of this is that you can build both a Debug and a Release version from the same clone if desired.

Run CMake

CMake generates the Makefiles needed to build the model.

cd build
cmake .. -DBASEDIR=$BASEDIR/Linux -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_INSTALL_PREFIX=../install

This will install to the directory install/, which sits alongside your build/ directory. If you prefer to install elsewhere, change the path in:

-DCMAKE_INSTALL_PREFIX=<path>

and CMake will install there.

Debug Version of GEOS GCM

To obtain a debug version, you should add:

-DCMAKE_BUILD_TYPE=Debug

which will build with debugging flags.

Create and install source tarfile

Note that running with parallel_build.csh will create and install a tarfile of the source code at build time. But if CMake is run by hand, this is not the default action (as many who build with CMake by hand are developers and not often running experiments). In order to enable this at install time, add:

-DINSTALL_SOURCE_TARFILE=ON

to your CMake command.

Build and Install with Make

make -jN install

where N is the number of parallel processes. On discover head nodes, N should be no higher than 2 due to limits on the head nodes. On a compute node, you can set N as high as you like, though 8-12 is about the limit of parallelism in our model's make system.
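
Put together, the manual steps of this section amount to the sequence below. The sketch defaults to a dry run that only prints each command, so it can be read or tried without a GEOS environment. The names `run`, `manual_build`, and `DRY_RUN` are invented here, the bash variant of g5_modules is assumed, and BASEDIR is assumed to be set once g5_modules has been sourced.

```shell
# Manual build sequence from this section, meant to be run from inside GEOSgcm/.
# DRY_RUN=1 (default) prints each step; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# run CMD ARGS... -> print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@" || { echo "step failed: $*" >&2; exit 1; }
  fi
}

# manual_build N -> clone sub-repositories, configure, build and install.
manual_build() {
  njobs="${1:-2}"                 # keep to 2 on discover head nodes
  run mepo clone
  run source @env/g5_modules.sh   # loads compiler, MPI stack and Baselibs
  run mkdir build
  run cd build
  # BASEDIR is assumed to come from g5_modules; "UNSET" shows up only in a dry run.
  run cmake .. -DBASEDIR="${BASEDIR:-UNSET}/Linux" \
      -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_INSTALL_PREFIX=../install
  run make -j"$njobs" install
}

manual_build 2
```

Setting `DRY_RUN=0` before running the script executes the same steps for real, which only makes sense on a system where mepo and the GEOS modules are available.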

For more information about getting interactive nodes see:

3.4 Running the Model

Once the model has built successfully, you will have an install/ directory in your checkout. To run gcm_setup go to the install/bin/ directory and run it there:

cd install/bin
./gcm_setup

You can find more information to run the model in either atmosphere/data ocean mode (aka AMIP):

or coupled atmosphere/ocean mode:
