From 2fdfa8ca1bff34bd85a361493230e26fce6f3697 Mon Sep 17 00:00:00 2001
From: Robert McLay
Date: Mon, 17 Jun 2024 10:13:35 -0600
Subject: [PATCH] add FAQ about HPE/Cray and collections and why

---
 docs/source/040_FAQ.rst | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/docs/source/040_FAQ.rst b/docs/source/040_FAQ.rst
index 931d6b75c..99db330e0 100644
--- a/docs/source/040_FAQ.rst
+++ b/docs/source/040_FAQ.rst
@@ -409,4 +409,19 @@ Why don't collections work on HPE/Cray systems?
     that the module load() inside a modulefile are ignored for very
     complicated reasons. Instead Lmod loads all modules listed in the
     collection. This works well except when the list of modules is
-    different.
+    different on different nodes.
+
+    The reason why Lmod loads the list of modules in the collections
+    and ignores load() type functions in the modulefiles is complex.
+    The problem is when two or more modulefiles share the same
+    environment variable. Suppose that your site sets the variable
+    **MPI_HOME** (using *setenv()*) in each mpi modulefile. If Lmod
+    obeyed the load() function in each modulefile, then it would have
+    to delete the extra modules not in the collection. In the case
+    where the user switched mpi modules, Lmod would load both mpi
+    modules then delete one to match the list of names in the
+    collection. Unloading the second module would unset **MPI_HOME**
+    and leave this variable with no value. If a site depended on
+    **MPI_HOME** as part of mpi program startup script, then those
+    users would not be able to submit mpi programs.
+