A compilation of compiling issues #225
Replies: 8 comments
-
PR #229 may resolve the first two issues.
-
Issue #244 detected several build issues:
The first issue is hard to characterize but the second should be resolvable.
-
This commit will address the potential absence of setjmp-like POSIX calls, but it will conflict with a similar patch in the NCAR repo, so we will resolve this after a future NCAR merge to main.
-
#326 addresses the Python executable name issue.
-
A setting in I saw this when trying to get GPU compilation working with Nvidia compilers.
-
MPI detection needs to be overhauled. The autoconf archive macro for MPI is either not able to catch many issues, or we are using it incorrectly. I currently lack details; all I can say at the moment is that tests are "passing" when MPI is unavailable. This is most acute in MacOS environments where MPI is not installed. There are similar issues requiring too much manual maintenance of FC and MPIFC, which autoconf ought to resolve on its own. This whole area needs to be more rigorously investigated and fixed.
-
If GCC is not installed on a MacOS system, then it leans into the BSD defaults. The issue is that the standard autoconf macros will often default to these when everything is missing. Because of the partial support, the resulting error messages are often confusing.
-
This is a running list of known issues reported by users attempting to build MOM6 with autoconf. (Checked boxes have been investigated and resolved, to the best of our ability.)

netcdf and netcdff in different directories
If netcdf and netcdf-fortran are installed in different directories, then some users report that either one or the other is not found. This could be an incorrect usage of `n[cf]-config` in our configure script. Needs more testing.

Use all `-L` flags from `nf-config`
This may be the actual cause of the issue above. In any case, `configure.ac` needs to be modified to gather more than the first argument.

Python dependency
Moving from `mkmf` to `makedep` means that Python is now a build dependency, which we do not test. This is a notable issue on Windows Subsystem for Linux (WSL), whose system Python is `python3`.

Unicode support in `makedep`
Certain versions of Python cannot handle Unicode characters in `makedep`. We have a few Unicode characters in the source (Unicode characters in source? #245) which need to be considered. We can either forbid Unicode in the source, or modify `makedep` to support it.

HDF5 detection
We use `n[cf]-config` to set `-L` flags in `LDFLAGS`, but rely on autoconf's `AC_CHECK_LIB` to append to `LIBS`. This is not sufficient on systems which require explicit links to HDF5 libraries.
We may be able to resolve this by also extracting `-l` flags from `n[cf]-config`.

`AR`/`LD` configuration
We rely on `PATH` to find `ar` and `ld`, needed to build and link `libfms.a`. This was a problem on MacOS environments with Conda and Homebrew installed, including M1 systems. Conda installs its own `ar` and `ld` tools while Homebrew uses system tools. There is also potential conflict with XCode tools.
The solution here may be to lean more heavily into `automake` and `libtool`, which we have so far been reluctant to fully embrace.

Resolving missing `setjmp`, `longjmp`, `sigsetjmp` symbols
There are some very old systems still in use whose C libraries did not have symbols for `longjmp`, causing compilation to abort. We take steps to handle a missing `siglongjmp` but not the other three, nor any other POSIX system calls.
The process would presumably be similar, but I would like to identify systems where these are missing before addressing the issue.
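
For the `n[cf]-config` items above (gathering every `-L` flag, and possibly the `-l` flags too), here is a rough Python sketch of the parsing step. It is not how `configure.ac` does it today, and the shell/m4 version would look different; the function name `split_link_flags` and the sample string are made up for illustration, and the sample only imitates what a config tool might print on a system with netcdf, netcdf-fortran and HDF5 in separate prefixes.

```python
import shlex


def split_link_flags(config_output):
    """Split config-tool output into (-L flags, -l flags, everything else).

    Every -L entry is kept (first occurrence only, original order).
    The -l entries are kept as-is, since link order can matter.
    """
    lib_dirs, libs, other = [], [], []
    tokens = shlex.split(config_output)
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        # Accept the detached forms "-L dir" and "-l name" as well.
        if tok in ("-L", "-l") and i + 1 < len(tokens):
            tok += tokens[i + 1]
            i += 1
        if tok.startswith("-L"):
            if tok not in lib_dirs:
                lib_dirs.append(tok)
        elif tok.startswith("-l"):
            libs.append(tok)
        else:
            other.append(tok)
        i += 1
    return lib_dirs, libs, other


# Hypothetical output, not captured from a real system.
sample = (
    "-L/opt/netcdf-fortran/lib -lnetcdff "
    "-L/opt/netcdf/lib -lnetcdf "
    "-L/opt/hdf5/lib -lhdf5_hl -lhdf5 -lm"
)

lib_dirs, libs, other = split_link_flags(sample)
print("LDFLAGS additions:", " ".join(lib_dirs))
print("LIBS additions:   ", " ".join(libs))
```

The point is only that all three `-L` entries survive, rather than just the first one, and that the HDF5 `-l` entries are available if a system needs them linked explicitly.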

Unresolved symbolic links in source
`makedep` does not appear to exit gracefully when encountering an unresolved symbolic link. There also appears to be some issue in the generated `Makefile`, which tries to run `makedep`, even when running `make clean`.
This is going to be a common problem, happening every time one does `git clone` and forgets the submodules.

Poor resolution of `gettid` inside of Conda environments
There is some issue with linking to `gettid` when compiling inside of a Conda environment.
I believe that Conda has its own `ld`, and maybe even its own libc, which either does or does not get used consistently with the rest of the compilation, causing this problem.
The current advice is to not use Conda when building MOM6. But ideally we should resolve all these little issues inside of autoconf.
(Also, this is likely strongly linked to the `AR`/`LD` issue raised above.)

Outdated git: Lack of `-C` directory support
If one's git is out of date (v1.8 or older?) then it won't support the in-place directory switch `-C` flag.
The bigger issue here is if one starts the build without `-C`, swaps their git version, and then resumes the build. I have observed that the FMS branch will not have been swapped to `FMS_COMMIT`, and unexpected problems will arise.
Currently, this is often an error due to poor FMS2 support. But it could be a bigger problem when it silently works and is not what the user expects.
I don't know the solution here, but it deserves some thought. Either test for `git` before starting, or just don't use `-C`.

Shared linker flags across `FC` and `CC` when compilers are used as linkers
If one is using compilers as linkers, and adds a flag to `LDFLAGS` meant for one compiler, then things can go very wrong if that flag is passed to a different compiler.

I am gathering feedback from the recent MOM6 workshop, and I will continue to update this list as more issues are reported and/or resolved.
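
On the two `makedep` items above (unresolved symbolic links and Unicode in source), a minimal sketch of the checks involved might look like the following. This is not the current `makedep` code; `find_broken_links` and `read_source` are hypothetical helper names, and the Unicode part assumes the failures come from the interpreter falling back to a non-UTF-8 default encoding.

```python
import os


def find_broken_links(root):
    """Return paths under root that are symbolic links with missing targets."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it,
            # so a dangling link is a link whose target does not exist.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


def read_source(path):
    """Read a source file as UTF-8 regardless of the locale default.

    Bytes that are not valid UTF-8 are replaced instead of raising.
    """
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.read()
```

With something like this, a dependency scanner could report the dangling links by name (for example, a missing submodule) and stop with a clear message instead of a traceback.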
Too often these issues get resolved in some unconventional way, and we lose the chance to troubleshoot and fix them. We need systems which can reliably reproduce these errors.
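
On the outdated git item: one way to "test for `git` before starting" is to compare the reported version against the release that introduced `-C`. The sketch below assumes that release is Git 1.8.5 (the post above only says "v1.8 or older?", so treat the threshold as an assumption to verify), and the function names are made up for illustration.

```python
import re

# Assumption: `git -C <dir>` first appeared in Git 1.8.5.
MIN_GIT_FOR_DASH_C = (1, 8, 5)


def parse_git_version(text):
    """Extract (major, minor, patch) from `git --version` output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    # A missing patch component is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def supports_dash_c(version_text):
    """True if the reported git version is new enough for `git -C`."""
    return parse_git_version(version_text) >= MIN_GIT_FOR_DASH_C


for reported in ("git version 1.8.3.1", "git version 2.39.2"):
    print(reported, "->", supports_dash_c(reported))
```

In practice the input would come from running `git --version` once at the start of the build, so that a too-old git is reported up front rather than partway through the FMS checkout.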