Default Hypre MPI comm size (num_procs) #1210
Comments
Because MPI_Comm_spawn() behaves unstably on the test machine (dual-xeon-16core), I switched to a hybrid OpenMP+MPI configuration and tested the Hypre IJ linear algebra interface further. With OpenMP limited to one thread, 4 cores are still actively running during the Hypre process (this is the original question: how can that be changed?). Here is the interesting behavior (a bug?): when OpenMP is set to more than 8 threads, the result is pretty bad (or wrong, as if errTol were set too high), using the default BoomerAMG method. With 8 threads or fewer, the result is correct. Possibly, when the total number of OpenMP threads exceeds the number of cores (32), some wait state or OMP calls stop working properly. Can someone comment on this behavior? The goal is to control/change the actual MPI process count. Thanks!
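The oversubscription guess above can be checked with simple arithmetic: 4 busy cores times 8 threads each is exactly 32, the machine's core count. Below is a minimal sketch in C, assuming the 4 active cores act as independent workers that each start their own OpenMP team (an assumption, not something confirmed for Hypre). `max_threads_per_worker` and `is_oversubscribed` are hypothetical helpers, not Hypre, MPI, or OpenMP calls.

```c
/* Hypothetical helper (not part of Hypre, MPI, or OpenMP):
 * largest per-worker thread count that keeps
 * workers * threads <= total_cores. Returns 1 as a floor when the
 * inputs are invalid or there are more workers than cores. */
int max_threads_per_worker(int total_cores, int workers)
{
    if (total_cores <= 0 || workers <= 0) {
        return 1;
    }
    int budget = total_cores / workers;   /* integer division rounds down */
    return budget > 0 ? budget : 1;
}

/* Nonzero when the requested configuration needs more threads than cores. */
int is_oversubscribed(int total_cores, int workers, int threads_per_worker)
{
    return (long)workers * threads_per_worker > (long)total_cores;
}
```

For the machine in this thread, `max_threads_per_worker(32, 4)` gives 8, and `is_oversubscribed(32, 4, 9)` is nonzero while `is_oversubscribed(32, 4, 8)` is zero, which lines up with the reported threshold of 8 threads.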
@iaae could you provide instructions to reproduce the behavior you described?
Thanks for the quick reply. Unfortunately, Hypre is being tested inside a somewhat larger code on a Windows platform, alongside several other solvers, for benchmark/evaluation purposes... for now. I plan to try different MPI/OpenMP implementations (Intel, MS, etc.) to confirm the behavior. If confirmed, I will try to reproduce it using Ex5 (the only example with the IJ interface). It would help if anyone could confirm or comment on the behavior of running Ex5 on Win11 using either MS_MPI or Intel I_MPI. I_MPI does not allow you to specify a world size of more than 1 on a single-CPU Win platform, but MS_MPI does. In both cases, with the default MPI_Comm_size of 1, Hypre runs on 4 cores (on a 32-core machine), as can be observed on the CPU/core performance plot.
@iaae This behavior is likely related to the MPI or system environment, not Hypre itself. MS_MPI may be spawning threads or oversubscribing by default, while Intel I_MPI might handle threading differently. Try setting OMP_NUM_THREADS=1 explicitly before running to restrict threads. If possible, testing on Linux might confirm whether this is platform-specific.
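One way to verify that OMP_NUM_THREADS=1 actually reaches the process is to log the value at startup. Below is a minimal sketch in C, assuming the value is read with the standard `getenv`. `parse_thread_count` is a hypothetical helper, and only the first entry of a comma-separated list is considered.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse the text of OMP_NUM_THREADS.
 * Returns the leading positive integer, or 0 when the value is
 * missing or malformed (the OpenMP runtime default would then apply). */
int parse_thread_count(const char *value)
{
    if (value == NULL || *value == '\0') {
        return 0;
    }
    char *end = NULL;
    errno = 0;
    long n = strtol(value, &end, 10);
    if (end == value || errno == ERANGE || n <= 0 || n > INT_MAX) {
        return 0;
    }
    /* Accept end of string, or a comma that starts a nested-level list. */
    if (*end != '\0' && *end != ',') {
        return 0;
    }
    return (int)n;
}
```

Typical use would be `parse_thread_count(getenv("OMP_NUM_THREADS"))` near the start of `main`. A return value of 0 means the variable was unset or could not be parsed, so the thread count was left to the runtime.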
I tested the Hypre solver via the IJ linear algebra interface and am in the middle of fine-tuning it for a larger-scale test.
When Hypre is driven with an MPI world comm size of one, it still seems to use 4 processors/threads (on a 32-core dual-CPU machine). In practice, the number of processors should be controllable at runtime, and there are a few ways to do that from the PDE solver side (e.g. spawn a Hypre MPI run, create a new comm, etc.).
Before I change the code to do this, is there a quick way to change Hypre's default of 4 processors/threads? There does not seem to be any API for that.
Thanks!
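On the "create a new comm" option: Hypre's IJ objects take the communicator as an argument (`HYPRE_IJMatrixCreate(comm, ...)`), so a sub-communicator limits how many ranks take part in the solve. Below is a minimal sketch of the rank selection, kept free of MPI headers so it compiles on its own. `solver_color` and `SKETCH_UNDEFINED` are hypothetical names; in real code the returned value would be passed as the color to `MPI_Comm_split`, with `MPI_UNDEFINED` in place of the placeholder.

```c
/* Placeholder for MPI_UNDEFINED so the sketch builds without <mpi.h>.
 * Assumption: real code would use MPI_UNDEFINED, whose value is
 * implementation-defined. */
#define SKETCH_UNDEFINED (-1)

/* Hypothetical helper: the first n_solver_ranks world ranks get color 0
 * (they would join the solver communicator); all other ranks get the
 * undefined color and would receive MPI_COMM_NULL from MPI_Comm_split. */
int solver_color(int world_rank, int n_solver_ranks)
{
    if (world_rank >= 0 && world_rank < n_solver_ranks) {
        return 0;
    }
    return SKETCH_UNDEFINED;
}

/* Intended use (not compiled here):
 *   int color = solver_color(world_rank, n_solver_ranks);
 *   MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &solver_comm);
 *   if (solver_comm != MPI_COMM_NULL) {
 *       HYPRE_IJMatrixCreate(solver_comm, ...);
 *   }
 */
```

This only controls how many MPI ranks participate. It would not by itself explain or change the 4 busy cores seen with a single rank, which is more likely a threading matter.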