Monstrous computational resources required to run sfc_climo_gen #974
sfc_climo_gen appears to require at least 30 MPI processes, and its memory footprint is at least 150 GB. The substantial memory usage may be because each MPI process holds a copy of the external file in memory. This high demand for MPI processes and memory makes running UFS_UTILS on non-HPC systems nearly impossible. Is it possible to improve the computational efficiency of sfc_climo_gen?
P.S. chgres_cube, whose code appears similar to sfc_climo_gen, can run with just 6 MPI processes.

Comments
Which external file is being held in memory on each MPI task?
I was talking about the climatological datasets located in fix/sfc_climo. For example, snowfree_albedo.4comp.0.05.nc is 4.7 GB. Multiplied by the 30 MPI tasks, the total would be approximately 150 GB.
The climo datasets are read in on one MPI task, then a subsection is scattered to all tasks. The array that holds the climo data is allocated only on task '0'. The climo data is then read in on task '0', chopped up, and scattered to all tasks (UFS_UTILS/sorc/sfc_climo_gen.fd/interp.F90, line 108 at 47705d5).
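In outline, the pattern is the one sketched below. This is a minimal, self-contained toy that uses plain MPI rather than the ESMF calls the real interp.F90 relies on; the grid size and variable names are illustrative only, not taken from the actual code:

```fortran
! Toy illustration of the read-on-root-then-scatter pattern (plain MPI,
! hypothetical names) -- not the actual sfc_climo_gen code.
program scatter_demo
  use mpi
  implicit none
  integer, parameter :: nx = 360, ny = 180      ! stand-in "global" grid
  integer :: ierr, rank, ntasks, nrows
  real, allocatable :: full(:,:), piece(:,:)

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, rank, ierr)
  call mpi_comm_size(mpi_comm_world, ntasks, ierr)
  nrows = ny / ntasks                           ! assumes ny is divisible by ntasks

  if (rank == 0) then
     allocate(full(nx, ny))                     ! full field exists only on task 0
     full = 1.0                                 ! stand-in for reading the netCDF record
  else
     allocate(full(0, 0))                       ! dummy allocation on the other tasks
  end if
  allocate(piece(nx, nrows))                    ! each task holds only its own slice

  ! Chop the global array into equal contiguous chunks and hand one to each task.
  call mpi_scatter(full, nx*nrows, mpi_real, piece, nx*nrows, mpi_real, &
                   0, mpi_comm_world, ierr)

  print '(a,i0,a,f4.1)', 'task ', rank, ' got its slice; first value = ', piece(1,1)

  call mpi_finalize(ierr)
end program scatter_demo
```

The point is that only task 0 ever holds a full source field; every other task only allocates its own slice.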
You are right. My initial speculation about the substantial memory usage was wrong. I profiled memory usage for a global C48 grid with different numbers of MPI tasks, ranging from 30 to 60. The memory usage remained around 175 GB regardless of the number of MPI tasks. Therefore, the memory issue must be caused by something else.
How are you configuring the run? Can I see the fort.41 namelist?
&config
I see you are using the 30-sec soil and vegetation type datasets. They are quite large. There are lower-res versions of the soil and veg data. Can you use those?
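To make the suggestion concrete, a change along these lines in fort.41 is what is meant. This is a hypothetical fragment: the entry names and file names are illustrative guesses at the sfc_climo_gen namelist and should be checked against the documentation and the contents of fix/sfc_climo for your UFS_UTILS tag:

```fortran
! Hypothetical fort.41 fragment -- entry and file names are illustrative only.
&config
  ! Swap the 30-second soil/vegetation inputs for the coarser files;
  ! leave the remaining input_* entries unchanged.
  input_soil_type_file       = "/path/to/fix/sfc_climo/soil_type.statsgo.0.05.nc"
  input_vegetation_type_file = "/path/to/fix/sfc_climo/vegetation_type.viirs.igbp.0.05.nc"
/
```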
With the lower-res data, the memory usage decreases to 33 GB, which is somewhat manageable for non-HPC systems. However, the 30-sec datasets should not have such a large memory footprint. Assuming a single-precision floating-point variable, the array storing an entire 30-sec dataset should be only about 3.5 GB (21600 * 43200 * 4 bytes). The overhead of sfc_climo_gen seems excessively high.
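That back-of-the-envelope number is easy to confirm; the throwaway Fortran below (not part of UFS_UTILS) simply reproduces the arithmetic for one single-precision copy of a 43200 x 21600 field:

```fortran
program footprint
  ! Back-of-the-envelope check of the figure quoted above.
  implicit none
  integer(8), parameter :: nx = 43200, ny = 21600   ! 30-arc-second global grid
  real :: gib
  gib = real(nx * ny * 4_8) / 1024.0**3             ! 4 bytes per single-precision value
  print '(a,f5.2,a)', 'one copy of the 30-sec field: ', gib, ' GiB'   ! prints ~3.48
end program footprint
```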
I would guess the ESMF regridding is using a lot of memory. I can contact the ESMF team and provide them with your test case. They may have suggestions to reduce the memory requirements.
Sounds good. Thank you for working on this. I also found that when using the lower-res soil and veg data, sfc_climo_gen can run with just 6 MPI tasks. It appears that higher-resolution datasets require more MPI tasks, which could be another limitation related to ESMF.