0.25 degree configuration with a different MOM6 parameterization compared to #101 #135
Comments
Thanks @minghangli-uni for documenting those details. Is there a branch or draft PR you can link to with these changes? |
Have you tried running with DT_THERM > 2700? Seems like it could be a good way to improve performance, especially when we start running with BGC. Regarding the comment above,
There is a follow on from this in MOM_input saying "unless THERMO_SPANS_COUPLING is true, in which case DT_THERM can be an integer multiple of the coupling timestep". So I think it's fine to have DT_THERM > dt_cpld. e.g. GFDL OM4_025 uses:
Have we tested the performance using similar timesteps to that? Could always reduce dt_cpld to 1800 if that's a worry. |
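For reference, the rule quoted above from the MOM_input comments can be sketched as a small check. This is only an illustration of how that comment reads, not anything MOM6 itself runs: the function name `check_timesteps` and its arguments are hypothetical, timesteps are assumed to be whole seconds, and "less than the coupling timestep" is treated as "not greater than".

```python
def check_timesteps(dt, dt_therm, dt_cpld, thermo_spans_coupling):
    """Return a list of problems with a timestep choice (empty if none).

    Hypothetical helper; all timesteps are integers in seconds.
    """
    problems = []
    # DT_THERM should ideally be an integer multiple of the dynamics step DT.
    if dt_therm % dt != 0:
        problems.append("DT_THERM is not an integer multiple of DT")
    # DT_THERM may exceed the coupling step only with THERMO_SPANS_COUPLING,
    # and then it should be an integer multiple of the coupling step.
    if dt_therm > dt_cpld:
        if not thermo_spans_coupling:
            problems.append("DT_THERM > dt_cpld needs THERMO_SPANS_COUPLING = True")
        elif dt_therm % dt_cpld != 0:
            problems.append("DT_THERM is not an integer multiple of dt_cpld")
    return problems


# The combination discussed in this issue: DT = dt_cpld = 1350s, DT_THERM = 2700s.
print(check_timesteps(1350, 2700, 1350, thermo_spans_coupling=True))
```

Under these assumptions, the 1350s/2700s/1350s combination passes only when THERMO_SPANS_COUPLING is True.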
The branch is linked here, https://github.com/COSIMA/MOM6-CICE6/tree/025deg_jra55do_ryf_iss135 |
@adele-morrison, were your issues with DT_THERM related to the open boundaries? |
Yes, we only had a problem with DT_THERM in regional cases. I think large DT_THERM in global should be fine.
… On 10 Apr 2024, at 10:25 am, Dougie Squire ***@***.***> wrote:
@adele-morrison, were your issues with DT_THERM related to the open boundaries?
|
I haven't tried, but I am planning to run tests with DT_THERM 4 and 8 times greater than DT.
I've tried multiple tests. I observed that regardless of changes in other timesteps or the value of
|
@minghangli-uni, can you easily test the timing and compare the outputs of runs with longer tracer timesteps? E.g. |
That won't be a problem, I think. Because the current config is limited in the number of CPU cores (e.g. 288), it will take around 14 hours to produce one model year of results. |
Are you saying that you think the model won't run any faster, or that you will do the runs to check (or something else)? |
I am currently investigating the core cap issue. Additionally, I plan to examine whether increasing DT_THERM will impact physical fields. Based on these findings, we will determine the extent to which we can achieve a speedup. |
What is this? |
See the above. |
@minghangli-uni some of the changes in the branch you've linked to this issue overlap with changes that are being made in ACCESS-NRI/access-om3-configs#48 (i.e. the changes to I think your other changes either require testing and/or require the same change across many repos. These are best handled one at a time. I suggest:
|
@dougiesquire This is a good point. Will follow your suggestion and implement changes accordingly. |
Are you planning to use @micaeljtoliveira's profiling tools for the test runs? |
I will first work out the concurrent run and do test runs with more CPU cores for MOM. This will reduce the turnaround time. Then I will use the profiling tools (https://github.com/COSIMA/om3-utils/tree/profiling) to fine-tune the process layout for the 025 deg configuration. |
@minghangli-uni can this be closed now or is there still a reason to keep it open? |
I am happy to close it now. Thanks @dougiesquire |
MOM_input
Most of the updates come from the discussions in namelist-discussion and from the OM2 technical report. Some major updates are highlighted below:
- USE_MEKE=False
- NK=50
- TIDES
- NUM_DIAG_COORDS=2 includes z_star and rho_2 (not sure if rho_2 is relevant for the current 0.25deg; see "MOM6 1deg configuration - No output generated for GMOM_JRA.mom6.h.rho2_*.nc", ACCESS-NRI/access-om3-configs#40 (comment))
- DT: 1350s
- DT_THERM: 1350s
  With THERMO_SPANS_COUPLING = True, the tracer timestep can be an integer multiple of DT. However, as mentioned in the comments within MOM_input, DT_THERM should be less than the coupling timestep, so we may think about increasing the coupling timestep beyond 1350s (a good question raised by @dougiesquire in ACCESS-NRI/access-om3-configs#48 (comment)).
- DT_THERM: 2700s can lead to a speedup of 20% for each model year. (The comparison is conducted with DIABATIC_FIRST = False.)
Ice initial condition
The ice initial condition is set to "default". Additionally, another experiment with the ice initial condition (following #50) from a 3-hour run of OM2 is running at the same time.
Other params
All the other parameters and namelists remain consistent with https://github.com/COSIMA/MOM6-CICE6/tree/025deg_jra55do_ryf_iss101 and the OM2 technical report.
For example, in ice_in, block_size_x = 30 and block_size_y = 27 are consistent with the OM2 technical report. max_blocks=8 is evaluated by this snippet,
GADI consumption:
- With OM2 at DT=1350s and OM3 with DT_THERM=DT=1350s, the service units required using OM2 and OM3 are approximately 8.39KSU and 11.2KSU, respectively. This indicates that the current OM3 requires about 33% more service units than OM2.
- With DT_THERM=2*DT=2700s, the service units required using OM3 (8.2KSU) become comparable to those of OM2 (8.39KSU).
Limitations
The current configuration runs sequentially and is restricted to a maximum of 288 CPU cores. If the number of CPU cores exceeds this limit, the model will hang without providing any useful information. I am still investigating this issue to determine the cause.
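As a rough cross-check of the GADI consumption figures, the arithmetic can be sketched as below. The KSU values are the ones quoted in this issue. The charge rate of 2 SU per core-hour and the use of the ~14 hours per model year mentioned in the comments are assumptions made for illustration only, and the helper names `ksu_per_run` and `percent_change` are hypothetical.

```python
SU_PER_CORE_HOUR = 2.0  # assumption: charge rate of the queue used on Gadi


def ksu_per_run(ncores, walltime_hours, rate=SU_PER_CORE_HOUR):
    """Service units (in KSU) for a run on ncores lasting walltime_hours."""
    return ncores * walltime_hours * rate / 1000.0


def percent_change(new, ref):
    """Relative change of new with respect to ref, in percent."""
    return 100.0 * (new - ref) / ref


# Figures quoted in this issue (KSU per model year).
om2_ksu = 8.39
om3_ksu_dt_therm_1350 = 11.2
om3_ksu_dt_therm_2700 = 8.2

# Extra cost of OM3 relative to OM2 when DT_THERM = DT = 1350s.
print(round(percent_change(om3_ksu_dt_therm_1350, om2_ksu), 1))
# Change in OM3 cost when DT_THERM is doubled to 2700s.
print(round(percent_change(om3_ksu_dt_therm_2700, om3_ksu_dt_therm_1350), 1))
# Cost implied by 288 cores for ~14 hours at the assumed charge rate.
print(ksu_per_run(288, 14))
```

If the assumed charge rate and walltime hold, the 288-core limit implies a cost per model year of the same order as the OM3 figure quoted for DT_THERM=2700s, but this has not been checked against the actual job logs.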
The branch is linked here, https://github.com/COSIMA/MOM6-CICE6/tree/025deg_jra55do_ryf_iss135