Update the automated max_blocks calculation #954

Merged: 7 commits into CICE-Consortium:main on May 17, 2024

Conversation

apcraig
Contributor

@apcraig apcraig commented May 13, 2024

PR checklist

  • Short (1 sentence) summary of your PR:
    Update the automated max_blocks calculation

  • Developer(s):
    apcraig, anton

  • Suggest PR reviewers from list in the column to the right.

  • Please copy the PR test results link or provide a summary of testing completed below.
    Testing still underway. Expect full test suite on derecho with intel, gnu, cray to be bit-for-bit.

  • How much do the PR code changes differ from the unmodified code?

    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?

    • Yes
    • No
  • Does this PR update the Icepack submodule? If so, the Icepack submodule must point to a hash on Icepack's main branch.

    • Yes
    • No
  • Does this PR add any new test cases?

    • Yes, some test cases changed to leverage and test max_blocks=-1 implementation
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)

    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please document the changes in detail, including why the changes are made. This will become part of the PR commit log.

    Update support for max_blocks=-1. This update computes the blocks required on
    each MPI task and then sets that as max_blocks if max_blocks=-1 in namelist.
    This is done in ice_distribution and is a function of the decomposition among
    other things. Refactor the decomposition computation to defer usage of max_blocks
    and eliminate the blockIndex array. Update some indentation formatting in
    ice_distribution.F90.

    Modify cice.setup and cice_decomp.csh to set max_blocks=-1 unless it's explicitly
    defined by the cice.setup -p setting.

    Fix a bug in ice_gather_scatter related to zeroing out of the halo with the
    field_loc_noupdate setting. This zeroed out the blocks extra times, which
    caused no problems as long as max_blocks was the same value on all MPI tasks.
    With the new implementation of max_blocks=-1, max_blocks can differ across
    MPI tasks. An error was generated, and the implementation was fixed so each
    block on each task is now zeroed out exactly once.

    Update diagnostics related to max_blocks information. Write out the min and max
    max_blocks values across MPI tasks.

    Add extra allocation/deallocation checks in ice_distribution.F90 and add
    a function, ice_memusage_allocErr, to ice_memusage.F90 that checks the
    alloc/dealloc return code, writes an error message, and aborts. This
    function could be used in other parts of the code as well.

    Fix a bug in the io_binary restart output where each task was writing some
    output when it should have just been the master task.

    Update test cases; add a sectcart test case.

    Update documentation
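
The per-task calculation described above can be sketched roughly as follows. This is a hypothetical Python illustration, not the Fortran implementation; the round-robin assignment and all sizes here are assumptions for the example:

```python
# Hypothetical sketch (not CICE source) of the max_blocks=-1 idea: count the
# blocks a decomposition assigns to each MPI task, then use that per-task
# count as max_blocks instead of a single user-supplied value.

def blocks_per_task(nx_global, ny_global, block_size_x, block_size_y, num_tasks):
    """Round-robin block distribution; returns blocks owned by each task."""
    # Ceil-divide the global grid into blocks.
    nblocks_x = -(-nx_global // block_size_x)
    nblocks_y = -(-ny_global // block_size_y)
    total_blocks = nblocks_x * nblocks_y
    counts = [0] * num_tasks
    for block_id in range(total_blocks):
        counts[block_id % num_tasks] += 1   # round-robin assignment
    return counts

# With max_blocks=-1, each task would allocate exactly counts[task] blocks,
# so tasks can end up with different max_blocks values.
counts = blocks_per_task(nx_global=100, ny_global=80, block_size_x=25,
                         block_size_y=20, num_tasks=3)
print(counts)  # [6, 5, 5] -- 16 blocks over 3 tasks
```

The real computation depends on the chosen distribution, but the key point is the same: the per-task block count is known before memory is allocated.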

@apcraig
Contributor Author

apcraig commented May 13, 2024

I'm still testing, refining the test suite, and updating documentation, but this should represent the code changes I'm proposing. Things are running well. The max_blocks=-1 setting now computes the maximum required blocks on each task and sets the internal max_blocks variable to that value. That means it uses exactly the amount of memory required, and max_blocks can vary per task. Users can still manually set max_blocks in the namelist as before.

@apcraig
Contributor Author

apcraig commented May 14, 2024

Testing results look good. https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#7402dc7f04f98d840890f29f8f02a59f956a8fc2.

This is ready for review and merge.

@apcraig apcraig marked this pull request as ready for review May 14, 2024 15:38
@apcraig
Contributor Author

apcraig commented May 15, 2024

Could someone do a review on this PR? Would love to get this merged. Then I can start comprehensively testing in preparation for a release. Thanks!

@dabail10
Contributor

There is a lot here, so I might have missed something. I'm not going to get a chance to test this out until later (after the workshop). I will approve, but just know I might find stuff later once I have tested.

@eclare108213
Contributor

@anton-seaice do you have time to look at this? It's probably after hours there...

Contributor

@anton-seaice anton-seaice left a comment


This looks good Tony.

It looks like there is coverage in the tests for both setting max_blocks automatically and from the namelist, and the tests are passing.

It looks like the test cases might be using slightly less memory when max_blocks is set automatically, but the resolution is probably too low to be significant.

@@ -227,8 +228,7 @@ but the user can overwrite the defaults by manually changing the values in
information to the log file, and if the block size or max blocks is
inconsistent with the task and thread size, the model will abort. The
Contributor


Suggested change
inconsistent with the task and thread size, the model will abort. The
inconsistent with the task and thread size, the model will abort. If ``max_blocks``=-1, the model will calculate the number of blocks needed for each task. ``max_blocks`` can also be set by the user, although this may use extra memory. The

Contributor Author


I have updated the documentation here.

and chooses a block size, ``block_size_x`` :math:`\times`\ ``block_size_y``,
and decomposition information ``distribution_type``, ``processor_shape``,
and ``distribution_wgt`` in **ice_in**. ``max_blocks`` is computed
automatically if set to a value of -1, but it can also be set by the user.
Contributor


Suggested change
automatically if set to a value of -1, but it can also be set by the user.

I think this sentence should be at the end of the paragraph. If I understand correctly, max_blocks does not affect how the blocks are distributed; that depends on all the other parameters.

Contributor

@anton-seaice anton-seaice May 16, 2024


I still think this paragraph is out of order:

The user sets the NTASKS and NTHRDS settings in cice.settings and chooses a block size, block_size_x block_size_y, and decomposition information distribution_type, processor_shape, and distribution_wgt in ice_in. If max_blocks=-1, the model will calculate the number of blocks needed for each task. max_blocks can also be set by the user, although this may use extra memory and the model will abort if max_blocks is set too small for the decomposition. This information is used to determine how the blocks are distributed across the processors, and how the processors are distributed across the grid domain.

"This information" should refer to the information in the first sentence, but at the moment it reads as if "This information" refers to "max_blocks".

Contributor Author


You're right, thanks for checking again. I have refactored that paragraph a bit. I think it's better now. Let me know if you have any concerns.

Contributor


Looks good. Thank you!

character(*),parameter :: subname = '(ice_memusage_allocErr)'

ice_memusage_allocErr = .false.
if (istat > 0) then
Contributor


Does the value of istat have a meaning? Should we return istat or interpret the error in some way?

Contributor Author


I had a look at some documentation. I am changing this to istat /= 0 to be more correct. I couldn't find any info about return codes other than that 0 is success.
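
The pattern being discussed (a nonzero stat from allocate/deallocate signals failure, so the test should be istat /= 0 rather than istat > 0) looks roughly like this Python analog. The function name and messages are hypothetical illustrations, not the Fortran ice_memusage_allocErr itself:

```python
# Illustrative analog of the alloc/dealloc status-check pattern. In Fortran,
# allocate(..., stat=istat) sets istat to 0 on success; any nonzero value
# indicates an error, hence the istat /= 0 test.

def alloc_err(istat, errstr):
    """Mimic the logical-function pattern: report and return True on failure."""
    if istat != 0:                    # any nonzero stat is an error
        print(f"ERROR: alloc/dealloc failed ({errstr}), stat = {istat}")
        return True                   # caller would then abort
    return False

assert alloc_err(0, "work array") is False   # stat == 0: success
assert alloc_err(-2, "work array") is True   # nonzero (even negative) is an error
```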

!===============================================================================
! check memory alloc/dealloc return code

logical function ice_memusage_allocErr(istat, errstr)
Contributor


Thanks for adding this ... unhandled errors spook me :)

! set/check max_blocks
if (lmax_blocks_calc) then
if (max_blocks < 0) then
max_blocks = newDistrb%numLocalBlocks
Contributor


Is calling this variable max_blocks confusing? Isn't it just the number of blocks for this task, where max_blocks means the maximum number across all the tasks?

Contributor Author


max_blocks is used throughout the code as the number of blocks to allocate on each task. It used to be the same value on all tasks, set by the user. Now it's computed internally and set uniquely for each task. Setting max_blocks in line 683 is doing exactly that. Because the rake decomp calls the cartesian decomp, I needed to add an option where I could turn off the setting of max_blocks in cart because I calculate it in rake. But if cart is called directly, then max_blocks is set. Every decomp option sets the max_blocks variable.

Contributor


I am confused here. I thought nblocks was meant to be the number of blocks on each task and this could be different. The parameter max_blocks should be the maximum number of blocks across all tasks, no?

Contributor Author


@dabail10, you are correct. In general, nblocks is the number of active blocks on each task. max_blocks is the number of blocks that are allocated on each task. Historically, max_blocks was used to allocate memory, was the same value on all tasks, and was set by a CPP at build time. While nblocks was used to loop over only active blocks at run time. The code still uses those two variables as they've always been defined.

But in the last few years, we moved to dynamic allocation of memory (moving max_blocks to namelist) and with the current PR, we are able to compute max_blocks internally on each task BEFORE we need to use it to allocate memory. So in that case, nblocks and max_blocks can overlap in function.

However, we still support user defined max_blocks (although we probably don't need to) which means we still want to differentiate max_blocks and nblocks.

Maybe the next step is to ignore the max_blocks namelist setting and always set it internally. Once we do that, we could, in theory, unify nblocks and max_blocks in the code to a single variable. But we're not quite there yet, and I think we could debate whether all that refactoring would be worth the effort. max_blocks and nblocks still play different roles (memory allocation vs active block count), and we're all pretty comfortable with that scheme.
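
The distinction explained above can be sketched like this (hypothetical Python, purely illustrative of the two roles, not CICE code): max_blocks sizes the per-task allocation, nblocks counts the active blocks actually looped over, they coincide under automatic sizing, and a user-set max_blocks can leave unused slots or abort if too small.

```python
# Hypothetical sketch of max_blocks (allocation size) vs nblocks (active
# block count) on a single task.

def task_storage(nblocks, max_blocks=-1):
    if max_blocks == -1:           # automatic: allocate exactly what's needed
        max_blocks = nblocks
    if max_blocks < nblocks:       # user value too small for the decomposition
        raise RuntimeError("max_blocks too small for this decomposition")
    storage = [None] * max_blocks  # memory allocated on this task
    return storage, nblocks        # run-time loops cover only nblocks entries

storage, active = task_storage(nblocks=5)                # automatic sizing
assert len(storage) == 5 and active == 5
storage, active = task_storage(nblocks=5, max_blocks=8)  # user-set, extra memory
assert len(storage) == 8 and active == 5
```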

Contributor


Having a "max_blocks" in the namelist and a "max_blocks" in the code with different meanings is confusing. I see that it may not be worth it to unify max_blocks and nblocks (there are ~1000 uses of max_blocks), so I guess we just run with it as is.

This comment in ice_domain_size is now wrong:

    max_blocks  , & ! max number of blocks per processor

and maybe could be "number of blocks memory is allocated for" or similar?

call abort_ice(subname//'ERROR: processors left with no blocks')
newDistrb%numLocalBlocks = newDistrb%blockCnt(my_task+1)
if (newDistrb%numLocalBlocks < 0) then
call abort_ice(subname//'ERROR: processors left with no blocks', &
Contributor


Why do we abort here, but not, say, when processors are left with no blocks in a cartesian distribution?

Contributor Author


I left the implementation as it was. Looks like rake is the only one that checks that there has to be a block on each task. I think I'll remove that constraint now. There is already a test that verifies the model runs fine with zero blocks on a task, so we've got that covered in the test suite. Good catch.

call abort_ice(subname//'ERROR: processors left with no blocks')
newDistrb%numLocalBlocks = newDistrb%blockCnt(my_task+1)
if (newDistrb%numLocalBlocks < 0) then
call abort_ice(subname//'ERROR: processors left with no blocks', &
Contributor


Suggested change
call abort_ice(subname//'ERROR: processors left with no blocks', &
call abort_ice(subname//'ERROR: tasks left with no blocks', &

I am not totally on top of threading, but should this be tasks?

Contributor Author


This code has been removed.

enddo
! set/check max_blocks
if (max_blocks < 0) then
max_blocks = newDistrb%numLocalBlocks
Contributor


It's kind of silly for roundrobin, but should we check that the processor has work (i.e. numLocalBlocks /= 0)?

Contributor Author


Having no work on a task is allowed. In my opinion, it's up to the user to properly tune the number of processors and decomposition.

@apcraig
Contributor Author

apcraig commented May 16, 2024

I have updated the PR based on feedback from @anton-seaice and am running a set of tests just to make sure nothing is broken. I will report results when the testing is done. Thanks @anton-seaice for the comments.

@apcraig
Contributor Author

apcraig commented May 16, 2024

I reran a portion of the test suite with the latest code changes and I think everything is OK. I'll merge once github actions passes and @anton-seaice is happy with the current implementation. Please let me know if anything else needs to be fixed. Thanks!

@anton-seaice
Contributor

There are just a couple of lines in ice_domain_size that are not totally consistent now:

    max_blocks  , & ! max number of blocks per processor

Could be updated

   !*** The model will inform the user of the correct
   !*** values for the parameter below.  A value higher than
   !*** necessary will not cause the code to fail, but will
   !*** allocate more memory than is necessary.  A value that
   !*** is too low will cause the code to exit.
   !*** A good initial guess is found using
   !*** max_blocks = (nx_global/block_size_x)*(ny_global/block_size_y)/
   !***               num_procs

Can probably be removed because it's covered in the docs?

@apcraig
Contributor Author

apcraig commented May 16, 2024

There are just a couple of lines in ice_domain_size that are not totally consistent now:

    max_blocks  , & ! max number of blocks per processor

Could be updated

   !*** The model will inform the user of the correct
   !*** values for the parameter below.  A value higher than
   !*** necessary will not cause the code to fail, but will
   !*** allocate more memory than is necessary.  A value that
   !*** is too low will cause the code to exit.
   !*** A good initial guess is found using
   !*** max_blocks = (nx_global/block_size_x)*(ny_global/block_size_y)/
   !***               num_procs

Can probably be removed because it's covered in the docs?

good catch, fixed these.
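
For reference, the initial-guess formula in the code comment quoted above works out like this. The grid, block, and task counts here are made-up example numbers, not values from the PR:

```python
# Worked example of the quoted initial guess:
#   max_blocks = (nx_global/block_size_x)*(ny_global/block_size_y)/num_procs
nx_global, ny_global = 360, 240          # example global grid (assumed)
block_size_x, block_size_y = 30, 40      # example block sizes (assumed)
num_procs = 12                           # example MPI task count (assumed)

total_blocks = (nx_global // block_size_x) * (ny_global // block_size_y)
guess = total_blocks // num_procs
print(total_blocks, guess)  # 72 blocks total, a guess of 6 per task
# Integer division under-counts when the dimensions do not divide evenly,
# which is one reason the model reports the correct value itself.
```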

@anton-seaice
Contributor

good catch, fixed these.

I think you still need to push the commit

Contributor

@anton-seaice anton-seaice left a comment


Thanks Tony!

@apcraig apcraig merged commit 969a76d into CICE-Consortium:main May 17, 2024
2 checks passed