First, what we use in operations now is this (note that we set exclhost, not excl):
#PBS -l place=vscatter:exclhost,select=15:ncpus=120:mpiprocs=120
Options for the place statement:

| Modifier | Meaning |
| --- | --- |
| free | Place job on any vnode(s) |
| pack | All chunks will be taken from one host |
| scatter | Only one chunk is taken from any host |
| vscatter | Only one chunk is taken from any vnode; each chunk must fit on a vnode |
| excl | Only this job uses the vnodes chosen |
| exclhost | The entire host is allocated to this job |
| shared | This job can share the vnodes chosen |
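To make the distinction concrete, here is an illustrative (non-operational) comparison of the two exclusivity modifiers, using the same chunk specification as above:

```
# excl: only the selected vnodes are exclusive to this job; other vnodes on
# the same physical host may still be assigned to other jobs.
#PBS -l place=vscatter:excl,select=15:ncpus=120:mpiprocs=120

# exclhost: every host contributing a vnode is reserved whole for this job,
# so the 8 unrequested cores on each 128-core node cannot go to another job.
#PBS -l place=vscatter:exclhost,select=15:ncpus=120:mpiprocs=120
```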
Second, we can create another tile layout for HYCOM with more than 1800 tasks. That would require creating another patch.input and changing about a half dozen other parm files (blkdat.input, ice_in). The scripts would also need modifying so they know which set of these files to use (based on NTASK).
We would also need another hycom executable, since it is compiled with NTASKS fixed at build time. NTASKS is NPX * NPY, which in the current case is 450 * 4 = 1800. See comp_ice.csh.
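As a rough sketch only (the NTASK/NPX/NPY names follow the issue text; the actual variable names and settings in comp_ice.csh may differ), the current decomposition and one hypothetical larger one would look like:

```csh
#!/bin/csh
# Current layout described above: 450 x 4 = 1800 MPI tasks.
set NPX = 450
set NPY = 4
@ NTASK = $NPX * $NPY     # 1800

# Hypothetical alternative filling 15 nodes x 128 cores = 1920 tasks.
# It would also need its own patch.input, blkdat.input, ice_in, and a
# rebuilt executable, as noted above.
set NPX = 480
set NPY = 4
@ NTASK = $NPX * $NPY     # 1920
```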
http://www2.spa.ncep.noaa.gov/bugzilla/show_bug.cgi?id=1297
Currently we are limited to running the forecast job with a maximum of 1800 cores due to CICE's hard-coded NTASK value of 1800.
This hard limit on scalability makes it difficult to improve the science (decrease the time step or run it faster) and/or to fully utilize resources.
For example:
Current reservation line for rtofs_global_forecast_step2:
#PBS -l place=vscatter:excl,select=15:ncpus=120:mpiprocs=120
This uses only 120 of the 128 available cores per node.
The code is not memory bound, so 120 cores (8 per node across the 15 nodes) sit idle in this case.
The following would be more efficient, but would require a different NTASK:
#PBS -l place=vscatter:excl,select=15:ncpus=128:mpiprocs=128
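For reference, the arithmetic behind the two request lines: 15 nodes × 120 ranks/node = 1800 MPI tasks, which matches the current NTASK, while 15 × 128 = 1920, so a new decomposition (and a new NTASK) would be needed to use the extra 120 cores.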