-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support larger mesh sizes in the NetCDF-C mesh converter #595
base: master
Are you sure you want to change the base?
Conversation
@changliao1025 pointed me to a mesh on Compy:
with
This turns out to be because of an
This product overflows the maximum signed int and becomes a (large) negative number. I am still testing the changes here but the fixes to that particular line |
The `nCells`, `nEdges` and `nVertices` dimensions now are of type `size_t` in the c++ code. Products of these dimensions with other dimensions (`maxEdges2`, `two`, `vertexDegree`, etc.) are computed as `size_t` variables and then used to allocate arrays to avoid `int` overflow. These changes are likely not sufficient to allow meshes with more than ~2 billion edges (or vertices or cells) because index arrays are still stored as `int` type.
ccfeab9
to
29ddc41
Compare
@xylar Thank you for identifying the bug and providing a fix for it. The large mesh I attempted to generate is an improved mesh workflow designed to consider many hydrologic features (land surface, river, lake with boundary, dam, cities, watershed boundary). Due to the land surface heterogeneity, these features require a relatively high-resolution mesh (~5km) to resolve. Therefore, the global mesh size is significantly increased. |
TestingI was able to run the mesh converter with this branch (I did a local build of the conda package):
|
@mark-petersen, @matthewhoffman, @erinethomas, I have chosen the 3 of you to review this as representatives of the MPAS components in E3SM. I don't believe this will have any unexpected results but I have not yet done any BFB tests yet. I will do that right away. |
I ran
I see:
|
@mark-petersen, @matthewhoffman and @erinethomas, if it helps with the review, I'm happy to build you an I'm also fine with a visual inspection of the code if you prefer, but I know you all are not necessarily C++ experts. |
A note on size_t for your info:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xylar , I reviewed based on inspection and this makes complete sense and is the solution I would have expected to the problem. I am comfortable with the testing you've reported and do not feel the need to test myself.
@mark-petersen and @erinethomas, a reminder that I'd appreciate your review on this or if you would suggest an alternative reviewer. |
@mark-petersen and @erinethomas, I made a test conda environment with this
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve this PR based on the following testing:
I loaded three different conda environments on chrysalis with unique versions of MPAS_Tools (the 'default' MPAS_tools from e3sm unified environment, 'Xylars' environment listed above, and my own local built conda environment with this branch).
I then ran MpasMeshConverter.x on one of my own wave mesh files, using each of the above conda environments.
I used the following python script (Based on Xylar's python script above) to determine the differences in the resulting mesh files
#!/usr/bin/env python
import numpy as np
import xarray as xr
ds_old = xr.open_dataset('waves_mesh_OUT_DEFAULT_TEST.nc')
ds_new = xr.open_dataset('waves_mesh_OUT_XYLAR_TEST.nc')
print('DIFFERENCES BTW XYLAR ENV and DEFAULT')
for var_name in ds_old:
da_old = ds_old[var_name]
da_new = ds_new[var_name]
max_diff = np.abs(da_old - da_new).max().values
print(f'{var_name}: {max_diff}')
print('DIFFERENCES BTW ERIN ENV and DEFAULT')
ds_new = xr.open_dataset('waves_mesh_OUT_ERIN_TEST.nc')
for var_name in ds_old:
da_old = ds_old[var_name]
da_new = ds_new[var_name]
max_diff = np.abs(da_old - da_new).max().values
print(f'{var_name}: {max_diff}')
I find bit-for-bit results in all three versions of MpasMeshConverter as seen here:
DIFFERENCES BTW XYLAR ENV and DEFAULT
latCell: 0.0
lonCell: 0.0
xCell: 0.0
yCell: 0.0
zCell: 0.0
indexToCellID: 0
latEdge: 0.0
lonEdge: 0.0
xEdge: 0.0
yEdge: 0.0
zEdge: 0.0
indexToEdgeID: 0
latVertex: 0.0
lonVertex: 0.0
xVertex: 0.0
yVertex: 0.0
zVertex: 0.0
indexToVertexID: 0
cellsOnCell: 0
edgesOnCell: 0
verticesOnCell: 0
nEdgesOnCell: 0
edgesOnEdge: 0
cellsOnEdge: 0
verticesOnEdge: 0
nEdgesOnEdge: 0
cellsOnVertex: 0
edgesOnVertex: 0
boundaryVertex: 0
areaCell: 0.0
angleEdge: 0.0
dcEdge: 0.0
dvEdge: 0.0
weightsOnEdge: 0.0
areaTriangle: 0.0
kiteAreasOnVertex: 0.0
cellQuality: 0.0
gridSpacing: 0.0
triangleQuality: 0.0
triangleAngleQuality: 0.0
obtuseTriangle: 0
meshDensity: 0.0
DIFFERENCES BTW ERIN ENV and DEFAULT
latCell: 0.0
lonCell: 0.0
xCell: 0.0
yCell: 0.0
zCell: 0.0
indexToCellID: 0
latEdge: 0.0
lonEdge: 0.0
xEdge: 0.0
yEdge: 0.0
zEdge: 0.0
indexToEdgeID: 0
latVertex: 0.0
lonVertex: 0.0
xVertex: 0.0
yVertex: 0.0
zVertex: 0.0
indexToVertexID: 0
cellsOnCell: 0
edgesOnCell: 0
verticesOnCell: 0
nEdgesOnCell: 0
edgesOnEdge: 0
cellsOnEdge: 0
verticesOnEdge: 0
nEdgesOnEdge: 0
cellsOnVertex: 0
edgesOnVertex: 0
boundaryVertex: 0
areaCell: 0.0
angleEdge: 0.0
dcEdge: 0.0
dvEdge: 0.0
weightsOnEdge: 0.0
areaTriangle: 0.0
kiteAreasOnVertex: 0.0
cellQuality: 0.0
gridSpacing: 0.0
triangleQuality: 0.0
triangleAngleQuality: 0.0
obtuseTriangle: 0
meshDensity: 0.0
Thanks so much @erinethomas. That's a nice test to have that we're doing no harm here! |
The
nCells
,nEdges
andnVertices
dimensions now are of typesize_t
in the c++ code. Products of these dimensions with other dimensions (maxEdges2
,two
,vertexDegree
, etc.) are computed assize_t
variables and then used to allocate arrays to avoidint
overflow.These changes are likely not sufficient to allow meshes with more than ~2 billion edges (or vertices or cells) because index arrays are still stored as
int
type.