Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gcc fix: replacing cray pointers with fortran pointers in mpp_scatter and gather #1312

Merged
merged 3 commits into from
Aug 17, 2023

Conversation

laurenchilutti
Copy link
Contributor

@laurenchilutti laurenchilutti commented Jul 31, 2023

Description
This PR fixes
test_bc_restarts
test_fms2_io(all)
test_io_with_mask

All of these were related to the use of cray pointers in mpp_scatter and mpp_gather.

subroutine MPP_GATHER_PELIST_2D_(is, ie, js, je, pelist, array_seg, data, is_root_pe, &
ishift, jshift)
integer, intent(in) :: is, ie, js, je
integer, dimension(:), intent(in) :: pelist
MPP_TYPE_, dimension(is:ie,js:je), intent(in) :: array_seg
MPP_TYPE_, dimension(:,:), intent(inout) :: data
logical, intent(in) :: is_root_pe
integer, optional, intent(in) :: ishift, jshift
MPP_TYPE_ :: arr3D(size(array_seg,1),size(array_seg,2),1)
MPP_TYPE_ :: data3D(size( data,1),size( data,2),1)
pointer( aptr, arr3D )
pointer( dptr, data3D )
aptr = LOC(array_seg)
dptr = LOC( data)
call mpp_gather(is, ie, js, je, 1, pelist, arr3D, data3D, is_root_pe, &
ishift, jshift)
return
end subroutine MPP_GATHER_PELIST_2D_

A seg fault occurs at line 126 when we point ot the location of data. datawill only be allocated on the root pe and all other PEs seg fault. To fix this, I have replaced cray pointers with fortran pointers and have used pointer rank remapping to point a 3D pointer to a 2D input field.

Also includes a change to compressed_write.F90 where we need to surround calls to netcdf_write_data with if (fileobj%is_root). When running the unit tests with intel's ifort and ifx compilers, multiple test_fms_io tests fail with an error related to buffer variables not being allocated. We determined that this was because of the FMS 2023.01.01 updates, in which we only allocate these buffer variables on root pe, use mpp_gather to collect a value for this buffer, and then netcdf_write_data to write this data using the buffer.

Other changes include increasing size of filename to allow for longer filename and fixing some errors in test_compressed_writes.F90 and test_domain_io.F90's register_field_wrapper functions.

Fixes #1276

How Has This Been Tested?
Has been tested with GCC 13 with -O2 flags
make check

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules
  • New check tests, if applicable, are included
  • make distcheck passes

…gather. Only pointing if on root pe for gather and scatter. Enlarging the size of filename in test_domain_io.F90 to allow for longer filenames. Consistently creating input nml in test_bc_restart.sh
@laurenchilutti laurenchilutti changed the title replacing cray pointers with fortran pointers in mpp_scatter and mpp_… Gcc fix: replacing cray pointers with fortran pointers in mpp_scatter and gather Jul 31, 2023
@bensonr
Copy link
Contributor

bensonr commented Aug 4, 2023

@laurenchilutti - I haven't looked at the test code source, but it would be good to know if there are standalone gather/scatter tests that call the the routines directly with a data-domain sized array passing only the compute domain portion as an argument.

@laurenchilutti
Copy link
Contributor Author

I have just merged in the main branch and made an additional commit to fix some intel failures we see when using -check all flags. I will update the description of this PR to elaborate.

@laurenchilutti
Copy link
Contributor Author

@laurenchilutti - I haven't looked at the test code source, but it would be good to know if there are standalone gather/scatter tests that call the the routines directly with a data-domain sized array passing only the compute domain portion as an argument.

We test mpp_scatter and mpp_gather in https://github.com/NOAA-GFDL/FMS/blob/main/test_fms/mpp/test_mpp_gatscat.F90 but you will notice that the fms2io uses of mpp_gather and mpp_scatter are different than in the test. For one: In fms2io we are only allocating the data arrays on the root pe before calling mpp_gather, but in these tests I see that they allocate the data arrays on all pes, so we would never have caught the cray pointer seg faults.

Comment on lines +114 to +119
integer, intent(in) :: is, ie, js, je
integer, dimension(:), intent(in) :: pelist
MPP_TYPE_, dimension(is:ie,js:je), target, intent(in) :: array_seg
MPP_TYPE_, dimension(:,:), contiguous, target, intent(inout) :: data
logical, intent(in) :: is_root_pe
integer, optional, intent(in) :: ishift, jshift
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these lines changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is whitepace changes to keep the alignment/formatting. If you select "hide whitespace" under the Files Changed Tab -> Gear Icon these changes are ignored.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I type git blame, the changes won't be ignored...

I'm not saying this NEEDS to be changed, but you will own these lines of code, and it will be unclear why it's lumped together with cray pointer removal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware, and if I am ever asked about these lines of code, is there a problem with me stating that I changed whitespace? I don't see a benefit to redoing this just so I don't "own" the code.

@rem1776 rem1776 merged commit 8d75797 into NOAA-GFDL:main Aug 17, 2023
17 of 18 checks passed
uramirez8707 pushed a commit to uramirez8707/FMS that referenced this pull request Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

unit test failures with gcc + O2 optimization
4 participants