-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FATES restart fix for long runs #2199
Conversation
@rgknox @glemieux @adrifoster @ckoven do we want to expedite bringing this in as it's a critical fix? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I request a small change for readability that won't be hard to bring in.
I'm really glad you were able to figure this out! Thanks to everyone who worked on this! @mvdebolskiy @glemieux @mvertens I know for sure...
With testing there are tests that will start passing that can be removed from the expected fails. It also sounds like you setup a special test to find this -- should that come in as well?
So the things I see:
- Update FATES tag
- Add keyword syntax to the .false./.true. calls.
- Add a comment to the one that's different the .true. one.
- Remove expected fails that now work (requires FATES tests to be run and this to be checked)
- Add a new test for this
Those all sound like great suggestions @ekluzek . I'm working through designing a new test right now. I had originally wanted to go all-out and create a finidat file to be used in the new test's initialization (which would allow us to encounter the salient model mechanics in a shorter run), but I'm going to circumvent that for now because that is a larger topic/issue. I'll check back in when I have a new test created. In short I'm hoping I can trigger the original error with an f10 grid and a few months. Ideally, I'd like to see a test similar to an ERS f10 Lm13 in there, if it does not take much longer than the other tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a reminder to update externals, and I think it is good to go. Will just need to make an issue for adding a test that starts with really spun-up restart file.
I tested a bunch of configurations to get a 25 month test that completes in a reasonable amount of time, I found that using the FatesColdNoComp testdef is fairly effective at doing this. I tested different pe layouts and found that with: |
all aux_clm pass, izumi and cheyenne, preparing changelogs |
@rgknox I realized this isn't pointing to a FATES tag. And it looks like the FATES tag hasn't been made yet. Could you do that and update the branch? Thanks... |
@rgknox I'm making the tag now. But, we will need you to rename the test directory to under the baselines for both Cheyenne and izumi to ctsm5.1.dev151. There were several directories and I wasn't sure which was the right one, and there seemed to be permission issues when I tried to copy them over. So if you could do that Monday that would be great. |
@rgknox also was the FATES test list run for this? I expect that some of the FATES expected fails, might now work? Or is that not the case? Monday would be good to look into this and it probably would be good to run the FATES test suite then as well. If more of the FATES test pass, we can mark them as working in the next tag. For now I'll check that thing off the list with a note about it. |
Description of changes
FATES had been calling a routine (canopy_structure()) during the restart read process. The purpose was to facilitate rebuilding the canopy. However, this had an inadvertent effect of changing results from base, because this procedure would not just rebuild, but modify the cohort compositions due to termination and promotion/demotion in the canopy. We realized that we could simply bypass this call, and add two variables to the restart file to achieve b4b restarts in tests.
This needs to be synchronized with: NGEET/fates#1098
Specific notes
Contributors other than yourself, if any: @mvdebolskiy identified the locus of the problem which enabled its fix, with team contributions and troubleshooting from @mvertens , @glemieux , @rosiealice and @ckoven
CTSM Issues Fixed (include github issue #):
NGEET/fates#1051
Are answers expected to change (and if so in what way)?
No, this is just a bug fix.
Any User Interface Changes (namelist or namelist defaults changes)? No
Testing performed, if any:
We constructed a new test that was simply long enough to enable more canopy development (at least that is the hypothesis). The key is to have a FATES ERS test that is an LM5 (5 month). However, the need for a long test would likely be circumvented if we had an initial condition file that had been generated after long simulation (ie at least a few months of cold start spinup, but longer would probably be better (ie years)).