Add CI workflow #1

aidanheerdegen · 2023-11-21T04:04:54Z

We need a continuous integration workflow to run model tests on HPC infrastructure. This should be a reusable workflow/action as we need to add the same CI tests to other repositories.

This workflow would be triggered from pull requests that update configurations.

There would be levels of testing possible: the fastest and easiest is bit wise reproducibility testing where the model is run for a very short time and checksums in the outputs compared to checksums from a "ground truth" run.

The more time intensive testing would be running the model for longer and testing for performance degradation. Again this would involve extracting timing information from the logs and comparing against a "ground truth" run.

The mechanism for doing this was what this repo and issue is about: ACCESS-NRI/model-config-tests#83.

We also need a mechanism to update the "ground truth" when there is a known breaking change to the model.

In some cases those pull requests would be generated from new pre-release deployments.

These CI checks can't be run automatically from PRs due to possible security issues. So we either make the CI fire only on PRs to protected branches that limit who can make PRs, or we don't let the fire without an approval process.

aidanheerdegen · 2023-11-23T01:02:32Z

This is my go at encapsulating this as a flow diagram

flowchart TD
subgraph "Repo: ACCESS-OM2"
pre-release-om2[Pre-Release ACCESS-OM2]
end
subgraph "Repo: ACCESS-OM2-configs"
PR["PR: Update model exe paths"]
CI["Run config on gadi"]
QReproduce{"Run reproduces?"}
QBumpMajorVersion{"Bump major version?"}
BumpMajorVersion["Increment major version
               and save new truth"]
QBumpMinorVersion{"Bump minor version?"}
BumpMinorVersion["Increment minor version"]
MergeConfig["Merge PR to main"]
QNewVersion{"New version?"}
ReleaseAndDeploy["Release and deploy to gadi"]
end
subgraph "Repo: ACCESS-OM2"
release-om2[Tag and Release ACCESS-OM2]
end
pre-release-om2 --> PR
PR --> CI --> QReproduce
QReproduce -- no --> QBumpMajorVersion
QBumpMajorVersion -- yes --> BumpMajorVersion
QBumpMajorVersion -- no --> PR
QReproduce -- yes --> QBumpMinorVersion
QBumpMinorVersion -- yes --> BumpMinorVersion
QBumpMinorVersion -- no --> MergeConfig
BumpMinorVersion --> MergeConfig
BumpMajorVersion --> MergeConfig --> QNewVersion
QNewVersion -- yes --> ReleaseAndDeploy
ReleaseAndDeploy --> release-om2

aidanheerdegen · 2023-11-23T01:22:15Z

Code for above diagram is available if hit the "copy" button if you want to hack about with it (it is mermaid.js so just put mermaid after the 3 ticks to get it to render)

CodeGat · 2023-11-23T02:31:51Z

Under what circumstance would the run fail to reproduce, but we also don't want to bump to a new major version? What does the flow diagram mean when we are going back to update model exe paths?

aidanheerdegen · 2023-11-23T06:54:11Z

Under what circumstance would the run fail to reproduce, but we also don't want to bump to a new major version?

If we wanted to change the model code so that it did reproduce. So the non-reproduction was accidental.

What does the flow diagram mean when we are going back to update model exe paths?

This shows the limitation of what I've done, because the PR to access-om2-configs should be a CI step, that we can make pass with updates to the code.

I'm not even sure how likely that is to be honest. Part of the reason for making this issue/diagram.

aidanheerdegen · 2024-01-17T03:30:49Z

The decision was made to prepend release- to any released configuration to make it easier to protect those branches, and also to clearly signal that these are released versions of configurations.

There was some discussion if development versions of the same configuration should have dev- prepended to make branch protection rules cleaner, and CI logic easier to write. I don't know if this is a good idea or not. We're making branches with very long names to make CI easier, does it make life more difficult for users?

The design choice is to have the version of a configuration stored in a file (VERSION) in the repo itself. This would be a normal semantic version, with MAJOR version changes when configurations are no longer bitwise reproducible. GitHub tags would then be created which maps to this VERSION. Because tags are global across a repository, these tags would have to have the configuration name and the VERSION, e.g. release-1deg-jra55-iaf_1.1. CI would be responsible for generating these tags when PRs are merged.

It might be desirable to use / separators to create tag hierarchies, to make it more easily understandable, e.g.

release/1deg-jra55-iaf/1.1
release/1deg-jra55-iaf/2.0
release/1deg-jra55-ryf/1.0
release/1deg-jra55-ryf/1.1

The current approach uses branch names with an underscore (_) separator, because / might not play nicely with payu clone which uses branch-name when creating laboratory directories. It is possible to substitute _ for /, but sounds a bit hokey, and it is potentially confusing to use a different convention for branch names and tags.

There was the possibility of using / separators for branch names too, and alter payu so that it ignores anything "pathlike" in the branch name. Hrm.

CodeGat · 2024-01-17T03:37:03Z

I agree with the / separator between aspects of a tag. Although just to correct, we are currently using - as a separator, not _. For example, we are currently doing release-1deg_jra55_iaf-1.0.0.

CodeGat · 2024-01-25T02:23:10Z

So for now, will we continue with - for separators in tags? Or move to / separators? @jo-basevi will this affect anything in payu?

jo-basevi · 2024-01-25T04:23:20Z

So for now, will we continue with - for separators in tags? Or move to / separators? @jo-basevi will this affect anything in payu?

Currently with the latest version of payu, the model output directories use $control-$branchname-$uuid for naming sub-directories (where control is the name of the directory from which payu run was run in). I've checked and adding / will create directory trees.. Could add in option to payu to substitute / for -or _ in the branch names.

aidanheerdegen · 2024-01-25T06:34:41Z

Important to make sure we're not confusing branch names with tag names. I know I was getting myself confused. / should be fine in tags I think, not great in branch names because of the aforementioned directory tree creation when payu uses a branch name in an experiment name.

CodeGat · 2024-01-28T22:57:57Z

Adding a comment regarding the latest update regarding checksums.

We won't be storing just the result of the CHECKSUM, but potentially the result of other tests, as well. So we will update the repo structure (for release-* and dev-* branches) to:

./testing
  |- checksum
  |  |- <potentially multiple checksums>
  |  `- test_report.xml
  |- performance
  |  |- <performance testing results>
  |  `- <maybe other stuff>
  |- <other tests>
  `- <even more tests>

Instead of just ./CHECKSUM.

Tagging @jo-basevi and @aidanheerdegen

aidanheerdegen · 2024-02-02T05:04:29Z

So this doesn't get lost in resolved conversations

#2 (comment)

I did wonder if an alternate approach could be to simply tag the releases we want to do scheduled checks on? I have decided this is a bad idea: there is no audit trail for adding or deleting tags, or a way of checking and approving them. Also it would mean deleting tags. Also tags have to be unique, so we would end up with stuff like check-month-3 which is nasty.

CodeGat self-assigned this Dec 19, 2023

CodeGat mentioned this issue Dec 19, 2023

Model Repro CI #2

Merged

CodeGat added type:feature priority:blocker labels Jan 12, 2024

CodeGat mentioned this issue Jan 18, 2024

Added Reproducibility-focussed Reusable Workflows and Actions ACCESS-NRI/reproducibility#6

Merged

CodeGat mentioned this issue Jan 29, 2024

Move scripts to sub-directory #4

Closed

CodeGat closed this as completed in #2 Feb 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add CI workflow #1

Add CI workflow #1

aidanheerdegen commented Nov 21, 2023 •

edited

Loading

aidanheerdegen commented Nov 23, 2023 •

edited

Loading

aidanheerdegen commented Nov 23, 2023 •

edited

Loading

CodeGat commented Nov 23, 2023

aidanheerdegen commented Nov 23, 2023

aidanheerdegen commented Jan 17, 2024

CodeGat commented Jan 17, 2024

CodeGat commented Jan 25, 2024

jo-basevi commented Jan 25, 2024

aidanheerdegen commented Jan 25, 2024

CodeGat commented Jan 28, 2024 •

edited

Loading

aidanheerdegen commented Feb 2, 2024

Add CI workflow #1

Add CI workflow #1

Comments

aidanheerdegen commented Nov 21, 2023 • edited Loading

aidanheerdegen commented Nov 23, 2023 • edited Loading

aidanheerdegen commented Nov 23, 2023 • edited Loading

CodeGat commented Nov 23, 2023

aidanheerdegen commented Nov 23, 2023

aidanheerdegen commented Jan 17, 2024

CodeGat commented Jan 17, 2024

CodeGat commented Jan 25, 2024

jo-basevi commented Jan 25, 2024

aidanheerdegen commented Jan 25, 2024

CodeGat commented Jan 28, 2024 • edited Loading

aidanheerdegen commented Feb 2, 2024

aidanheerdegen commented Nov 21, 2023 •

edited

Loading

aidanheerdegen commented Nov 23, 2023 •

edited

Loading

aidanheerdegen commented Nov 23, 2023 •

edited

Loading

CodeGat commented Jan 28, 2024 •

edited

Loading