Handling different checksum file version in CI checks #16

Open
jo-basevi opened this issue Jan 30, 2024 · 1 comment

Comments

@jo-basevi
Collaborator

jo-basevi commented Jan 30, 2024

In PR ACCESS-NRI/access-om2-configs#2, as part of the CI reproducibility checks, the pytest tests generate checksum files and compare them against the checksums saved on each model configuration branch (e.g. release-1deg_jra55_iaf).

In future, there may be multiple schema versions for these checksum files, as well as updates to the test code that generates them.

Say the schema and the checksum generation in the tests are updated, so the latest version is 2.0, while the checksum file on a configuration branch still has an earlier version, 1.0. Comparing a version 2.0 checksum directly with an older version 1.0 checksum does not make much sense, as 2.0 would potentially contain more information. Options could be:

  1. Manually update the checksum files on each model configuration branch to 2.0. Ideally, if checksum generation has been updated, the tests should be run against the new checksum file on each branch anyway, to make sure they still pass.
  2. In the test code, handle checksum generation for each version of the schema. If the checksum file has an earlier version, generate 2 checksum files:
     • a version 1.0 file, to compare against the checksum file on the configuration branch.
     • a version 2.0 file that could be used to update the checksum file in a later commit.
  3. Rather than using the checksum file saved on the model configuration branch, regenerate a version 2.0 checksum by running the model from an earlier tag, or from archived output from an earlier tag (potentially stored in /g/data/).

Note that with the first two options, scheduled CI checks against earlier tags can't pick up an updated checksum file, as a tag is associated with a single commit. In these checks, the only options are to generate a version 1.0 checksum to compare with the saved truth on the branch, or to regenerate a version 2.0 checksum from archived outputs.
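To make the "generate 2 checksum files" option a bit more concrete, here is a rough sketch of what the comparison step might look like. Everything in it is an assumption for illustration only: the `schema_version` field name, the `output` and `restart` keys, and the functions `generate_checksums` and `check_reproducibility` are hypothetical and not taken from the existing test code.

```python
LATEST_VERSION = "2.0"  # hypothetical: the newest checksum schema version


def generate_checksums(model_output: dict, version: str) -> dict:
    """Build a checksum document for the requested schema version.

    Hypothetical layout: 1.0 only records output checksums, while 2.0
    also records restart checksums (i.e. 2.0 carries more information).
    """
    if version == "1.0":
        return {
            "schema_version": "1.0",
            "output": dict(model_output["output"]),
        }
    if version == "2.0":
        return {
            "schema_version": "2.0",
            "output": dict(model_output["output"]),
            "restart": dict(model_output["restart"]),
        }
    raise ValueError(f"Unsupported checksum schema version: {version!r}")


def check_reproducibility(model_output: dict, saved: dict):
    """Compare against the saved checksums using the saved file's own version.

    Returns (matches, update), where `update` is a latest-version checksum
    document that could be committed later, or None if the saved file is
    already at the latest version.
    """
    saved_version = saved["schema_version"]
    comparable = generate_checksums(model_output, saved_version)
    matches = comparable == saved

    update = None
    if saved_version != LATEST_VERSION:
        update = generate_checksums(model_output, LATEST_VERSION)
    return matches, update
```

For a scheduled check against an earlier tag, only the `matches` result would be usable, since the `update` document cannot be committed to the tag.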

Another improvement could be to separate the schema and checksum generation code from the test code, so that they can be versioned together. The tests could then call earlier versions of this code to generate earlier versions of checksums, rather than having one file handle generation of different checksum versions depending on the saved schema version and the latest version.
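One possible shape for that separation, purely as a sketch: a small standalone module that registers one generator per schema version, which the tests then call by version. The registry, the `register` decorator, the `generate` function, and the checksum layouts below are all hypothetical names and structures, not existing code. Separately released package versions would be another way to achieve the same thing.

```python
from typing import Callable, Dict, Optional

ChecksumGenerator = Callable[[dict], dict]

# Hypothetical registry mapping schema version -> generator function.
_GENERATORS: Dict[str, ChecksumGenerator] = {}


def register(version: str):
    """Decorator that registers a generator for one schema version."""
    def decorator(func: ChecksumGenerator) -> ChecksumGenerator:
        _GENERATORS[version] = func
        return func
    return decorator


@register("1.0")
def _generate_v1(model_output: dict) -> dict:
    # Assumed 1.0 layout: output checksums only.
    return {"schema_version": "1.0", "output": dict(model_output["output"])}


@register("2.0")
def _generate_v2(model_output: dict) -> dict:
    # Assumed 2.0 layout: output plus restart checksums.
    return {
        "schema_version": "2.0",
        "output": dict(model_output["output"]),
        "restart": dict(model_output["restart"]),
    }


def latest_version() -> str:
    """Return the highest registered version, compared numerically."""
    return max(_GENERATORS, key=lambda v: tuple(int(p) for p in v.split(".")))


def generate(model_output: dict, version: Optional[str] = None) -> dict:
    """Generate checksums for `version`, defaulting to the latest one."""
    version = version or latest_version()
    try:
        generator = _GENERATORS[version]
    except KeyError:
        raise ValueError(
            f"No checksum generator registered for schema version {version!r}"
        ) from None
    return generator(model_output)
```

With something like this, the test code would only need to read the version from the saved checksum file and call `generate(model_output, saved_version)`, without carrying any version-specific branching itself.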

Thanks to @aidanheerdegen and @CodeGat for the brainstorming on this!

@aidanheerdegen
Member

See this related discussion ACCESS-NRI/schema#4

@aidanheerdegen aidanheerdegen transferred this issue from ACCESS-NRI/access-om2-configs May 3, 2024