Changing hashes in CI builds #8

Open
aidanheerdegen opened this issue May 15, 2023 · 6 comments
Labels
priority:high A high priority issue - has an impact on core functionality type:bug Something isn't working type:investigation look into options, do some research!

Comments

@aidanheerdegen
Member

It seems we're not always hitting the cache.

This build took just over 2 hours:

https://github.com/ACCESS-NRI/build-ci/actions/runs/4946837403

and the previous build took under an hour:

https://github.com/ACCESS-NRI/build-ci/actions/runs/4944385277

The latter was expected to take some time, as it was building the new access-om3 CI container. But the other run (4946837403), which ran afterwards, took even longer: the access-om3 build still took 44m, and the oasis3-mct build took over 2 hours.

When the oasis3-mct build was fast it had 40 cache hits; the slow one had none. It seems the hashes are changing, e.g.

fast run:

https://github.com/ACCESS-NRI/build-ci/actions/runs/4944385277/jobs/8839800831#step:6:536

#5 9.048 ==> [2023-05-11-05:43:41.148032] Pkg id patchelf-0.16.1-zbywahivtssqlnnczyufknfxnabxrfvn has the following dependents:
#5 9.048 ==> [2023-05-11-05:43:41.148075] - intel-oneapi-compilers-2021.1.2-em65am6zce4ri47ysv7rsq6shv7fegps
#5 9.048 ==> [2023-05-11-05:43:41.148322] Requeueing a build task for patchelf-0.16.1-zbywahivtssqlnnczyufknfxnabxrfvn with status 'queued'
#5 9.048 ==> [2023-05-11-05:43:41.150113] Pkg id intel-oneapi-compilers-2021.1.2-em65am6zce4ri47ysv7rsq6shv7fegps has the following dependents:
#5 9.048 ==> [2023-05-11-05:43:41.150347] Requeueing a build task for intel-oneapi-compilers-2021.1.2-em65am6zce4ri47ysv7rsq6shv7fegps with status 'queued'
#5 9.048 ==> [2023-05-11-05:43:41.150381] Ensure all dependencies know all dependents across specs
#5 9.048 ==> [2023-05-11-05:43:41.152109] Processing patchelf-0.16.1-zbywahivtssqlnnczyufknfxnabxrfvn: task=priority=0, status=dequeued, start=0, #dependencies=0

slow run:

https://github.com/ACCESS-NRI/build-ci/actions/runs/4946837403/jobs/8845431878#step:6:535

#5 14.68 ==> [2023-05-11-10:29:16.438576] Installing patchelf-0.16.1-iyrpni67542n24bta5cvjiyw72zqr7xb
#5 14.68 ==> [2023-05-11-10:29:24.963224] No binary for patchelf-0.16.1-iyrpni67542n24bta5cvjiyw72zqr7xb found: installing from source
#5 14.68 ==> [2023-05-11-10:29:16.398901] build(intel-oneapi-compilers)
#5 14.68 ==> [2023-05-11-10:29:16.399010] build(patchelf)
#5 14.68 ==> [2023-05-11-10:29:16.424460] Initializing the build queue from the build requests
#5 14.68 ==> [2023-05-11-10:29:16.424575] Initializing the build queue for intel-oneapi-compilers
#5 14.68 ==> [2023-05-11-10:29:16.425468] Pkg id patchelf-0.16.1-iyrpni67542n24bta5cvjiyw72zqr7xb has the following dependents:
#5 14.68 ==> [2023-05-11-10:29:16.425510] - intel-oneapi-compilers-2021.1.2-4kvg6tbzep2mwgiuwt64mmnaslqxhmme
#5 14.68 ==> [2023-05-11-10:29:16.426001] Requeueing a build task for patchelf-0.16.1-iyrpni67542n24bta5cvjiyw72zqr7xb with status 'queued'
#5 14.68 ==> [2023-05-11-10:29:16.429908] Pkg id intel-oneapi-compilers-2021.1.2-4kvg6tbzep2mwgiuwt64mmnaslqxhmme has the following dependents:
#5 14.68 ==> [2023-05-11-10:29:16.430538] Requeueing a build task for intel-oneapi-compilers-2021.1.2-4kvg6tbzep2mwgiuwt64mmnaslqxhmme with status 'queued'
#5 14.68 ==> [2023-05-11-10:29:16.430609] Ensure all dependencies know all dependents across specs
#5 14.68 ==> [2023-05-11-10:29:16.432858] Processing patchelf-0.16.1-iyrpni67542n24bta5cvjiyw72zqr7xb: task=priority=0, status=dequeued, start=0, #dependencies=0
#5 14.68 ==> [2023-05-11-10:29:16.433584] Acquiring a write lock on patchelf-0.16.1-iyrpni67542n24bta5cvjiyw72zqr7xb with timeout 1.000ns
#5 14.68 ==> [2023-05-11-10:29:16.433884] patchelf-0.16.1-iyrpni67542n24bta5cvjiyw72zqr7xb is now write locked
#5 14.68 ==> [2023-05-11-10:29:16.438328] Creating stage lock spack-stage-patchelf-0.16.1-iyrpni67542n24bta5cvjiyw72zqr7xb
#5 14.68 ==> [2023-05-11-10:29:16.438666] Searching for binary cache of patchelf-0.16.1-iyrpni67542n24bta5cvjiyw72zqr7xb

The intel-oneapi-compilers-2021.1.2 package changed hash from em65am6zce4ri47ysv7rsq6shv7fegps to 4kvg6tbzep2mwgiuwt64mmnaslqxhmme, and patchelf-0.16.1 changed from zbywahivtssqlnnczyufknfxnabxrfvn to iyrpni67542n24bta5cvjiyw72zqr7xb. I don't know if the compiler change is the reason the patchelf hash changed, but given that every package was rebuilt, that is a reasonable place to start.
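
To check which hashes differ between two runs without reading the logs by eye, something like the following could help. It is a minimal sketch, not part of the build-ci tooling: the function names are made up for illustration, and it assumes spec ids always appear in the logs as `<name>-<version>-<32-character hash>`, as they do in the excerpts above.

```python
import re

# Assumption: spec ids look like <name>-<version>-<hash>, where the hash is
# 32 characters of lowercase base32 (a-z, 2-7), as in the excerpts above.
# Prefixed ids such as "spack-stage-<name>-..." are treated as their own spec.
SPEC_ID = re.compile(r"\b([a-z0-9][a-z0-9_-]*?)-(\d[\w.]*)-([a-z2-7]{32})\b")


def spec_hashes(log_text):
    """Map 'name-version' to the set of hashes seen for it in a log."""
    seen = {}
    for name, version, digest in SPEC_ID.findall(log_text):
        seen.setdefault(f"{name}-{version}", set()).add(digest)
    return seen


def changed_hashes(fast_log, slow_log):
    """Return {spec: (hashes in fast log, hashes in slow log)} where they differ."""
    fast, slow = spec_hashes(fast_log), spec_hashes(slow_log)
    return {
        spec: (fast[spec], slow[spec])
        for spec in sorted(fast.keys() & slow.keys())
        if fast[spec] != slow[spec]
    }
```

Calling `changed_hashes(fast_text, slow_text)` on the downloaded text of the two job logs would list every spec present in both runs whose hash changed, which should show whether the change is limited to a few packages or affects the whole dependency tree.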

@micaeljtoliveira
Member

micaeljtoliveira commented May 15, 2023

One thing I noticed in the Dockerfiles is that packages for the base image are installed like this:

RUN apt-get update && apt-get install -y ...

This is non-reproducible, as the apt package index is refreshed every time the image is built, so the installed versions can differ from one build to the next. This only affects the base image though.

@aidanheerdegen
Member Author

I know @echus is a fan of NixOS, which might be a good fit in this case as it gives the required reproducibility.

@micaeljtoliveira
Member

It's always possible to pin the versions to install. The other option is not to update the repos. I think the former is preferable, as it's more flexible.

@aidanheerdegen
Member Author

> It's always possible to pin the versions to install

Agreed. However, if NixOS is basically built for this functionality, it may be easier to implement and maintain. I thought it was worth exploring as an option.

@micaeljtoliveira
Member

You still need to pin the versions somehow and, more importantly, those versions need to be written explicitly in the Dockerfile. Otherwise Docker doesn't know when it is necessary to rebuild the corresponding layer.
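
The point about layers can be illustrated with a toy model. This is a simplified sketch and not Docker's actual implementation: it assumes the cache key for a `RUN` step depends only on the parent layer and the literal instruction text, which matches Docker's documented behaviour for `RUN` but ignores things like `COPY`/`ADD` checksums and build arguments. The function names, the `base-image` name and the `curl` version numbers are placeholders for illustration.

```python
import hashlib


def layer_key(parent_key, instruction):
    """Toy cache key: depends only on the parent layer and the instruction text."""
    return hashlib.sha256(f"{parent_key}\n{instruction}".encode()).hexdigest()


def build(instructions, cache):
    """Walk the instructions in order, recording whether each layer hits the cache."""
    key, report = "", []
    for instruction in instructions:
        key = layer_key(key, instruction)
        report.append((instruction, key in cache))
        cache.add(key)  # the layer is now available to later builds
    return report


cache = set()

# Unpinned: the instruction text never changes, so the second build reuses
# the layer even if the apt repositories have changed in the meantime.
unpinned = ["FROM base-image", "RUN apt-get update && apt-get install -y curl"]
build(unpinned, cache)
rebuild_unpinned = build(unpinned, cache)

# Pinned (hypothetical version numbers): bumping the version changes the
# instruction text, so the layer is rebuilt exactly when the pin changes.
pinned_old = ["FROM base-image", "RUN apt-get update && apt-get install -y curl=1.0"]
pinned_new = ["FROM base-image", "RUN apt-get update && apt-get install -y curl=1.1"]
build(pinned_old, cache)
rebuild_pinned = build(pinned_new, cache)
```

In this model `rebuild_unpinned` reports a cache hit for every step, while `rebuild_pinned` reports a hit for the `FROM` step and a miss for the `RUN` step, which is the behaviour described above: the layer is only rebuilt when the versions written in the Dockerfile change.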

@aidanheerdegen
Member Author

Just FYI, I ran the build a third time and it completed all three builds in about 13m, finding all the packages in the cache as we would expect.

https://github.com/ACCESS-NRI/build-ci/actions/runs/4975175610

This doesn't mean it is fixed, simply that it works as expected when there are no underlying changes that force rebuilds.

@CodeGat CodeGat self-assigned this Aug 14, 2023
@CodeGat CodeGat added type:bug Something isn't working priority:high A high priority issue - has an impact on core functionality type:investigation look into options, do some research! labels Aug 20, 2023
@CodeGat CodeGat removed their assignment Aug 31, 2023

3 participants