Experimental: add a proxy to increase CI resiliency to third party services failures #3580

Draft: wants to merge 1 commit into base: main


apostasie
Contributor

CI regularly fails because of transient errors from third-party services.

Most typically, Debian or Ubuntu servers, or Docker Hub, return a 500 during the build phase.

This PR is an experimental proposal to alleviate some of that pain by adding a local proxy that retries backend requests on such failures.
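
To illustrate the retry idea only (this is not the proxy added by this PR), here is a minimal Python sketch of retrying a request on transient 5xx responses with exponential backoff. The function name `fetch_with_retry`, the set of retryable statuses, the attempt count, and the delays are all assumptions chosen for the example.

```python
import time
import urllib.error
import urllib.request

# Assumption: these statuses count as transient. The PR itself only mentions 500s.
RETRYABLE_STATUSES = {500, 502, 503, 504}


def _default_fetch(url):
    """Plain GET using the standard library; returns the response body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_with_retry(url, fetch=_default_fetch, attempts=4, base_delay=0.5,
                     sleep=time.sleep):
    """Call fetch(url), retrying on transient HTTP errors (hypothetical helper)."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except urllib.error.HTTPError as err:
            # Re-raise on non-transient statuses, or once attempts are exhausted.
            if err.code not in RETRYABLE_STATUSES or attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The `fetch` and `sleep` parameters are injectable so the retry logic can be exercised without touching the network.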

The proxy also provisionally caches responses for the Debian and Ubuntu domains. Currently, this is not going to do much, although it is good practice IMHO to minimize hammering the Debian servers.

Besides this PR, we might want to rethink our strategy with building the test image though.
Right now, we build everything once per target.
The builds run in parallel, but this (obviously) significantly increases the chance of failures against these services.
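
As a rough illustration of why building once per target raises the failure odds, here is a small Python sketch. It assumes each build fails independently with the same probability, which is a simplification, and the 2% rate is a made-up number, not a measurement from this CI. The count of 8 builds matches the number of images mentioned later in this thread.

```python
def chance_any_build_fails(per_build_failure_rate, builds):
    """Probability that at least one of `builds` independent builds fails."""
    return 1 - (1 - per_build_failure_rate) ** builds


# Hypothetical 2% transient failure rate per build, 8 builds.
single = chance_any_build_fails(0.02, 1)   # 0.02
all_targets = chance_any_build_fails(0.02, 8)  # roughly 0.149
print(f"one build: {single:.3f}, eight builds: {all_targets:.3f}")
```

Under these assumptions, a failure rate that looks tolerable for one build becomes a failed run roughly one time in seven.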

Note that this proxy will intercept requests made from the host and during the build phase, but not requests made by the tests themselves.

@AkihiroSuda
Member

AkihiroSuda commented Oct 21, 2024

> Besides this PR, we might want to rethink our strategy with building the test image though.

Can we just use https://docs.docker.com/build/cache/backends/gha/ ?
Then there is probably no need to set up a proxy.

@apostasie
Contributor Author

> > Besides this PR, we might want to rethink our strategy with building the test image though.
>
> Can we just use https://docs.docker.com/build/cache/backends/gha/ ? Then probably no need to set up a proxy

Will definitely give it a try. I am not optimistic that we will fit within the limitations (especially if we try to use mode=max), but let's see.

@apostasie
Contributor Author

apostasie commented Oct 21, 2024

Notes for later:

  • 10G is the limit: the cache gets evicted once we hit that
  • a PR can use cached entries from the main branch, but not from other PRs
  • with mode=max, the image cache is about 2G compressed (2.9G uncompressed); with currently 8 images (rootless, rootful, containerd/ubuntu versions), this will not work, so we have to rewrite the image build logic and cache a common target instead
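
The arithmetic behind the last note can be sketched in Python as below. It assumes the roughly 2G compressed figure applies to each of the 8 images and that the caches share nothing, which is the worst case; real usage may be lower if layers are deduplicated. The helper name `cache_fits` is made up for this example.

```python
CACHE_LIMIT_GB = 10       # limit quoted in the notes above
CACHE_PER_IMAGE_GB = 2.0  # compressed size with mode=max, as quoted above
IMAGE_COUNT = 8           # rootless, rootful, containerd/ubuntu versions


def cache_fits(image_count, per_image_gb, limit_gb):
    """True if caching every image separately stays within the limit."""
    return image_count * per_image_gb <= limit_gb


# Assumption: no layer sharing between image caches (worst case).
total_gb = IMAGE_COUNT * CACHE_PER_IMAGE_GB
max_images = int(CACHE_LIMIT_GB // CACHE_PER_IMAGE_GB)

print(f"total: {total_gb}G of {CACHE_LIMIT_GB}G, "
      f"fits: {cache_fits(IMAGE_COUNT, CACHE_PER_IMAGE_GB, CACHE_LIMIT_GB)}, "
      f"images that would fit: {max_images}")
```

Under those assumptions the 8 caches need about 16G against a 10G limit, which is why caching a single common target looks like the only workable option.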
