
KAFKA-18500 Build PRs at HEAD commit #18449

Open
wants to merge 21 commits into
base: trunk


mumrah
Member

@mumrah mumrah commented Jan 8, 2025

The default checkout behavior for GitHub Actions is to use a special merge ref which is equivalent to the base branch with the PR merged into it. While this is useful for checking compilation issues against trunk, it effectively breaks our ability to use any build caching.

This patch changes the build to check out the HEAD commit of the PR when building. The "Compile and Check" step still checks out the merge commit so we can keep that level of validation.
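As a rough illustration of the split described above, here is a minimal Python sketch of the ref-selection logic. It is only a model: the actual change lives in the workflow configuration, and the function name `ref_to_checkout`, the step name "Test", and the sample SHA are hypothetical. The `refs/pull/<number>/merge` ref and the `pull_request.head.sha` payload field are the ones GitHub provides for `pull_request` events.

```python
# Hypothetical model of the checkout choice; not the PR's actual implementation.

# Steps that should keep validating against the merge commit.
# "Compile and Check" is the step name used in the PR description.
MERGE_COMMIT_STEPS = {"Compile and Check"}


def ref_to_checkout(event: dict, step: str) -> str:
    """Return the git ref a build step should check out for a pull_request event."""
    if step in MERGE_COMMIT_STEPS:
        # GitHub's synthetic merge ref: the base branch with the PR merged into it.
        return f"refs/pull/{event['number']}/merge"
    # Every other step builds exactly what the author pushed.
    return event["pull_request"]["head"]["sha"]


# Sample payload trimmed to the two fields used here; the SHA is a placeholder.
sample_event = {"number": 18449, "pull_request": {"head": {"sha": "abc1234"}}}

print(ref_to_checkout(sample_event, "Compile and Check"))
print(ref_to_checkout(sample_event, "Test"))
```

Keeping one step on the merge ref preserves the "does this still compile against trunk" signal, while the expensive steps build a tree whose inputs are more likely to match cached results.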

@mumrah mumrah added the do-not-merge PRs that are only open temporarily and should not be merged label Jan 8, 2025
@github-actions github-actions bot added small Small PRs build Gradle build or GitHub Actions labels Jan 8, 2025
@mumrah mumrah force-pushed the tmp-check-caching branch from e68973f to 85a5fde Compare January 15, 2025 14:55
@mumrah mumrah force-pushed the tmp-check-caching branch from 2d485d2 to 8539eb0 Compare January 15, 2025 19:45
@mumrah
Member Author

mumrah commented Jan 15, 2025

Ok, there is a fundamental problem here. The pull_request event builds the merge commit of this PR against the base branch rather than just the PR contents. This means the build will include changes on trunk which have not yet been cached.

When trunk is moving quickly, our PRs will have little hope of benefiting from much caching.

For example:

(trunk) HEAD --- A --- B --- C
(PR) HEAD --- X --- Y --- Z --- C

If commit C was the last trunk commit to be built, there will be Gradle cache files for that commit. Commits A and B are still building. If the PR was simply building X, this would be fine and we would expect cache hits for anything not changed by X, Y, Z. However, the pull_request event will result in a build of something totally different:

(merge) HEAD --- A --- B --- C
            `X --- Y --- Z --- C

So when the PR is built, it will be fetching the latest cache (C), but will include file changes from A and B in addition to the PR changes. This greatly increases cache misses.
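The effect can be sketched with a toy cache model in Python. This is a simplification under stated assumptions: each module has a single cacheable task keyed only by a hash of that module's own sources, and the module names and version strings are illustrative. A real Gradle build cache also keys on upstream inputs, so changes to a module can invalidate its dependents as well, which would make the merge-commit case worse than shown here.

```python
import hashlib


def task_key(module: str, tree: dict) -> str:
    """Cache key for a module's task: a hash of the module name and its sources."""
    return hashlib.sha256(f"{module}:{tree[module]}".encode()).hexdigest()


def cache_misses(tree: dict, cache: set) -> list:
    """Modules whose task key is not present in the cache."""
    return sorted(m for m in tree if task_key(m, tree) not in cache)


# C: the last trunk commit that finished building, so its outputs are cached.
trunk_c = {"clients": "v1", "connect": "v1", "core": "v1", "streams": "v1"}
cache = {task_key(m, trunk_c) for m in trunk_c}

# The PR (X, Y, Z on top of C) touches only one module.
pr_head = {**trunk_c, "streams": "v1+pr"}

# Trunk has moved on: A and B touched two other modules and are still building.
trunk_head = {**trunk_c, "clients": "v2", "core": "v2"}

# The merge commit is trunk's latest state plus the PR's changes.
merge_commit = {**trunk_head, "streams": "v1+pr"}

print(cache_misses(pr_head, cache))       # ['streams']
print(cache_misses(merge_commit, cache))  # ['clients', 'core', 'streams']
```

Building the PR at its own HEAD misses only on what the PR changed, while building the merge commit also misses on everything A and B changed.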


I think the merge queue might be a solution to this. If we do a full build as part of the merge queue, then no code will land on trunk that has not been built, tested, and cached. The risk with this approach is that flaky builds will prevent things from getting into trunk.

@ijuma @dajac thoughts?

@ijuma
Member

ijuma commented Jan 15, 2025

One way to work around the flaky test issue would be to increase the number of retries for failed tests when running the build via the merge queue.

That said, we'd want to measure how much extra time the merge queue build takes versus the time saved in the PRs themselves.
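To get a feel for the retry trade-off, here is a back-of-the-envelope Python sketch. It assumes each attempt of a flaky test fails independently with the same probability, which real flaky tests may not satisfy, and the 5% failure rate is a made-up input rather than a measurement from this project.

```python
def failure_probability(p_fail: float, retries: int) -> float:
    """Probability that a test fails its first run and every retry."""
    return p_fail ** (retries + 1)


def expected_attempts(p_fail: float, retries: int) -> float:
    """Expected number of runs, stopping at the first pass or when retries run out."""
    # Attempt k+1 only happens if the first k attempts all failed.
    return sum(p_fail ** k for k in range(retries + 1))


# Hypothetical per-attempt failure rate; replace with measured data.
P_FAIL = 0.05

for retries in range(4):
    print(
        f"retries={retries} "
        f"still_failing={failure_probability(P_FAIL, retries):.6f} "
        f"expected_runs={expected_attempts(P_FAIL, retries):.4f}"
    )
```

Under this model each extra retry cuts the chance of a flaky failure blocking the queue by a factor of the failure rate, while the expected number of extra runs stays small. Whether that holds in practice is exactly what would need to be measured.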

@mumrah
Member Author

mumrah commented Jan 16, 2025

Ok, I was able to modify the build to check out the PR at HEAD instead of the merge commit. This increases our cache hits a lot.

The recent job finished in around 1 hour because it only had to run :core:test and :streams:test (since those had failures in the trunk job, and so were not cached).

@mumrah mumrah changed the title Checking on the gradle cache KAFKA-18500 Build PRs at HEAD commit Jan 16, 2025
@mumrah mumrah removed the do-not-merge PRs that are only open temporarily and should not be merged label Jan 16, 2025
@mumrah mumrah requested review from chia7712 and ijuma January 16, 2025 19:26
@mumrah
Member Author

mumrah commented Jan 16, 2025

I validated the Build Scan status checks on my fork:
[image]

That shows all four build scans being reported to the commit (which will appear in PRs).

@github-actions github-actions bot removed the small Small PRs label Jan 17, 2025
@mumrah
Member Author

mumrah commented Jan 18, 2025

@ijuma WDYT about this approach?

@ijuma
Member

ijuma commented Jan 18, 2025

I haven't reviewed the code changes - the approach looks good.
