Use xxhash instead of sha256 for hashing AST nodes #6192

Merged
merged 1 commit into from
Dec 9, 2024

@swiatekm swiatekm commented Dec 3, 2024

What does this PR do?

We currently use sha256 to hash AST nodes when generating configuration. The hash is used both to check equality between AST instances and to avoid recomputing inputs unnecessarily. This PR switches to xxHash, which is much faster. It also changes the interface so that the caller supplies the hasher instance, which avoids allocating one for each Node in the tree.
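
To make the interface change concrete, here is a minimal sketch of a tree whose nodes all write into one caller-supplied hasher. It is not the code from this PR: the `StrVal` and `List` types and the `newHasher` and `hashTree` helpers are hypothetical, and the standard library's FNV-1a hasher stands in for xxHash so that the example has no external dependencies.

```go
package main

import (
	"fmt"
	"hash"
	"hash/fnv"
)

// Node is a hypothetical, trimmed-down stand-in for the transpiler's Node interface.
type Node interface {
	// Hash64With writes the node and its children into the caller-supplied hasher.
	Hash64With(h hash.Hash64) error
}

// StrVal is a hypothetical leaf node holding a string.
type StrVal struct{ Value string }

func (s StrVal) Hash64With(h hash.Hash64) error {
	_, err := h.Write([]byte(s.Value))
	return err
}

// List is a hypothetical inner node holding child nodes.
type List struct{ Children []Node }

func (l List) Hash64With(h hash.Hash64) error {
	for _, c := range l.Children {
		if err := c.Hash64With(h); err != nil {
			return err
		}
	}
	return nil
}

// newHasher returns the stand-in hasher (FNV-1a from the standard library, not xxHash).
func newHasher() hash.Hash64 { return fnv.New64a() }

// hashTree resets the shared hasher and then hashes the whole tree with it,
// so no per-node hasher is allocated.
func hashTree(h hash.Hash64, n Node) (uint64, error) {
	h.Reset()
	if err := n.Hash64With(h); err != nil {
		return 0, err
	}
	return h.Sum64(), nil
}

func main() {
	h := newHasher() // allocated once, reused for every tree
	a := List{Children: []Node{StrVal{Value: "inputs"}, StrVal{Value: "logfile"}}}
	b := List{Children: []Node{StrVal{Value: "inputs"}, StrVal{Value: "logfile"}}}
	ha, _ := hashTree(h, a)
	hb, _ := hashTree(h, b)
	fmt.Println("equal trees hash equally:", ha == hb)
}
```

The sketch concatenates child values without delimiters, so a real implementation would likely also need to write type or separator markers to keep differently shaped trees apart.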

Of note is that I'm using the xxhash.Digest struct as the argument instead of the generic hash.Hash64 interface from the standard library. The reason is that the former has an optimized WriteString method, which avoids having to convert the string to a byte slice. I also don't expect to ever need to supply a different hash implementation to this method.
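
The trade-off between the generic interface and a concrete type with a WriteString method can be sketched as follows. This is an illustration only: `digest` is a hypothetical hand-rolled FNV-1a hasher, not xxhash.Digest, and `viaInterface` and `viaConcrete` are made-up helper names. The point is that `hash.Hash64` only offers `Write([]byte)`, so callers holding a string must convert it, while a concrete type can read the string's bytes directly.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// FNV-1a 64-bit parameters.
const (
	fnvOffset64 = 14695981039346656037
	fnvPrime64  = 1099511628211
)

// digest is a hypothetical concrete hasher implementing FNV-1a (not xxHash).
type digest struct{ sum uint64 }

func newDigest() *digest { return &digest{sum: fnvOffset64} }

// Write hashes a byte slice.
func (d *digest) Write(p []byte) (int, error) {
	for _, b := range p {
		d.sum = (d.sum ^ uint64(b)) * fnvPrime64
	}
	return len(p), nil
}

// WriteString hashes the string's bytes in place, without a []byte(s) conversion.
func (d *digest) WriteString(s string) (int, error) {
	for i := 0; i < len(s); i++ {
		d.sum = (d.sum ^ uint64(s[i])) * fnvPrime64
	}
	return len(s), nil
}

// Sum64 returns the current hash value.
func (d *digest) Sum64() uint64 { return d.sum }

// viaInterface goes through the standard library hasher, which only exposes
// Write, so the string has to be converted to a byte slice first.
func viaInterface(s string) uint64 {
	h := fnv.New64a()
	_, _ = h.Write([]byte(s)) // the conversion may copy the string
	return h.Sum64()
}

// viaConcrete uses the concrete type's WriteString and skips the conversion.
func viaConcrete(s string) uint64 {
	d := newDigest()
	_, _ = d.WriteString(s)
	return d.Sum64()
}

func main() {
	s := "elastic-agent"
	fmt.Println("same hash either way:", viaInterface(s) == viaConcrete(s))
}
```

Both paths compute the same value; the concrete type only changes how the bytes reach the hasher.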

Note also that there are already fairly exhaustive tests for this, including some that cover unexpected behaviour.

Why is it important?

It's a pretty significant performance improvement. See the benchstat results using the benchmark from #6180:

goos: linux
goarch: amd64
pkg: github.com/elastic/elastic-agent/internal/pkg/agent/application/coordinator
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
                                      │ bench_main.txt │           bench_hash.txt            │
                                      │     sec/op     │    sec/op     vs base               │
Coordinator_generateComponentModel-20     37.99m ± 24%   34.29m ± 14%  -9.73% (p=0.009 n=10)

                                      │ bench_main.txt │           bench_hash.txt            │
                                      │      B/op      │     B/op      vs base               │
Coordinator_generateComponentModel-20     34.22Mi ± 0%   31.53Mi ± 0%  -7.88% (p=0.000 n=10)

                                      │ bench_main.txt │           bench_hash.txt            │
                                      │   allocs/op    │  allocs/op   vs base                │
Coordinator_generateComponentModel-20      810.7k ± 0%   713.0k ± 0%  -12.05% (p=0.000 n=10)

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool

Related issues

@swiatekm swiatekm added enhancement New feature or request Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team backport-8.x Automated backport to the 8.x branch with mergify backport-8.16 Automated backport with mergify backport-8.17 Automated backport with mergify labels Dec 3, 2024
@swiatekm swiatekm requested a review from a team as a code owner December 3, 2024 15:09
@elasticmachine

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

mergify bot commented Dec 3, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b chore/config-gen-optimize-hashing upstream/chore/config-gen-optimize-hashing
git merge upstream/main
git push upstream chore/config-gen-optimize-hashing

@swiatekm swiatekm force-pushed the chore/config-gen-optimize-hashing branch from 69dde06 to d8326e3 Compare December 3, 2024 15:14
@swiatekm swiatekm changed the title Chore/config gen optimize hashing Use xxhash instead of sha256 for hashing AST nodes Dec 3, 2024
@swiatekm swiatekm force-pushed the chore/config-gen-optimize-hashing branch from d8326e3 to d7220b4 Compare December 3, 2024 15:30
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go
@swiatekm swiatekm force-pushed the chore/config-gen-optimize-hashing branch from d7220b4 to 5b699bf Compare December 3, 2024 16:36
@@ -58,6 +59,9 @@ type Node interface {
// Hash compute a sha256 hash of the current node and recursively call any children.
Hash() []byte

// Hash64With recursively computes the given hash for the Node and its children
Hash64With(h *xxhash.Digest) error

Why not just replace the usage of Hash()? Why do you have both? Do we need both?

I don't think we need both. If there were a bunch of callers of Hash, then I'd be in favor of keeping it and implementing it using Hash64With, but there aren't, so we can get rid of it. I was planning to do so in a follow up.
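
The alternative mentioned here, keeping Hash and implementing it on top of Hash64With, could look roughly like the sketch below. The `Key` type is hypothetical and the standard library's FNV-1a hasher again stands in for xxHash; the real Node implementations in the transpiler package are not shown in this conversation.

```go
package main

import (
	"fmt"
	"hash"
	"hash/fnv"
)

// Key is a hypothetical node type with a name and a string value.
type Key struct{ Name, Value string }

// Hash64With writes the node's content into the caller-supplied hasher.
func (k Key) Hash64With(h hash.Hash64) error {
	if _, err := h.Write([]byte(k.Name)); err != nil {
		return err
	}
	_, err := h.Write([]byte(k.Value))
	return err
}

// Hash keeps the old []byte-returning signature but delegates to Hash64With.
// It allocates its own hasher, which is the cost the new method avoids.
func (k Key) Hash() []byte {
	h := fnv.New64a() // stand-in for xxHash
	if err := k.Hash64With(h); err != nil {
		return nil
	}
	return h.Sum(nil) // 8 bytes for a 64-bit hash
}

func main() {
	k := Key{Name: "type", Value: "logfile"}
	fmt.Printf("%d-byte hash: %x\n", len(k.Hash()), k.Hash())
}
```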

@blakerouse blakerouse left a comment

Okay, removal in follow-up is good.

@swiatekm swiatekm merged commit 9c13110 into main Dec 9, 2024
14 checks passed
@swiatekm swiatekm deleted the chore/config-gen-optimize-hashing branch December 9, 2024 17:04
mergify bot pushed a commit that referenced this pull request Dec 9, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)
mergify bot pushed a commit that referenced this pull request Dec 9, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go
mergify bot pushed a commit that referenced this pull request Dec 9, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

# Conflicts:
#	go.mod
#	internal/pkg/agent/transpiler/ast.go
swiatekm added a commit that referenced this pull request Dec 9, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

# Conflicts:
#	go.mod
#	internal/pkg/agent/transpiler/ast.go
swiatekm added a commit that referenced this pull request Dec 9, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

Co-authored-by: Mikołaj Świątek <[email protected]>
swiatekm added a commit that referenced this pull request Dec 10, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

# Conflicts:
#	go.mod
#	internal/pkg/agent/transpiler/ast.go

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go
swiatekm added a commit that referenced this pull request Dec 10, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

# Conflicts:
#	go.mod
#	internal/pkg/agent/transpiler/ast.go

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

Co-authored-by: Mikołaj Świątek <[email protected]>
swiatekm added a commit that referenced this pull request Dec 13, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go
swiatekm added a commit that referenced this pull request Dec 13, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go
swiatekm added a commit that referenced this pull request Dec 16, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go
swiatekm added a commit that referenced this pull request Dec 17, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go
swiatekm added a commit that referenced this pull request Dec 17, 2024
# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

(cherry picked from commit 9c13110)

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

# Conflicts:
#	internal/pkg/agent/transpiler/ast.go

Co-authored-by: Mikołaj Świątek <[email protected]>
orestisfl added a commit to orestisfl/elastic-agent that referenced this pull request Jan 24, 2025
Labels
  • backport-8.x: Automated backport to the 8.x branch with mergify
  • backport-8.16: Automated backport with mergify
  • backport-8.17: Automated backport with mergify
  • enhancement: New feature or request
  • Team:Elastic-Agent-Control-Plane: Label for the Agent Control Plane team
4 participants