Releases: fxamacker/circlehash
v0.3.0 (March 16, 2022)
What's Changed
CircleHash64f is stable and used in production. Given the same input data and seed, it will produce the same digest in future versions.
- Combine files and simplify file names by @fxamacker in #12
- Update circlehash64_test.go by @fxamacker in #13
- Update README.md and add a logo (circle with a hash symbol inside)
Full Changelog: v0.2.0...v0.3.0
What is CircleHash?
CircleHash is a family of modern non-cryptographic hash functions.
CircleHash64 is a 64-bit hash with a 64-bit seed. CircleHash64 is fast, simple, and easy to audit. It uses the fractional digits of π as default constants (nothing up my sleeve). It balances speed, digest quality, and maintainability.
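To illustrate what "nothing up my sleeve" means for π-derived constants, here is a minimal sketch that reproduces the first 32 bits of the fractional part of π from Go's `math.Pi`. Which digits CircleHash64 actually uses, and how it uses them, is not stated here; the function name `piFractionBits32` is hypothetical and only shows that such a constant can be checked independently.

```go
package main

import (
	"fmt"
	"math"
)

// piFractionBits32 returns floor((π - 3) * 2^32), the first 32 bits of the
// fractional part of π. float64 carries enough precision for 32 bits.
// Illustration only: this is not taken from the CircleHash source.
func piFractionBits32() uint32 {
	frac := float64(math.Pi - 3) // 0.14159...
	return uint32(math.Floor(math.Ldexp(frac, 32)))
}

func main() {
	fmt.Printf("0x%08X\n", piFractionBits32()) // prints 0x243F6A88
}
```

Anyone can recompute the value, so a reader does not have to trust that the constant was chosen honestly.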
CircleHash64 is based on Google's Abseil C++ library internal hash. CircleHash64 passes every test in SMHasher (demerphq/smhasher, rurban/smhasher, and a stricter private test suite).
Strict Avalanche Criterion (SAC)
Metric | CircleHash64 | Abseil C++ | SipHash-2-4 | XXH64 |
---|---|---|---|---|
SAC worst-bit 0-128 byte inputs (lower % is better) | 0.791% 🥇 w/ 99 bytes | 0.862% w/ 67 bytes | 0.852% w/ 125 bytes | 0.832% w/ 113 bytes |
☝️ Using demerphq/smhasher updated to test all input sizes 0-128 bytes (SAC test will take hours longer to run).
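As a rough illustration of what a worst-bit SAC figure measures, the sketch below flips every input bit of random inputs and records how often each output bit flips. It is not SMHasher's implementation. The bias formula `|2p - 1|`, the trial count, and the FNV-1a stand-in hash (from Go's standard library, not CircleHash64) are all assumptions of this example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
	"math/rand"
)

// hashFunc is any 64-bit hash over a byte slice.
type hashFunc func(data []byte) uint64

// fnv1a is a stand-in hash from the standard library (not CircleHash64).
func fnv1a(data []byte) uint64 {
	h := fnv.New64a()
	h.Write(data)
	return h.Sum64()
}

// worstBitBias estimates the worst avalanche bias for n-byte inputs.
// For every (input bit, output bit) pair it counts how often flipping the
// input bit flips the output bit, then returns the largest |2p - 1| seen.
// 0 is ideal (every flip probability is 50%); 1 means some output bit
// always or never flips.
func worstBitBias(h hashFunc, n, trials int, seed int64) float64 {
	rng := rand.New(rand.NewSource(seed))
	counts := make([][64]int, n*8)
	buf := make([]byte, n)
	for t := 0; t < trials; t++ {
		for j := range buf {
			buf[j] = byte(rng.Intn(256))
		}
		base := h(buf)
		for i := range counts {
			buf[i/8] ^= 1 << (i % 8) // flip input bit i
			diff := base ^ h(buf)
			buf[i/8] ^= 1 << (i % 8) // restore it
			for o := 0; o < 64; o++ {
				counts[i][o] += int(diff >> o & 1)
			}
		}
	}
	worst := 0.0
	for i := range counts {
		for o := 0; o < 64; o++ {
			p := float64(counts[i][o]) / float64(trials)
			worst = math.Max(worst, math.Abs(2*p-1))
		}
	}
	return worst
}

func main() {
	bias := worstBitBias(fnv1a, 16, 2000, 1)
	fmt.Printf("worst-bit bias at 16 bytes: %.3f%%\n", 100*bias)
}
```

FNV-1a scores badly in this sketch: flipping the top bit of the final byte can never change the lowest output bits, so at least one pair never flips. A hash with a small worst-bit bias, like the figures in the table above, has no such pair.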
Hashing Short Inputs With 64-bit Seed
Input size | CircleHash64 (seeded) | XXH3 (seeded) | XXH64 (w/o seed) | SipHash (seeded) |
---|---|---|---|---|
4 bytes | 1.34 GB/s | 1.21 GB/s | 0.877 GB/s | 0.361 GB/s |
8 bytes | 2.70 GB/s | 2.41 GB/s | 1.68 GB/s | 0.642 GB/s |
16 bytes | 5.48 GB/s | 5.21 GB/s | 2.94 GB/s | 1.03 GB/s |
32 bytes | 8.01 GB/s | 7.08 GB/s | 3.33 GB/s | 1.46 GB/s |
64 bytes | 10.3 GB/s | 9.33 GB/s | 5.47 GB/s | 1.83 GB/s |
128 bytes | 12.8 GB/s | 11.6 GB/s | 8.22 GB/s | 2.09 GB/s |
192 bytes | 14.2 GB/s | 9.86 GB/s | 9.71 GB/s | 2.17 GB/s |
256 bytes | 15.0 GB/s | 8.19 GB/s | 10.2 GB/s | 2.22 GB/s |
- Using Go 1.17.7, darwin_amd64, i7-1068NG7 CPU.
- Fastest XXH64 (written in Go+Assembly) doesn't support seed.
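The throughput figures can also be read as time per hash call. The sketch below converts the CircleHash64 (seeded) column to ns/op, assuming 1 GB = 10^9 bytes (so 1 GB/s equals 1 byte per nanosecond). The helper name `nsPerOp` is hypothetical.

```go
package main

import "fmt"

// nsPerOp converts throughput in GB/s at a given input size into the time
// taken by one hash call, in nanoseconds.
// Assumes 1 GB = 1e9 bytes, so bytes / (GB/s) = ns.
func nsPerOp(inputBytes int, gbPerSec float64) float64 {
	return float64(inputBytes) / gbPerSec
}

func main() {
	// Sizes and CircleHash64 (seeded) throughputs copied from the table.
	sizes := []int{4, 8, 16, 32, 64, 128, 192, 256}
	gbps := []float64{1.34, 2.70, 5.48, 8.01, 10.3, 12.8, 14.2, 15.0}
	for i, n := range sizes {
		fmt.Printf("%3d bytes: %5.2f ns/op\n", n, nsPerOp(n, gbps[i]))
	}
}
```

Under that assumption, a call costs roughly 3 ns for 4 to 16 byte inputs and about 17 ns at 256 bytes, which is why GB/s climbs with input size.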
CircleHash64 is ideal for short inputs <= 512 bytes. It was created when a reliable, fast hash for data typically <= 128 bytes was needed. Other designs can hash larger inputs faster, at the cost of lower speed on shorter inputs.
ℹ️ Non-cryptographic hashes should only be used in software designed to properly handle hash collisions. If you require a secure hash, please use a cryptographic hash (like the ones in the SHA-3 standard).
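One common way to handle collisions is to treat the digest as a bucket selector and always compare full keys. The toy table below is a minimal sketch of that idea; every type and function name in it is hypothetical, and `lengthHash` is a deliberately weak stand-in (not CircleHash64) chosen to force collisions.

```go
package main

import "fmt"

// hashFunc maps a key and seed to a 64-bit digest.
type hashFunc func(key string, seed uint64) uint64

// entry keeps the full key so lookups can tell colliding keys apart.
type entry struct {
	key   string
	value int
}

// table is a toy map keyed by digest. Each digest owns a bucket of entries.
type table struct {
	hash    hashFunc
	seed    uint64
	buckets map[uint64][]entry
}

func newTable(h hashFunc, seed uint64) *table {
	return &table{hash: h, seed: seed, buckets: make(map[uint64][]entry)}
}

// put inserts or updates key. The digest only selects the bucket.
func (t *table) put(key string, value int) {
	d := t.hash(key, t.seed)
	for i := range t.buckets[d] {
		if t.buckets[d][i].key == key {
			t.buckets[d][i].value = value
			return
		}
	}
	t.buckets[d] = append(t.buckets[d], entry{key, value})
}

// get compares full keys, so keys with equal digests never get mixed up.
func (t *table) get(key string) (int, bool) {
	for _, e := range t.buckets[t.hash(key, t.seed)] {
		if e.key == key {
			return e.value, true
		}
	}
	return 0, false
}

// lengthHash is a deliberately terrible hash used to force collisions.
func lengthHash(key string, seed uint64) uint64 {
	return uint64(len(key)) ^ seed
}

func main() {
	t := newTable(lengthHash, 42)
	t.put("abc", 1)
	t.put("xyz", 2) // same digest as "abc"
	a, _ := t.get("abc")
	x, _ := t.get("xyz")
	fmt.Println(a, x) // prints: 1 2
}
```

With this structure a collision costs extra comparisons but never returns the wrong value, which is the property the note above asks for.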
v0.2.0 (Feb 24, 2022)
What's Changed
Seeded CircleHash64 reached 10 GB/s at input sizes below 64 bytes and 15 GB/s at 256-byte inputs.
- Optimize CircleHash64 without using assembly language by @fxamacker in #8
- Update ci-go-cover.yml and add CircleHash128 to TODOs by @fxamacker in #9
Full Changelog: v0.1.0...v0.2.0
What is CircleHash?
CircleHash is a family of non-cryptographic hash functions. CircleHash64 uses the fractional digits of π as default constants (nothing up my sleeve). CircleHash64 is fast, simple, and easy to audit/maintain.
CircleHash64 uses CircleHash64f by default, which is based on Google's Abseil C++ library internal hash.
Metric | CircleHash64 | Abseil C++ | SipHash-2-4 | XXH64 |
---|---|---|---|---|
SAC worst-bit 0-128 byte inputs (lower % is better) | 0.791% 🥇 w/ 99 bytes | 0.862% w/ 67 bytes | 0.852% w/ 125 bytes | 0.832% w/ 113 bytes |
☝️ Using demerphq/smhasher updated to test all input sizes 0-128 bytes (SAC test will take hours longer to run).
CircleHash64 is very fast at hashing short inputs with a 64-bit seed.
Input size | CircleHash64 (seeded) | XXH3 (seeded) | XXH64 (w/o seed) | SipHash (seeded) |
---|---|---|---|---|
4 bytes | 1.34 GB/s | 1.21 GB/s | 0.877 GB/s | 0.361 GB/s |
8 bytes | 2.70 GB/s | 2.41 GB/s | 1.68 GB/s | 0.642 GB/s |
16 bytes | 5.48 GB/s | 5.21 GB/s | 2.94 GB/s | 1.03 GB/s |
32 bytes | 8.01 GB/s | 7.08 GB/s | 3.33 GB/s | 1.46 GB/s |
64 bytes | 10.3 GB/s | 9.33 GB/s | 5.47 GB/s | 1.83 GB/s |
128 bytes | 12.8 GB/s | 11.6 GB/s | 8.22 GB/s | 2.09 GB/s |
192 bytes | 14.2 GB/s | 9.86 GB/s | 9.71 GB/s | 2.17 GB/s |
256 bytes | 15.0 GB/s | 8.19 GB/s | 10.2 GB/s | 2.22 GB/s |
- Using Go 1.17.7, darwin_amd64, i7-1068NG7 CPU.
- Results from `go test -bench=. -count=20` and `benchstat`.
- Fastest XXH64 (written in Go+Assembly) doesn't support seed.
CircleHash64 was created when I needed a very fast hash for input sizes typically <= 128 bytes.
ℹ️ Non-cryptographic hashes should only be used in software designed to properly handle hash collisions. If you require a secure hash, please use a cryptographic hash (like the ones in the SHA-3 standard).
Release v0.1.0 (October 25, 2021)
What's new?
- Added `func Hash64Uint64x2(a uint64, b uint64, seed uint64) uint64`.
- Renamed `HashString64` to `Hash64String` for consistent naming with the newly added func.

`Hash64Uint64x2` produces a 64-bit digest from a, b, and seed. The digest is compatible with `Hash64` using a 16-byte input and the same seed.
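The compatibility property can be pictured with a stand-in hash: a specialized two-`uint64` function should return the same digest as the byte-slice function applied to the 16-byte encoding of the same values. The sketch below uses a seeded FNV-1a variant, not CircleHash64, and assumes little-endian byte order, which these notes do not specify. All names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	fnvOffset64 = 14695981039346656037
	fnvPrime64  = 1099511628211
)

// standInHash64 is a seeded FNV-1a variant used only as a placeholder for a
// byte-slice hash. It is NOT CircleHash64.
func standInHash64(b []byte, seed uint64) uint64 {
	h := fnvOffset64 ^ seed
	for _, c := range b {
		h = (h ^ uint64(c)) * fnvPrime64
	}
	return h
}

// standInHash64Uint64x2 hashes two uint64 values without building a slice.
// It feeds the bytes of a, then b, least significant byte first, so the
// digest matches standInHash64 on the 16-byte little-endian encoding.
// The byte order is an assumption of this sketch.
func standInHash64Uint64x2(a, b, seed uint64) uint64 {
	h := fnvOffset64 ^ seed
	for _, v := range [2]uint64{a, b} {
		for i := 0; i < 8; i++ {
			h = (h ^ (v >> (8 * i) & 0xff)) * fnvPrime64
		}
	}
	return h
}

// encode16 returns the 16-byte little-endian encoding of a followed by b.
func encode16(a, b uint64) []byte {
	buf := make([]byte, 16)
	binary.LittleEndian.PutUint64(buf[:8], a)
	binary.LittleEndian.PutUint64(buf[8:], b)
	return buf
}

func main() {
	a, b, seed := uint64(0x0123456789abcdef), uint64(42), uint64(7)
	same := standInHash64(encode16(a, b), seed) == standInHash64Uint64x2(a, b, seed)
	fmt.Println(same) // prints: true
}
```

A property like this lets callers switch between the two entry points without invalidating stored digests.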
Speed comparison using Go 1.16.9 on darwin_amd64 (i7-1068NG7 CPU):
- 1.653 ns/op -- `foo := Hash64Uint64x2(uint64a, uint64b, seed)` 🆕
- 2.473 ns/op -- `foo := uint64a % uint64b`
Speed comparison using Go 1.16.9 on linux_amd64 (Xeon E3-1246 v3 @ 3.5 GHz):
- 2.006 ns/op -- `foo := Hash64Uint64x2(uint64a, uint64b, seed)` 🆕
- 8.555 ns/op -- `foo := uint64a % uint64b`
Release v0.0.2 (October 4, 2021)
What's new?
Update go.mod to lower the Go version requirement from go 1.16 to go 1.15, matching a project about to use CircleHash in production.
What is CircleHash?
CircleHash is a family of non-cryptographic hash functions that pass every test in SMHasher (both rurban/smhasher and demerphq/smhasher). Tests passed include Strict Avalanche Criterion, Bit Independence Criterion, and many others.
CircleHash uses the fractional digits of π as default constants (nothing up my sleeve). The code is simple and easy to audit. I tried to balance competing factors such as speed, digest quality, and maintainability.
CircleHash64 is based on Google's Abseil C++ library. 🚀 Unoptimized CircleHash64 is as fast as the Abseil C++ internal hash. CircleHash64 has reliable results for Strict Avalanche Criterion (SAC).
Metric | CircleHash64 | Abseil C++ | SipHash-2-4 |
---|---|---|---|
SAC worst-bit 0-32 byte inputs (lower % is better) | 0.754% w/ 29 bytes | 0.829% w/ 22 bytes | 0.768% w/ 29 bytes |
☝️ Using demerphq/smhasher updated to test all input sizes 0-32 bytes (tests will take a lot longer to run).
Why CircleHash?
I wanted a very fast, maintainable, and easy-to-audit hash function that's free of backdoors and bugs.
It needed to pass all tests in both demerphq/smhasher and rurban/smhasher. It also needed a sufficiently explained choice of default constants, and it had to avoid over-optimizations that increase the risk of being affected by bad seeds or efficient seed-independent attacks.
Release v0.0.1 (Oct 4, 2021)
What's new?
Code coverage is at 100% and validation tests verify nearly 500,000 digests.
What is CircleHash?
CircleHash is a family of non-cryptographic hash functions that pass every test in SMHasher (both rurban/smhasher and demerphq/smhasher). Tests passed include Strict Avalanche Criterion, Bit Independence Criterion, and many others.
CircleHash uses the fractional digits of π as default constants (nothing up my sleeve). The code is simple and easy to audit. I tried to balance competing factors such as speed, digest quality, and maintainability.
CircleHash64 variants produce 64-bit digests and support 64-bit seeds. They are very fast and guaranteed to produce compatible digests within the same major release (SemVer 2.0).
CircleHash64 uses CircleHash64f by default, which is based on Google's Abseil C++ library. CircleHash64 has good results for Strict Avalanche Criterion (SAC).
Metric | CircleHash64 | Abseil C++ | wyhash_final3 | SipHash-2-4 |
---|---|---|---|---|
SAC worst-bit 0-33 byte inputs (lower % is better) | 0.754% w/ 29 bytes | 0.829% w/ 22 bytes | 0.772% w/ 24 bytes | 0.768% w/ 29 bytes |
☝️ Using demerphq/smhasher updated to test all input sizes 0-33 bytes.
Why CircleHash?
I wanted a very fast, maintainable, and easy-to-audit hash function that's free of backdoors and bugs.
It needed to pass all tests in both demerphq/smhasher and rurban/smhasher. It also needed a sufficiently explained choice of default constants, and it had to avoid over-optimizations that increase the risk of being affected by bad seeds or efficient seed-independent attacks.