Releases: fxamacker/circlehash

v0.3.0 (March 16, 2022)

16 Mar 05:39
0229119

What's Changed

CircleHash64f is stable and used in production. Given the same input data and seed, it will produce the same digest in future versions.

  • Combine files and simplify file names by @fxamacker in #12
  • Update circlehash64_test.go by @fxamacker in #13
  • Update README.md and add a logo (circle with a hash symbol inside)

Full Changelog: v0.2.0...v0.3.0

What is CircleHash?

CircleHash is a family of modern non-cryptographic hash functions.

CircleHash64 is a 64-bit hash with a 64-bit seed. CircleHash64 is fast, simple, and easy to audit. It uses the fractional digits of π as default constants (nothing up my sleeve). It balances speed, digest quality, and maintainability.
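
To illustrate the "nothing up my sleeve" idea, the sketch below recomputes the first 64 fractional bits of π using Machin's formula and fixed-point math/big arithmetic, so anyone can check a π-derived constant independently. These notes do not say which digits CircleHash64 uses or how it groups them into constants, so this is only a way to reproduce such digits; the function names are hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
)

// arctanInv returns arctan(1/x) scaled by 2^prec, summed from the
// Taylor series 1/x - 1/(3x^3) + 1/(5x^5) - ...
func arctanInv(x int64, prec uint) *big.Int {
	scale := new(big.Int).Lsh(big.NewInt(1), prec)
	x2 := big.NewInt(x * x)
	term := new(big.Int).Quo(scale, big.NewInt(x)) // 1/x
	sum := new(big.Int).Set(term)
	for k := int64(1); term.Sign() != 0; k++ {
		term.Quo(term, x2) // 1/x^(2k+1)
		t := new(big.Int).Quo(term, big.NewInt(2*k+1))
		if k%2 == 1 {
			sum.Sub(sum, t)
		} else {
			sum.Add(sum, t)
		}
	}
	return sum
}

// piFraction64 returns the first 64 bits of the fractional part of pi.
// It uses Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239).
func piFraction64() uint64 {
	const prec = 192 // extra guard bits absorb truncation error
	pi := new(big.Int).Mul(arctanInv(5, prec), big.NewInt(16))
	pi.Sub(pi, new(big.Int).Mul(arctanInv(239, prec), big.NewInt(4)))

	// Drop the integer part (3) and keep the top 64 fractional bits.
	mask := new(big.Int).Lsh(big.NewInt(1), prec)
	mask.Sub(mask, big.NewInt(1))
	frac := new(big.Int).And(pi, mask)
	frac.Rsh(frac, prec-64)
	return frac.Uint64()
}

func main() {
	fmt.Printf("%#016x\n", piFraction64()) // prints 0x243f6a8885a308d3
}
```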

CircleHash64 is based on Google's Abseil C++ library internal hash. CircleHash64 passes every test in SMHasher (demerphq/smhasher, rurban/smhasher, and a stricter private test suite).

Strict Avalanche Criterion (SAC)

SAC worst-bit, 0-128 byte inputs (lower % is better):

Hash           SAC worst-bit   Input size
CircleHash64   0.791% 🥇       99 bytes
Abseil C++     0.862%          67 bytes
SipHash-2-4    0.852%          125 bytes
XXH64          0.832%          113 bytes

☝️ Using demerphq/smhasher updated to test all input sizes 0-128 bytes (SAC test will take hours longer to run).

Hashing Short Inputs With 64-bit Seed

Input size   CircleHash64 (seeded)   XXH3 (seeded)   XXH64 (w/o seed)   SipHash (seeded)
4 bytes      1.34 GB/s               1.21 GB/s       0.877 GB/s         0.361 GB/s
8 bytes      2.70 GB/s               2.41 GB/s       1.68 GB/s          0.642 GB/s
16 bytes     5.48 GB/s               5.21 GB/s       2.94 GB/s          1.03 GB/s
32 bytes     8.01 GB/s               7.08 GB/s       3.33 GB/s          1.46 GB/s
64 bytes     10.3 GB/s               9.33 GB/s       5.47 GB/s          1.83 GB/s
128 bytes    12.8 GB/s               11.6 GB/s       8.22 GB/s          2.09 GB/s
192 bytes    14.2 GB/s               9.86 GB/s       9.71 GB/s          2.17 GB/s
256 bytes    15.0 GB/s               8.19 GB/s       10.2 GB/s          2.22 GB/s
  • Using Go 1.17.7, darwin_amd64, i7-1068NG7 CPU.
  • Fastest XXH64 (written in Go+Assembly) doesn't support seed.

CircleHash64 is ideal for short inputs of <= 512 bytes. It was created when a reliable, fast hash for data typically <= 128 bytes was needed. Other designs can hash larger inputs faster, at the cost of lower speed on shorter inputs.

ℹ️ Non-cryptographic hashes should only be used in software designed to properly handle hash collisions. If you require a secure hash, please use a cryptographic hash (like the ones in the SHA-3 standard).

v0.2.0 (Feb 24, 2022)

25 Feb 04:23
ac446d1

What's Changed

Seeded CircleHash64 reached 10 GB/s on inputs smaller than 64 bytes and 15 GB/s on 256-byte inputs.

  • Optimize CircleHash64 without using assembly language by @fxamacker in #8
  • Update ci-go-cover.yml and add CircleHash128 to TODOs by @fxamacker in #9

Full Changelog: v0.1.0...v0.2.0

What is CircleHash?

CircleHash is a family of non-cryptographic hash functions. CircleHash64 uses the fractional digits of π as default constants (nothing up my sleeve). CircleHash64 is fast, simple, and easy to audit/maintain.

CircleHash64 uses CircleHash64f by default, which is based on Google's Abseil C++ library internal hash.

SAC worst-bit, 0-128 byte inputs (lower % is better):

Hash           SAC worst-bit   Input size
CircleHash64   0.791% 🥇       99 bytes
Abseil C++     0.862%          67 bytes
SipHash-2-4    0.852%          125 bytes
XXH64          0.832%          113 bytes

☝️ Using demerphq/smhasher updated to test all input sizes 0-128 bytes (SAC test will take hours longer to run).

CircleHash64 is very fast at hashing short inputs with a 64-bit seed.

Input size   CircleHash64 (seeded)   XXH3 (seeded)   XXH64 (w/o seed)   SipHash (seeded)
4 bytes      1.34 GB/s               1.21 GB/s       0.877 GB/s         0.361 GB/s
8 bytes      2.70 GB/s               2.41 GB/s       1.68 GB/s          0.642 GB/s
16 bytes     5.48 GB/s               5.21 GB/s       2.94 GB/s          1.03 GB/s
32 bytes     8.01 GB/s               7.08 GB/s       3.33 GB/s          1.46 GB/s
64 bytes     10.3 GB/s               9.33 GB/s       5.47 GB/s          1.83 GB/s
128 bytes    12.8 GB/s               11.6 GB/s       8.22 GB/s          2.09 GB/s
192 bytes    14.2 GB/s               9.86 GB/s       9.71 GB/s          2.17 GB/s
256 bytes    15.0 GB/s               8.19 GB/s       10.2 GB/s          2.22 GB/s
  • Using Go 1.17.7, darwin_amd64, i7-1068NG7 CPU.
  • Results from go test -bench=. -count=20 and benchstat
  • Fastest XXH64 (written in Go+Assembly) doesn't support seed.
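
The numbers above come from go test -bench and benchstat. As a rough sketch of how a GB/s figure can be derived for a given input size, the program below uses testing.Benchmark with SetBytes so it runs standalone; in a real benchmark file the same loop would sit in a BenchmarkXxx function. `standInHash64` is a hypothetical FNV-1a placeholder, not CircleHash64, so its throughput says nothing about the library.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the compiler from optimizing the hash calls away.
var sink uint64

// standInHash64 is a seeded FNV-1a variant used only as a placeholder.
func standInHash64(data []byte, seed uint64) uint64 {
	h := uint64(14695981039346656037) ^ seed
	for _, c := range data {
		h ^= uint64(c)
		h *= 1099511628211
	}
	return h
}

// throughputGBps benchmarks hashing a size-byte input and converts the
// result to gigabytes per second (1 GB = 1e9 bytes).
func throughputGBps(size int) float64 {
	data := make([]byte, size)
	r := testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(size)) // bytes processed per iteration
		for i := 0; i < b.N; i++ {
			sink = standInHash64(data, uint64(i))
		}
	})
	return float64(r.Bytes) * float64(r.N) / r.T.Seconds() / 1e9
}

func main() {
	for _, size := range []int{16, 128} {
		fmt.Printf("%3d bytes: %.2f GB/s\n", size, throughputGBps(size))
	}
}
```

A single run like this is noisy. Repeating it many times and summarizing the runs statistically, as the -count=20 and benchstat combination does, gives more trustworthy figures.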

CircleHash64 was created when I needed a very fast hash for input sizes typically <= 128 bytes.

ℹ️ Non-cryptographic hashes should only be used in software designed to properly handle hash collisions. If you require a secure hash, please use a cryptographic hash (like the ones in the SHA-3 standard).

Release v0.1.0 (October 25, 2021)

26 Oct 00:48
f6cc188

What's new?

  • Added func Hash64Uint64x2(a uint64, b uint64, seed uint64) uint64
  • Renamed HashString64 to Hash64String for consistent naming with the newly added func

Hash64Uint64x2 produces a 64-bit digest from a, b, and seed. The digest is compatible with Hash64 using a 16-byte input and the same seed.
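
The compatibility property can be pictured with the sketch below, which pairs a byte-slice hash with a two-integer variant and checks that both give the same digest. Both functions are hypothetical FNV-1a stand-ins so the example runs without dependencies; they are not the CircleHash algorithm. The little-endian layout of a followed by b is also an assumption, since these notes do not state how the 16-byte input is encoded.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	offset64 = uint64(14695981039346656037)
	prime64  = uint64(1099511628211)
)

// standInHash64 hashes a byte slice (placeholder for a Hash64-style func).
func standInHash64(data []byte, seed uint64) uint64 {
	h := offset64 ^ seed
	for _, c := range data {
		h ^= uint64(c)
		h *= prime64
	}
	return h
}

// standInHash64Uint64x2 hashes a and b directly, without building a
// byte slice, by consuming the same 16 bytes in the same order.
func standInHash64Uint64x2(a, b, seed uint64) uint64 {
	h := offset64 ^ seed
	for _, v := range [2]uint64{a, b} {
		for i := uint(0); i < 8; i++ {
			h ^= (v >> (8 * i)) & 0xff // little-endian byte i of v
			h *= prime64
		}
	}
	return h
}

// encodePair lays out a then b as 16 little-endian bytes (assumed layout).
func encodePair(a, b uint64) []byte {
	buf := make([]byte, 16)
	binary.LittleEndian.PutUint64(buf[0:8], a)
	binary.LittleEndian.PutUint64(buf[8:16], b)
	return buf
}

func main() {
	a, b, seed := uint64(0x0123456789abcdef), uint64(42), uint64(7)
	fromBytes := standInHash64(encodePair(a, b), seed)
	fromInts := standInHash64Uint64x2(a, b, seed)
	fmt.Println(fromBytes == fromInts) // prints true
}
```

A specialized function like this lets callers with two integer keys skip allocating and filling a byte slice, while still producing digests that match the general-purpose path.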

Speed comparison using Go 1.16.9 on darwin_amd64 (i7-1068NG7 CPU):

  • 1.653 ns/op -- foo := Hash64Uint64x2(uint64a, uint64b, seed) 🆕
  • 2.473 ns/op -- foo := uint64a % uint64b

Speed comparison using Go 1.16.9 on linux_amd64 (Xeon E3-1246 v3 @ 3.5 GHz):

  • 2.006 ns/op -- foo := Hash64Uint64x2(uint64a, uint64b, seed) 🆕
  • 8.555 ns/op -- foo := uint64a % uint64b

Release v0.0.2 (October 4, 2021)

04 Oct 17:57
d31f5fb

What's new?

Update go.mod to lower requirement from go 1.16 to go 1.15 to match a project about to use CircleHash in production.

What is CircleHash?

CircleHash is a family of non-cryptographic hash functions that pass every test in SMHasher (both rurban/smhasher and demerphq/smhasher). Tests passed include Strict Avalanche Criterion, Bit Independence Criterion, and many others.

CircleHash uses the fractional digits of π as default constants (nothing up my sleeve). The code is simple and easy to audit. I tried to balance competing factors such as speed, digest quality, and maintainability.

CircleHash64 is based on Google's Abseil C++ library. 🚀 Unoptimized CircleHash64 is as fast as the Abseil C++ internal hash. CircleHash64 has reliable results for Strict Avalanche Criterion (SAC).

SAC worst-bit, 0-32 byte inputs (lower % is better):

Hash           SAC worst-bit   Input size
CircleHash64   0.754%          29 bytes
Abseil C++     0.829%          22 bytes
SipHash-2-4    0.768%          29 bytes

☝️ Using demerphq/smhasher updated to test all input sizes 0-32 bytes (tests will take a lot longer to run).

Why CircleHash?

I wanted a very fast, maintainable, and easy-to-audit hash function that's free of backdoors and bugs.

It needed to pass all tests in both demerphq/smhasher and rurban/smhasher. It also needed a sufficiently explained choice of default constants, and it had to avoid over-optimizations that increase the risk of being affected by bad seeds or efficient seed-independent attacks.

Release v0.0.1 (Oct 4, 2021)

04 Oct 15:10
2b11b2e

What's new?

Code coverage is at 100% and validation tests verify nearly 500,000 digests.

What is CircleHash?

CircleHash is a family of non-cryptographic hash functions that pass every test in SMHasher (both rurban/smhasher and demerphq/smhasher). Tests passed include Strict Avalanche Criterion, Bit Independence Criterion, and many others.

CircleHash uses the fractional digits of π as default constants (nothing up my sleeve). The code is simple and easy to audit. I tried to balance competing factors such as speed, digest quality, and maintainability.

CircleHash64 variants produce 64-bit digests and support 64-bit seeds. They are very fast and guaranteed to produce compatible digests within the same major release (SemVer 2.0).

CircleHash64 uses CircleHash64f by default, which is based on Google's Abseil C++ library. CircleHash64 has good results for Strict Avalanche Criterion (SAC).

SAC worst-bit, 0-33 byte inputs (lower % is better):

Hash            SAC worst-bit   Input size
CircleHash64    0.754%          29 bytes
Abseil C++      0.829%          22 bytes
wyhash_final3   0.772%          24 bytes
SipHash-2-4     0.768%          29 bytes

☝️ Using demerphq/smhasher updated to test all input sizes 0-33 bytes.

Why CircleHash?

I wanted a very fast, maintainable, and easy-to-audit hash function that's free of backdoors and bugs.

It needed to pass all tests in both demerphq/smhasher and rurban/smhasher. It also needed a sufficiently explained choice of default constants, and it had to avoid over-optimizations that increase the risk of being affected by bad seeds or efficient seed-independent attacks.