-
This does speak to the raw speeds: the SipHash algorithm is slightly slower than xxhash, which shows when hashing large files. The fact that there is still an advantage for in-memory objects just means that the memory allocation done during serialization dominates in that case.
-
https://eprint.iacr.org/2012/351.pdf says SipHash is designed for small data, and you did mention that you found it to be slower on large files. But the reprex below shows something peculiar: `siphash13()` is almost exactly twice as slow as `digest()` for a 750M file, but faster than `digest()` when the exact same object is already in memory. That makes me wonder if the findings below are due to file processing (a duplicated step?) rather than the underlying algorithm.

If this can be solved, what is your sense about SipHash vs xxhash64 in terms of speed on files larger than 1 GB?
Created on 2024-03-26 with [reprex v2.1.0](https://reprex.tidyverse.org)