That's an example of a fair number of accumulators that are stored as XXHash goes through its input buffer.
Many modern hash functions store more state/accumulators than they used to. Previous generations of hash functions would often just have one or two accumulators and run through the data. Many modern hash functions might even store multiple wider SIMD variables for better mixing.
And if you're storing enough state that it doesn't fit in your registers, the CPU will put it into the data cache.
_Fitting_ in the instruction cache isn't hard, but you'd also ideally let there be room for some other code as well :-) For a hash map lookup, where the hashing is frequently inlined a couple of times, code size matters.
Dylan16807|9 months ago
Yes, I'm talking about the data cache.
> But there are ways to make more or less efficient use of the data cache.
How?
You need to touch every byte of input, and nothing should be faster than going right through from start to end.
benwills|9 months ago
This is a minor example, but since you asked...
https://github.com/Cyan4973/xxHash/blob/dev/xxhash.h#L6432
That's an example of a fair number of accumulators that are stored as XXHash goes through its input buffer.
Many modern hash functions store more state/accumulators than they used to. Previous generations of hash functions would often just have one or two accumulators and run through the data. Many modern hash functions might even store multiple wider SIMD variables for better mixing.
And if you're storing enough state that it doesn't fit in your registers, the CPU will put it into the data cache.
Sesse__|9 months ago