(no title)
jlebar | 10 months ago
This technique is orthogonal to integer mod. Indeed the author multiplies by their magic constant and then does an integer mod to map into their hashtable's buckets.
This technique is actually just applying a fast integer hash on the input keys to the hashtable before mapping the keys to buckets. You can then map to buckets however you want.
The additional hash is useful if and only if the input hash function for your table's keys doesn't appear to be a random function, i.e. it doesn't mix its bits for whatever reason. If your input hash functions are indeed random then this is a (small but perhaps measurable) waste of time.
Using prime-numbered table sizes is another way to accomplish basically the same thing. Dividing the input hash key by a prime forces you to look at all the bits of the input. In practice these are written as division by a constant, so they use multiplies and shifts. It's basically a hash function. (Though I'd use multiply by a magic number over divide by a prime, mul alone should be faster.)
Relatedly see this post by Daniel Lemire about an alternative to integer mod, https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-... which is interesting if your number of buckets is not a power of 2 for some reason.
sdenton4|10 months ago
So maybe one way of looking at this is as an efficient fusion operation, which doesn't look special when you look at the ops in isolation, but combine to something that is both fast and advised problems with input patterns.
ithkuil|10 months ago
I think there are some hashing functions around that are already designed to solve that problem at "step 1".
So the question just boils down to which is faster