top | item 42410574

(no title)

mulle_nat | 1 year ago

I am using fnv-1a to hash Objective-C method selectors, which are generally just identifier characters and zero or multiple ':'. At the time of my research, fnv-1a had the least collisions over my set of "real life" selectors. I think, it could be worthwhile some time, to try out other constants for maybe even less collisions. Is your list of good primes available ? (And maybe also those that are not quite perfect)

discuss

order

keepamovin|1 year ago

Everything is in the source code. I highly doubt any of the good hash functions listed in smhasher3 (ie all tests passed) would collide over identifiers.

So they should all have zero collisions, meaning there’s no ‘least’ among the good quality ones - they’re all equally collissionless (they differ in other tests).

Sounds like an interesting project. What’s its purpose?

mulle_nat|1 year ago

Cool. I forgot to mention, that I am truncating the hash down to 32 bits to hopefully generate tighter CPU instructions. At these few bits, collisions are still rare enough, but they are a concern.

Now my understanding of the choice of prime is that, you are "weighing" the input bits and the computed bits, that will form the hash. So in case of identifiers its very likely that bit 7 of the input is always 0 and say maybe bit 4 is statistically more likely to be 1 by some margin. The other input bits would have some entropy as well. I would expect that certain (imperfect) primes would then aid me to get a better use of the 32 bit space and therefore less risk of a collision for my Objective-C runtime.

You can check out the project here: https://github.com/mulle-objc.

rurban|1 year ago

Spooky also has some good results on common identifiers.

But fnv-1a is in a completely different league. It's recommended for hash tables with other security measures than hash function security. This hash is a typical hybrid, but not universal. umash would be the perfect hybrid: insecure, pretty fast, passes all tests, universal

mulle_nat|1 year ago

Thanks for the umash tip.