top | item 29039386

anonova | 4 years ago

A detailed analysis of Meow Hash: https://peter.website/meow-hash-cryptanalysis

It's not the highest-quality hash function (see the SMHasher benchmarks), but it is fast. A great alternative is XXH3 (https://cyan4973.github.io/xxHash/), which has seen far more usage in practice.

jdcarter|4 years ago

I'm using XXHash3 for verifying data integrity of large (10+MB) blobs. Very fast and appears to work very well in my testing--it's never missed any bit error I've thrown at it.

Aside: when storing hashes, be sure to store the hash type as well so that you can change it later if needed, e.g. "xxh3-[hash value]". RFC 6920 also has things to say about storing hash types and values, although I haven't seen its format in common use.
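A minimal sketch of that "type-value" idea, for illustration only. The helper names `tag_digest` and `verify` are hypothetical, and it uses SHA-256 from Python's standard `hashlib` as a stand-in, since XXH3 would need a third-party package. The separator and layout are assumptions modelled on the "xxh3-[hash value]" example above, not the RFC 6920 format.

```python
import hashlib


def tag_digest(data: bytes, algo: str = "sha256") -> str:
    """Return '<algo>-<hexdigest>' so the hash type travels with the value."""
    return f"{algo}-{hashlib.new(algo, data).hexdigest()}"


def verify(data: bytes, tagged: str) -> bool:
    """Recompute the digest using the algorithm named in the stored tag."""
    # Split on the last hyphen: a hex digest never contains one,
    # but an algorithm name might.
    algo, _, expected = tagged.rpartition("-")
    if algo not in hashlib.algorithms_available:
        raise ValueError(f"unknown hash type: {algo!r}")
    return hashlib.new(algo, data).hexdigest() == expected


stored = tag_digest(b"some large blob")
print(stored.split("-")[0])                 # the algorithm name is recoverable
print(verify(b"some large blob", stored))   # unchanged data verifies
print(verify(b"some large bloc", stored))   # altered data does not
```

Because the algorithm name is part of the stored string, old records can keep verifying under their original hash while new records move to a different one.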

njt|4 years ago

> be sure to store the hash type as well so that you can change it later if needed

Thanks for sharing this. I'd been doing this on my own for my own stuff (e.g. foo.txt-xxh32-ea79e094), but it's good to know someone else has thought it through.

I ran into the problem once where someone had named some files foo-fb490c or something similar without any annotation, and when there was a problem, it took a while to figure out they were using truncated sha256 hashes.
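A rough sketch of how an annotated name like foo.txt-xxh32-ea79e094 can be taken apart again, and why the unannotated foo-fb490c form cannot. The pattern and the function name `split_tagged_name` are assumptions for illustration; they take the layout to be name, lowercase alphanumeric hash type, then lowercase hex digest, separated by hyphens.

```python
import re

# Assumed layout: <name>-<algo>-<hexdigest>
TAGGED = re.compile(r"^(?P<name>.+)-(?P<algo>[a-z0-9]+)-(?P<digest>[0-9a-f]+)$")


def split_tagged_name(filename: str):
    """Return the name, hash type, and digest, or None if untagged."""
    m = TAGGED.match(filename)
    return m.groupdict() if m else None


print(split_tagged_name("foo.txt-xxh32-ea79e094"))
print(split_tagged_name("foo-fb490c"))  # no hash type to recover
```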

DeathArrow|4 years ago

Useless analysis, since the author says it's not a cryptographic hash but useful as a fast hash for change detection.

"we wanted a fast, non-cryptographic hash for use in change detection and deduplication"

>A great alternative is XXH3

Meow Hash is twice as fast.

MauranKilom|4 years ago

If you had made it one section into the analysis, you would have seen that at the time MeowHash made certain cryptographic claims that the author set out to disprove.

The readme has since been updated. I didn't check whether any algorithmic changes were made on top, but the discussion of the analysis on GitHub didn't point to a lot of low-hanging fruit.

LeoPanthera|4 years ago

It's not useless analysis, because even for non-cryptographic hashes you want the likelihood of any arbitrary hash to be roughly equal. A hash function which "prefers" certain outputs has a far higher probability of collision.
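To put a number on that: if output i occurs with probability p_i, two independent inputs collide with probability equal to the sum of p_i squared, which is smallest when every output is equally likely. A small sketch with made-up weights over a toy 16-value output space (the function name and the weights are illustrative assumptions, not measurements of any real hash):

```python
from fractions import Fraction


def collision_probability(weights):
    """Chance that two independent inputs land on the same output.

    `weights` are relative integer frequencies of each output value;
    the result is the sum of the squared output probabilities.
    """
    total = sum(weights)
    return sum(Fraction(w, total) ** 2 for w in weights)


uniform = collision_probability([1] * 16)        # all 16 outputs equally likely
skewed = collision_probability([4] + [1] * 15)   # one output 4x as likely

print(uniform)  # 1/16
print(skewed)   # 31/361
```

Here 31/361 is roughly 0.086 against 0.0625 for the uniform case, so even a single "preferred" output raises the pairwise collision rate noticeably.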

IncRnd|4 years ago

Don't you think asset planting is an attack against a game's pipeline?

The author of the article's page claims the hash is not cryptographic but actually goes on to make security claims about the hash. People who do not understand cryptography should be careful about making such claims. The author appears to understand this more than your comment demonstrates.

For example, a claim about change detection is a cryptographic claim of detecting preimage attacks. In a threat model, a security professional would determine whether a first preimage or a second preimage attack is what should be guarded in attack scenarios. Then, the professional would help with analysis, determining mitigations, defense in depth, and prioritization of fixing the vulnerabilities exposed by how the hash is used.

A hash cannot be considered standalone. It is the architecture and use-case where the hash's security properties are used to determine what security properties of the application are fulfilled.

So, if the author is correct, which seems to be the case, then meowhash should not be used in a production environment outside of the most simplistic checks. It seems faster for its intended use case to simply check for a single bit difference between two images - no hash required.

jrochkind1|4 years ago

What determines whether a hash is "cryptographic"? What would make it suitable for change-detection but not be "cryptographic"? Is the claim here that it would not be suitable for detecting "malicious" changes, but is still suitable for detecting "natural" changes?

ncann|4 years ago

Saying it's twice as fast is rather misleading? They can both hash as fast as RAM speed allows anyway. And if it's something in cache I doubt one is significantly better than the other.

iratewizard|4 years ago

Meow hash is also written by people who initially thought SHA-1 was acceptable for large scale change hashing.

ilitirit|4 years ago

I can vouch for xxHash (it does what it says on the can), but I'm really curious to hear from those who have experience with meow hash.