top | item 40553706

siilats | 1 year ago

It’s the easiest thing for intelligence agencies to scan all your messages. They just need to submit a few million fake “content id” hashes, and your phone will automatically share the images that match. Nobody can tell whether a content id hash is of a photo of a document or a photo of a person; it’s just a 256-bit hash. This is so easily abused. I bet the way it’s implemented, it doesn’t have enough resolution to read text, so one evil content id hash will match any photo of any document or screenshot you have taken. So essentially your WhatsApp client will send every screenshot of a text document to the NSA.

stavros|1 year ago

A few million fake "content id" hashes still give them only a 1/100000000000000000000000000000000000000000000000000000000000000000000000000000 chance that any one of those hashes will match. 256 bits is a lot. Besides, what are they going to do? Sift through random vacation photos that happened to match this "ten jackpots in a row" chance?
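
For a rough sense of the scale, here is a back-of-the-envelope sketch. It assumes the hash behaves like a uniformly random 256-bit value and that only exact matches count; the watchlist size of five million is an illustrative stand-in for "a few million", not a known figure.

```python
from math import log10

HASH_BITS = 256             # hash width discussed in this thread
WATCHLIST_SIZE = 5_000_000  # assumed stand-in for "a few million" fake hashes


def exact_match_probability(watchlist_size: int, hash_bits: int) -> float:
    """Chance that one uniformly random hash equals any entry on the watchlist."""
    return watchlist_size / 2 ** hash_bits


p = exact_match_probability(WATCHLIST_SIZE, HASH_BITS)
print(f"per-photo match probability is about 10^{log10(p):.0f}")
```

The estimate only holds if unrelated images get unrelated hashes. A hash built to tolerate rescaling or cropping deliberately breaks that assumption.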

gorbypark|1 year ago

I think OP is saying that the algorithm is designed in such a way that it can match “visually similar” photos/content. The idea is that you can’t evade it by rescaling, cropping or otherwise slightly changing a photo, as you could if this were just a regular SHA hash of the file. OP’s point is that it might be possible to create an “evil hash” of a document that matches a large percentage of documents, because the hash obviously doesn’t have enough bits to actually represent the content. So if you have a hash of “white document with some black text” (for example, if looking for image scans of documents) and add it to the db of “watched” hashes, you could in theory hoover up documents.

A quick search didn’t lead me to any proofs of concept for this idea, but on the surface (I don’t have any knowledge of the hashing algorithm used in these content filters) it seems plausible, depending on a lot of factors.
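
To make the idea concrete, here is a toy sketch using a simple "average hash" on synthetic 32x32 grayscale images. This is an assumption for illustration only: it is not the algorithm used by WhatsApp or any real content filter, and the images, the 64-bit hash size, and `MATCH_THRESHOLD` are all hypothetical. It shows two pages with different "text" landing a few bits apart while their SHA-256 digests differ.

```python
import hashlib

SIZE = 32            # toy images are SIZE x SIZE grayscale pixels
GRID = 8             # hash grid is GRID x GRID, so the hash has 64 bits
BLOCK = SIZE // GRID
MATCH_THRESHOLD = 8  # hypothetical: hashes within 8 bits count as a match


def make_document(is_ink, last_line_end):
    """White page with two 'paragraphs'; is_ink picks the glyph pattern."""
    image = [[255] * SIZE for _ in range(SIZE)]
    for row in range(SIZE):
        for col in range(SIZE):
            block_row, block_col = row // BLOCK, col // BLOCK
            line_end = last_line_end if block_row == 5 else 6
            in_text = block_row in (1, 2, 4, 5) and 1 <= block_col <= line_end
            if in_text and is_ink(row, col):
                image[row][col] = 0
    return image


def make_photo():
    """Stand-in for a non-document image: dark left half, bright right half."""
    return [[0 if col < SIZE // 2 else 255 for col in range(SIZE)]
            for _ in range(SIZE)]


def average_hash(image):
    """Shrink to GRID x GRID by block means, then threshold at the global mean."""
    cells = []
    for block_row in range(GRID):
        for block_col in range(GRID):
            total = sum(image[block_row * BLOCK + dr][block_col * BLOCK + dc]
                        for dr in range(BLOCK) for dc in range(BLOCK))
            cells.append(total / (BLOCK * BLOCK))
    mean = sum(cells) / len(cells)
    return [1 if cell > mean else 0 for cell in cells]


def hamming(a, b):
    """Number of bit positions where two hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def sha256_hex(image):
    """Exact cryptographic digest of the raw pixels, for comparison."""
    return hashlib.sha256(bytes(p for row in image for p in row)).hexdigest()


doc_a = make_document(lambda r, c: (r + c) % 2 == 0, last_line_end=6)
doc_b = make_document(lambda r, c: c % 4 != 3, last_line_end=3)
photo = make_photo()

watched = average_hash(doc_a)  # the hypothetical "evil hash" on the watchlist
dist_doc = hamming(watched, average_hash(doc_b))
dist_photo = hamming(watched, average_hash(photo))

print(sha256_hex(doc_a) == sha256_hex(doc_b))  # False: exact hashes differ
print(dist_doc, dist_doc <= MATCH_THRESHOLD)      # 3 True
print(dist_photo, dist_photo <= MATCH_THRESHOLD)  # 32 False
```

In this toy setup the hash captures page layout (margins, where the paragraphs sit) rather than the text itself, which is the property the "evil hash" idea relies on. Whether a production perceptual hash is coarse enough for one watched hash to match many real documents is exactly the open question.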