top | item 28219068

Hash collision in Apple NeuralHash model

1389 points | sohkamyung | 4 years ago | github.com | reply

696 comments

[+] topynate|4 years ago|reply
Second preimage attacks are trivial because of how the algorithm works. The image goes through a neural network (one to which everyone has access), the output vector is put through a linear transformation, and that vector is binarized, then cryptographically hashed. It's trivial to perturb any image you might wish so as to be close to the original output vector. This will result in it having the same binarization, hence the same hash. I believe the neural network is a pretty conventional convolutional one, so adversarial perturbations will exist that are invisible to the naked eye.

This is useful for two purposes I can think of. One, you can randomize all the vectors on all of your images. Two, you can make problems for others by giving them harmless-looking images that have been cooked to give particular hashes. I'm not sure how bad those problems would be – at some point a police officer does have to look at the image in order to get probable cause. Perhaps it could lead to your Apple account being suspended, however.
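The pipeline described above can be sketched end to end. Everything here (the stand-in embedding, the random projection matrix, the 96-bit width) is a toy assumption for illustration, not Apple's actual model:

```python
import hashlib
import numpy as np

rng = np.random.default_rng(0)

def embed(image: np.ndarray) -> np.ndarray:
    """Stand-in for the convolutional network: any deterministic feature map."""
    return image.reshape(-1)[:128].astype(np.float64)

P = rng.standard_normal((96, 128))  # fixed linear transformation (96-bit output)

def neural_hash(image: np.ndarray) -> str:
    v = P @ embed(image)                 # network output -> linear transform
    bits = (v > 0).astype(np.uint8)      # binarization
    return hashlib.sha256(bits.tobytes()).hexdigest()  # cryptographic hash

# Any perturbation that leaves every component of v on the same side of
# zero produces the identical final hash:
img = rng.random((16, 16))
assert neural_hash(img) == neural_hash(img + 1e-9)
```

An adversarial perturbation then amounts to nudging pixels until the projected vector of one image crosses into the sign pattern of another; the cryptographic hash at the end adds nothing once the binarized vectors match.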

[+] Kalium|4 years ago|reply
A police raid on a person's home, or even a gentler thorough search, can be enough to quite seriously disrupt a person's life. Certainly having the police walk away with all your electronics in evidence bags will complicate trying to work remotely.

Of course, this is assuming everything works as intended and they don't find anything else they can use to charge you with something as they search your home. If you smoke cannabis while being in the wrong state, you're now in several more kinds of trouble.

[+] e_proxus|4 years ago|reply
Wouldn’t it also just be possible to turn a jailbroken iDevice into a CSAM cleaner/hider?

You could take actual CSAM, check if it matches the hashes and keep modifying the material until it doesn’t (adding borders, watermarking, changing dimensions etc.). Then just save it as usual without any risk.

[+] tambourine_man|4 years ago|reply
NeuralHashes are far from my area of expertise, but I've been following Apple closely ever since its foundation and have probably watched every public video of Craig since the NeXT takeover, so here is my take: I've never seen him as off balance as in his latest interview with Joanna Stern. Not even in the infamous “shaking mouse hand close-up” of the early days.

Whatever you say about Apple, they are an extremely well-oiled communication machine. Every C-level phrase has a well-thought-out message to deliver.

This interview was a train wreck. Joanna kept asking him to explain it, please, in simple terms, to a hesitant and inarticulate Craig. It was so bad that she had to produce infographics to fill the communication void left by Apple.

They usually do their best to “take control” of the narrative. They were clearly caught way off guard here. And that's revealing.

[+] cwizou|4 years ago|reply
I think they clearly didn't anticipate that people would perceive it as a breach of trust - that their device was working against them (even for a good cause, against the worst people).

And because of this they calibrated their communication completely wrong, focusing on the on device part as being more private. Using the same line of thinking they use for putting Siri on device.

And the follow-up was an uncoordinated mess that didn't help either (as you rightly pointed out with Craig's interview). In the Neuenschwander interview [1], he stated this:

> The hash list is built into the operating system, we have one global operating system and don’t have the ability to target updates to individual users and so hash lists will be shared by all users when the system is enabled.

This still has me confused. Here's my understanding so far (please feel free to correct me):

- Apple is shipping a neural network trained on the dataset that generates NeuralHashes

- Apple also ships (where ?) a "blinded" (by an elliptic-curve algorithm) lookup table that matches (all possible?!) NeuralHashes to a key

- This key is used to encrypt the NeuralHash and the derivative image (the one that would be used in manual review), and this bundle is called the voucher

- A final check is done server-side, using the secret that was used to generate the elliptic curve to reverse the NeuralHash and check it against the known database

- If 30 or more are detected, decrypt all vouchers and send the derivative images to manual review.

I think I'm missing something regarding the blinded table, as I don't see what it brings to the table in that scenario, apart from adding a complex key generation for the vouchers. If that table only contained the NeuralHashes of known CSAM images as keys, that would be as good as giving out the list, since the model is easily extracted. And if it's not a table lookup but just a cryptographic function, I don't see where the blinded table is coming from in Apple's documentation [2].
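For what it's worth, here's my toy picture of what the blinding could buy, using modular exponentiation as a stand-in for the elliptic curve (this is an assumption about the general idea, not Apple's actual PSI protocol): the device can hold the table without being able to read the hash list out of it.

```python
import hashlib
import secrets

p = 2**127 - 1                       # toy Mersenne prime (far too small for real use)
g = 3                                # toy generator
s = secrets.randbelow(p - 2) + 1     # server-side blinding secret

def h(data: bytes) -> int:
    """96-bit toy NeuralHash stand-in."""
    return int.from_bytes(hashlib.sha256(data).digest()[:12], "big")

database = [b"bad-image-1", b"bad-image-2"]

# What ships to devices: only blinded points g^(s*h) mod p. Without s,
# this set cannot be turned back into the plain hash list.
blinded_table = {pow(g, (s * h(x)) % (p - 1), p) for x in database}

def server_side_check(image: bytes) -> bool:
    """Only the holder of the secret s can finish the membership test."""
    return pow(g, (s * h(image)) % (p - 1), p) in blinded_table

assert server_side_check(b"bad-image-1")
assert not server_side_check(b"cat-photo")
```

In this toy version, an attacker who extracts the table from the device still can't tell which NeuralHashes are in it, which would explain why Apple bothers with the blinding step at all.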

Assuming the above is correct, I'm paradoxically feeling a tiny bit better about the system on a technical level (I still think doing anything client side is a very bad precedent), but what a mess they put themselves into.

Had they done this purely server side (and to be frank there's not much difference, the significant part seems to be done server side) this would have been a complete non-event.

[1] : https://daringfireball.net/linked/2021/08/11/panzarino-neuen...

[2] This is my understanding based on the repository and what's written page 6-7 : https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...

[+] FabHK|4 years ago|reply
How can you use it for targeted attacks?

This is what would need to happen:

1. Attacker generates images that collide with known CSAM material in the database (the NeuralHashes of which, unless I'm mistaken, are not available)

2. Attacker sends that to innocent person

3. Innocent person accepts and stores the picture

4. Actually, need to run step 1-3 at least 30 times

5. Innocent person has iCloud syncing enabled

6. Apple's CSAM detection then flags these, and they're manually reviewed

7. Apple reviewer confuses a featureless blob of gray with CSAM material, several times

Note that other cloud providers have been scanning uploaded photos for years. What has changed wrt targeted attacks against innocent people?

[+] fsloth|4 years ago|reply
"How can you use it for targeted attacks?"

Just insert a known CSAM image on target's device. Done.

I presume this could be used against a rival political party to ruin their reputation - insert a bunch of CSAM images on their devices. "Party X is revealed as an abuse ring". This goes oh-so-very-nicely with QAnon conspiracy theories, which don't even require any evidence to propagate widely.

Wait for Apple to find the images. When police investigation is opened, make it very public. Start a social media campaign at the same time.

It's enough to fabricate evidence only for a while - the public perception of the individual or the group will be perpetually altered, even if it later surfaces that the CSAM material was planted by a hostile third party.

You have to think about what nation-state entities that are now clients of Pegasus and the like could do with this, not about how safe each individual component is.

[+] y7|4 years ago|reply
Cross-posting from another thread [1]:

1. Obtain known CSAM that is likely in the database and generate its NeuralHash.

2. Use an image-scaling attack [2] together with adversarial collisions to generate a perturbed image such that its NeuralHash is in the database and its image derivative looks like CSAM.

A difference compared to server-side CSAM detection could be that they verify the entire image, and not just the image derivative, before notifying the authorities.

[1] https://news.ycombinator.com/item?id=28218922

[2] https://bdtechtalks.com/2020/08/03/machine-learning-adversar...
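The scaling-attack half of step 2 is easy to demonstrate against a naive nearest-neighbour downscaler (an assumption for illustration; real image derivatives use better resampling, which makes the attack harder but not impossible): only the sampled pixels influence the small image.

```python
import numpy as np

def downscale_nn(img: np.ndarray, k: int) -> np.ndarray:
    """Nearest-neighbour downscale by integer factor k (samples top-left pixels)."""
    return img[::k, ::k]

rng = np.random.default_rng(1)
benign = rng.integers(0, 256, (64, 64), dtype=np.uint8)   # the 'harmless' image
target = rng.integers(0, 256, (8, 8), dtype=np.uint8)     # what the reviewer sees

# Overwrite only the pixels the downscaler samples: 64 of 4096 (1.6%),
# so the full-size image still looks like the benign one.
attack = benign.copy()
attack[::8, ::8] = target

assert np.array_equal(downscale_nn(attack, 8), target)
```

With a bilinear or area resampler the attacker has to optimize the pixels instead of just planting them, but the principle - full-resolution appearance and derivative appearance can be decoupled - is the same.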

[+] st_goliath|4 years ago|reply
> 7. Apple reviewer....

This part IMO makes Apple itself the most likely "target", but for a different kind of attack.

Just wait until someone who wasn't supposed to, somewhere, somehow gets their hands on some of the actual hashes (IMO bound to happen eventually). Also remember that with Apple, we now have an oracle that can tell us when we've succeeded. And with all the media attention around the issue, this might further incentivize people to try.

From that I can picture a chain of events something like this:

1. Somebody writes a script that generates pre-image collisions like in the post, but for actual hashes Apple uses.

2. The script ends up on the Internet. News reporting picks it up and it spreads around a little. This also means trolls get their hands on it.

3. Tons of colliding images are created by people all over the planet and sent around to even more people. Not for targeted attacks, but simply for the lulz.

4. Newer scripts show up eventually, e.g. for perturbing existing images or similar stunts. More news reporting follows, accelerating the effect and possibly also spreading perturbed images around themselves. Perturbed images (cat pictures, animated gifs, etc...) get uploaded to places like 9gag, reaching large audiences.

5. Repeat steps 1-4 until the Internet and the news grow bored with it.

During that entire process, potentially each of those images that ends up on an iDevice will have to be manually reviewed...

[+] the8472|4 years ago|reply
> 7. Apple reviewer confuses a featureless blob of gray with CSAM material, several times

A better collision won't be a grey blob, it'll take some photoshopped and downscaled picture of a kid and massage the least significant bits until it is a collision.

https://openai.com/blog/adversarial-example-research/

[+] thinkingemote|4 years ago|reply
the issue is step 6 - review and action

Every single tech company is getting rid of manual human review in favour of an AI-based approach. "Human ops", they call it - they don't want their employees doing this harmful work, plus computers are cheaper and better at it.

We hear about failures of inhuman ops all the time on HN: people being banned, falsely accused, cancelled, accounts locked, credit denied. All because decisions that were once made by humans are now made by machines. This will eventually happen here too.

It's the very reason why they have the neuralhash model. To remove the human reviewer.

[+] mannerheim|4 years ago|reply
> 7. Apple reviewer confuses a featureless blob of gray with CSAM material, several times

Just because the PoC used a meaningless blob doesn't mean that collisions have to be those. Plenty of examples of adversarial attacks on image recognition perturb real images to get the network to misidentify them, but to a human eye the image is unchanged.

[+] jsdalton|4 years ago|reply
For #4, I know for a fact that my wife’s WhatsApp automatically stores pictures you send her to her iCloud. So the grey blob would definitely be there unless she actively deleted it.
[+] gambiting|4 years ago|reply
I don't know why you'd even go through this trouble. At least a few years ago, finding actual CP on Tor was trivial; not sure if the situation has changed. If you're going to blackmail someone, just send actual illegal material, not something that might merely trigger detection scanners.

>> What has changed wrt targeted attacks against innocent people?

Anecdote: every single iphone user I know has iCloud sync enabled by default. Every single Android user I know doesn't have google photos sync enabled by default.

[+] Nextgrid|4 years ago|reply
> the NeuralHashes of which, unless I'm mistaken, are not available

Given the scanning is client-side wouldn't the client need a list of those hashes to check against? If so it's just a matter of time before those are extracted and used in these attacks.

[+] cm2187|4 years ago|reply
Don’t imessage and whatsapp automatically store all images received in the iphone’s photo library?
[+] anonymousab|4 years ago|reply
> 7. Apple reviewer confuses a featureless blob of gray with CSAM material, several times

I find it hard to believe that anyone has faith in any purported manual review by a modern tech giant. Assume the worst and you'll still probably not go far enough.

[+] visarga|4 years ago|reply
How can we know that the CSAM database is not already poisoned with adversarial images that actually target other kinds of content for different purposes? It would look like CSAM to the naked eye, and nobody can tell the images have been doctored.

When reports come in, the images wouldn't match, so the attacker would need to intercept them before Apple discards them, perhaps via a mole on the team. But it's still far easier than other ways of getting an iOS platform scanner for arbitrary purposes: just have the doctored images found and added to the database, and recruit one person on the Apple team.

[+] johnla|4 years ago|reply
I don't think this can be used to harm an innocent person. It can raise a red flag, but it would quickly be lowered again, perhaps with an investigation into the source of the fake-out images, because THAT person had to have had the real images in their possession.

If anything, this gives weapons to people against the scanner, as we can now bomb the system with false positives, rendering it impossible to use. I don't know enough about cryptography, but I wonder if there are any ramifications of the hash being broken.

[+] beiller|4 years ago|reply
Maybe they could install malware that uses a technique like steganography to make every photo taken by the device's camera a false-positive match. Or they could share one photo album where all the images are hash collisions.
[+] Johnny555|4 years ago|reply
> Actually, need to run step 1-3 at least 30 times

You can do steps 2-3 all in one step "Hey Bob, here's a zip file of those funny cat pictures I was telling you about. Some of the files got corrupted and are grayed out for some reason".

[+] f3d46600-b66e|4 years ago|reply
What makes CSAM database private?

It's my understanding that many tech companies (Microsoft? Dropbox? Google? Apple? Other?) (and many people in those companies) have access to the CSAM database, which essentially makes it public.

[+] cirrus3|4 years ago|reply
Are you being serious? #7 is literally "Apple reviewer confuses a featureless blob of gray with CSAM material, several times"

30 times.

30 times a human confused a blob with CSAM?

[+] spicybright|4 years ago|reply
If you're in close physical contact with a person (like at a job) you just wait for them to put their phone down while unlocked, and do all this.
[+] Jcowell|4 years ago|reply
One vector you can use to skip step 3 is to send the images on WhatsApp; last I recall, images received via WhatsApp are auto-saved by default.
[+] xucheng|4 years ago|reply
> 4. Actually, need to run step 1-3 at least 30 times

Depending on how the secret sharing is used in Apple PSI, it may be possible that duplicating the same image 30 times would be enough.

[+] eptcyka|4 years ago|reply
I'm sure the reviewers will definitely be able to give each reported image the time and attention it needs, much like the people YouTube employs to review videos discussing and exposing animal abuse, holocaust denial and other controversial topics. </sarcasm>
[+] soziawa|4 years ago|reply
> 6. Apple's CSAM detection then flags these, and they're manually reviewed

Is the process actually documented anywhere? Afaik they are just saying that they are verifying a match. This could of course just be a person looking at the hash itself.

[+] halflings|4 years ago|reply
Apple's scheme includes operators manually verifying a low-res version of each image matching CSAM databases before any intervention. Of course, grey noise will never pass for CSAM and will fail that step.

The fact that you can randomly manipulate random noise until it matches the hash of an arbitrary image is not surprising. The real challenge is generating a real image that could be mistaken for CSAM at low res + is actually benign (or else just send CSAM directly) + matches the hash of real CSAM.

This is why SHAttered [1] was such a big deal, but daily random SHA collisions aren't.

[1] https://shattered.io/

[+] lifthrasiir|4 years ago|reply
But you can essentially perform a DoS attack on the human checkers, effectively grinding the entire system to a halt. The entire system is too reliant on the performance of NeuralHash, which can be defeated in many ways. [1]

(Added later:) I should note that the DoS attack is only possible with the preimage attack and not the second preimage attack as the issue seemingly suggests, because you need the original CSAM to perform the second preimage attack. But given the second preimage attack is this easy, I don't have any hope for the preimage resistance anyway.

(Added much later:) And I realized that Apple did think of this possibility and only stores blinded hashes in the device, so the preimage attack doesn't really work as is. But it seems that the hash output is only 96 bits long according to the repository, so this attack might still be possible albeit with much higher computational cost.

[1] To be fair, I don't think that Apple's claim of a 1/1,000,000,000,000 false positive rate refers to the algorithm itself. Apple probably tweaked the threshold for manual checking to match that target rate, knowing NeuralHash's false positive rate under normal circumstances. Of course, we know that there is no such thing as normal circumstances.

[+] Majromax|4 years ago|reply
> The fact that you can randomly manipulate random noise until it matches the hash of an arbitrary image is not surprising.

It is, actually. Remember that hashes are supposed to be many-bit digests of the original; it should take O(2^256) work to find a message with a chosen 256-bit hash and O(2^128) work to find a "birthday attack" collision. Finding any collision at all with NeuralHash so soon after its release is very surprising, suggesting the algorithm is not very strong.

SHAttered is a big deal because it is a fully working attack model, but the writing was on the wall for SHA-1 after the collisions were found in reduced-round variations of the hash. Attacks against an algorithm only get better with time, never worse.

Moreover, the break of NeuralHash may be even stronger than the SHAttered attack. The latter modifies two documents to produce a collision, but the NeuralHash collision here may be a preimage attack. It's not clear if the attacker crafted both images to produce the collision or just the second one.
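The generic work factors behind those statements, as a quick back-of-envelope (ideal-hash bounds, not a claim about NeuralHash specifically):

```python
from math import log2, sqrt

def expected_work(n_bits: int) -> tuple[float, float]:
    """(preimage, birthday-collision) hash evaluations for an ideal n-bit hash."""
    preimage = 2.0 ** n_bits
    return preimage, sqrt(preimage)

for n in (256, 96):
    pre, bday = expected_work(n)
    print(f"{n}-bit: preimage ~2^{log2(pre):.0f}, birthday collision ~2^{log2(bday):.0f}")
```

At the 96-bit output reported elsewhere in this thread, an ideal hash would still cost ~2^48 evaluations for a birthday collision - expensive but feasible - and the collision found here was far cheaper than that, which is what makes it alarming.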

[+] Edd314159|4 years ago|reply
I think I am in dire need of some education here and so I have questions:

* Is this a problem with Apple's CSAM discriminator engine or with the fact that it's happening on-device?

* Would this attack not be possible if scanning was instead happening in the cloud, using the same model?

* Are other services (Google Photos, Facebook, etc.) that store photos in the cloud not doing something similar to uploaded photos, with models that may be similarly vulnerable to this attack?

I know that an argument against on-device scanning is that people don't like to feel like the device that they own is acting against them - like it's snitching on them. I can understand and actually sympathise with that argument, it feels wrong.

But we have known for a long time that computer vision can be fooled with adversarial images. What is special about this particular example? Is it only because it's specifically tricking the Apple CSAM system, which is currently a hotly-debated topic, or is there something particularly bad here, something that is not true with other CSAM "detectors"?

I genuinely don't know enough about this subject to comment with anything other than questions.

[+] vesinisa|4 years ago|reply
Now this offers Apple a very delicate opportunity to back out of the whole scanning controversy due to technological vulnerabilities.
[+] SXX|4 years ago|reply
Some people here in the comments believe that whoever checks reported material on Apple's side will never, ever flag a false positive. We already know that the NCMEC database doesn't exclusively contain child porn, but also other photos closely related to CSAM, even if those photos don't contain actual CSAM. But let's ignore that fact.

Do the people who believe in a benevolent Apple understand that CSAM doesn't come with a big red "CHILD PORN" sign on it? No demonic feel included. Like any porn, many of these images might not even have any faces, or literally anything that distinguishes them from images of an 18-year-old.

When you think it through, brute-forcing actual porn or some jailbait images doesn't sound that impossible. All you need is a lot of totally legal porn and some compute power. Both we have in abundance.

[+] UncleMeat|4 years ago|reply
Why is this meaningfully different than, say, what Google Photos has been doing for years?

If you can get rooting malware on the target device then you could

1. Produce actual CSAM rather than a hash collision

2. Produce lots of it

3. Sync it with Google Photos

This attack has been available for many years and does not need convoluted steps like hash collisions if you have the means to control somebody's phone with a RAT.

[+] reacharavindh|4 years ago|reply
I admit that I have not done enough research to have a strong opinion on this, but why is Apple taking this on themselves? As far as I can see, this outrage is because of "scanning on iPhone" that is wildly out of the user's control. Why can't Apple be like others and say: we scan the shit out of what you upload to iCloud (and it's in our Terms and Conditions for using iCloud)?

Almost all tech people know that iCloud (or its backups) are not encrypted, so Apple can decrypt them as they wish.. and those among us who are privacy conscious can comfortably turn iCloud OFF (or not sign up for one) and use the iPhone as a simple private device?

[+] HumblyTossed|4 years ago|reply
This can also be used to make Apple's system useless, no? If enough (millions?) of people were to, say, go to a web site and save generated gray-blobs to their phones, it would create enough false-positives to kill this system, right? Maybe game it and have everyone convert their various profile pics to these images.
[+] nojito|4 years ago|reply
Any matches are checked again server-side to thwart this type of attack.

>Once Apple's iCloud Photos servers decrypt a set of positive match vouchers for an account that exceeded the match threshold, the visual derivatives of the positively matching images are referred for review by Apple. First, as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.

They also fuzz this process by sending false positives I think?
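The defence quoted above is easy to picture with two toy perceptual hashes standing in for Apple's real ones (both functions below are my own stand-ins, not Apple's): an image tuned to collide under the first hash still has to survive an independent second one.

```python
import numpy as np

def ahash(img: np.ndarray) -> int:
    """Average hash: one bit per pixel, set if the pixel is above the mean."""
    bits = (img > img.mean()).astype(np.uint8).ravel()
    return int("".join(map(str, bits)), 2)

def dhash(img: np.ndarray) -> int:
    """Difference hash: one bit per pixel pair, set if brighter than its right neighbour."""
    bits = (img[:, 1:] > img[:, :-1]).astype(np.uint8).ravel()
    return int("".join(map(str, bits)), 2)

def server_confirms(candidate: np.ndarray, known: np.ndarray) -> bool:
    # Device-side match on hash A, then the independent server-side check on hash B.
    return ahash(candidate) == ahash(known) and dhash(candidate) == dhash(known)

rng = np.random.default_rng(2)
known = rng.integers(0, 256, (8, 8))
m = known.mean()

# Forge an aHash collision: swap two adjacent above-mean pixels with
# different values. The mean and both bits are unchanged (aHash preserved),
# but the left-right comparison between them flips (dHash broken).
r, c = next((r, c) for r in range(8) for c in range(7)
            if known[r, c] > m and known[r, c + 1] > m
            and known[r, c] != known[r, c + 1])
forged = known.copy()
forged[r, c], forged[r, c + 1] = known[r, c + 1], known[r, c]

assert ahash(forged) == ahash(known)        # collides on the first hash
assert not server_confirms(forged, known)   # rejected by the independent one
```

An attacker would have to solve both (unknown) hashes simultaneously, plus the human review, which is why a single NeuralHash collision is not by itself a working frame-up.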

[+] magpi3|4 years ago|reply
Can someone ELI5? I understand that a person can now generate an image with the same hash as an illegal image (such as child porn), but I don't understand how they can get it onto someone's phone, or why someone would get in trouble for an image that, when finally examined, is clearly not child pornography.
[+] sierpinsky|4 years ago|reply
"According to media reports, the cloud computing industry does not take full advantage of the existing CSAM screening tools to detect images or videos in cloud computing storage. For instance, big industry players, such as Apple, do not scan their cloud storage. In 2019, Amazon provided only eight reports to the NCMEC, despite handling cloud storage services with millions of uploads and downloads every second. Others, such as Dropbox, Google and Microsoft perform scans for illegal images, but 'only when someone shares them, not when they are uploaded'." [1]

So I guess the question is what exactly "others" are doing, 'only when someone shares them, not when they are uploaded'. The whole discussion seems to center around what Apple intends to do on-device, ignoring what others are already doing in the cloud. Isn't this strange?

[1] https://www.europarl.europa.eu/RegData/etudes/BRIE/2020/6593...

[+] neximo64|4 years ago|reply
Maybe the best idea is for a sufficient number of people to replicate collisions with hashes from the CSAM database, make copies of photos with nothing in them (like this one), and just let Apple deal with it. Maybe they can have text too.
[+] jakear|4 years ago|reply
Baseless speculation: this is less about ongoing protection against keeping CSAM on their servers and more about a one-time sting operation requested by some sort of acronym’d official. The basic idea being: have a bunch of catfish agents send out these known CSAM images to folks through anonymous channels (sourcing the targets would likely be unfortunately trivial), expect that some of them will save the images and be synced to iCloud, coerce Apple into letting them see who has the material, sting them.

To this end, I would not be at all surprised to see that in some not-too-distant future Apple issues a big ol’ public apology and removes this feature. The operation will of course be long complete by then.

Just writing this out here so I have a “I told you so” link for if/when that time comes :)

[+] avsteele|4 years ago|reply
Apple claimed 'one in a trillion' chance of a collision. This is a great example of why you should not trust such an assessment.
[+] deadalus|4 years ago|reply
Expectation : Political rivals and enemies of powerful people will be taken out because c-ild pornography will be found in their phone. Pegasus can already monitor and exfiltrate every ounce of data right now, it won't be that hard to insert compromising images on the infected device.

Any news about "c-ild porn" being found on someone's phone is suspect now. This has been done before:

1) https://www.deccanchronicle.com/technology/in-other-news/120...

2) https://www.independent.co.uk/news/uk/crime/handyman-planted...

3) https://www.theatlantic.com/notes/2015/09/how-easily-can-hac...

4) https://www.nytimes.com/2016/12/09/world/europe/vladimir-put...

5) https://www.cnet.com/tech/services-and-software/a-child-porn...

[+] azinman2|4 years ago|reply
It’s not enough to have collisions — you’ll have to have collisions against ANOTHER hidden model for the same photos, and then have it pass a human looking at it. People are getting unnecessarily angry over this.
[+] newArray|4 years ago|reply
Hash collisions happen by design in a perceptual hash; it's supposed to give equal hashes for small changes, after all.

Something I find interesting is the necessary consequences of the property of small edits resulting in the same hash. We can show that this is impossible to absolutely achieve, or in other words there must exist an image such that changing a single pixel will change the hash.

Proof: Start with 2 images, A and B, of equal dimension, and with different perceptual hashes h(A) and h(B). Transform one pixel of A into the corresponding pixel of B and recompute h(A). At some point, after a single pixel change, h(A) = h(B); this is guaranteed to happen at or before A = B. Now A and the previous version of A are 1 pixel apart, but have different hashes. QED

We can also ATTEMPT to create an image A with a specified hash matching h(A_initial) but which is visually similar to a target image B. Again start with A and B, different images with same dimensions. Transform a random pixel of A towards a pixel of B, but discard the change if h(A) changes from h(A_initial). Since we have so many degrees of freedom for our edit at any point (each channel of each pixel) and the perceptual hash invariant is in our favor, it may be possible to maneuver A close enough to B to fool a person, and keep h(A) = h(A_initial).

If this is possible one could transform a given CSAM image into a harmless meme while not changing the hash, spread the crafted image, and get tons of iCloud accounts flagged.
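The greedy walk described above is a few lines with a toy average hash standing in for NeuralHash (an assumption; whether the real model leaves as much slack is exactly what's in question):

```python
import numpy as np

def phash(img: np.ndarray) -> bytes:
    """Toy perceptual hash: 8x8 grid of block means, thresholded at the global mean."""
    blocks = img.reshape(8, img.shape[0] // 8, 8, img.shape[1] // 8).mean(axis=(1, 3))
    return np.packbits(blocks > blocks.mean()).tobytes()

rng = np.random.default_rng(3)
A = rng.integers(0, 256, (32, 32)).astype(np.int16)   # image whose hash we keep
B = rng.integers(0, 256, (32, 32)).astype(np.int16)   # visual target
target_hash = phash(A)

for _ in range(20000):
    r, c = rng.integers(0, 32, size=2)
    old = A[r, c]
    A[r, c] += np.sign(B[r, c] - old)     # move one pixel one step towards B...
    if phash(A) != target_hash:           # ...and roll back if the hash moved
        A[r, c] = old

assert phash(A) == target_hash            # the hash never changed
print("mean |A - B| after walk:", np.abs(A - B).mean())
```

By construction the hash is invariant throughout, so the only question is how visually close A can get to B before the rejected moves dominate - which depends entirely on how much slack the hash's invariance regions leave.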