bitwise-evan | 4 years ago
This is a very real, possible attack. Apple ships its CSAM model on device, so any attacker can have a copy of the model. The attacker then crafts an image that matches a known CSAM hash but looks like a panda [1], and sends tons of these triggering photos to an unsuspecting victim, who then gets questioned by the FBI.
1: https://medium.com/@ml.at.berkeley/tricking-neural-networks-...
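The collision attack described above can be sketched against a toy differentiable hash. This is a hypothetical stand-in (a random projection with sign thresholding), not Apple's actual NeuralHash, which is a CNN; but the optimization idea, gradient descent on the pre-threshold activations until the bits match a target, carries over:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a perceptual hash: a fixed random projection followed
# by sign thresholding, producing a 16-bit hash. (Hypothetical model --
# the real NeuralHash is a neural network, but the attack works the same way.)
W = rng.normal(size=(16, 64))

def phash(x):
    """16-bit binary hash of a flattened 64-'pixel' image."""
    return (W @ x > 0).astype(int)

def collide(x_start, target_hash, steps=2000, lr=0.02):
    """Nudge x_start until its hash equals target_hash, via gradient
    descent on a hinge loss over the pre-threshold projections."""
    x = x_start.copy()
    t = 2.0 * target_hash - 1.0          # map bits {0,1} -> signs {-1,+1}
    for _ in range(steps):
        if np.array_equal(phash(x), target_hash):
            break                        # collision found
        z = W @ x
        active = (1.0 - t * z) > 0       # constraints not yet satisfied
        # subgradient of sum(max(0, 1 - t_i * z_i)) with respect to x
        grad = -(W[active] * t[active][:, None]).sum(axis=0)
        x -= lr * grad
    return x

x0 = rng.normal(size=64)                 # the innocuous "panda" image
target = phash(rng.normal(size=64))      # hash of some other, flagged image
x_adv = collide(x0, target)

print(np.array_equal(phash(x_adv), target))         # whether the hashes collide
print(round(float(np.linalg.norm(x_adv - x0)), 2))  # size of the perturbation
```

Because the model is on device, an attacker can run exactly this kind of optimization loop offline, which is what makes the scenario plausible.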
shadowfacts | 4 years ago
That's glossing over the middle part, where a human reviewer at Apple (before anything even gets to law enforcement) actually looks at the images, goes "oh, these are actually pandas," and realizes they were erroneously flagged.
simondotau | 4 years ago
And if this trick ever works, it could only be done once before Apple has the opportunity to plug holes in their NeuralHash algorithm and fix any deficiencies in the manual review process.