curiousllama | 1 year ago
If the false positive rate (FPR) is comparable to that of asking a human "are these the same image?", then a hash match would seem to be equivalent to a visual search. I wonder if (or why) human verification is actually necessary here.
EVa5I7bHFq9mnYK|1 year ago
bluGill|1 year ago
cool_dude85|1 year ago
ARandumGuy|1 year ago
Of course, that's assuming a perfect, evenly distributed hash algorithm. And that's just the odds that any given pair of images has the same hash, not the odds that a hash conflict exists somewhere on the internet.
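To put rough numbers on the difference between "this pair collides" and "some pair collides", here is a minimal sketch using the standard birthday-bound approximation. It assumes an ideal, uniformly distributed hash, which no real system is guaranteed to be; the function names and the item counts are made up for illustration.

```python
import math


def pair_collision_probability(hash_bits: int) -> float:
    """Chance that two specific, distinct inputs share a hash (ideal uniform hash)."""
    return 2.0 ** -hash_bits


def any_collision_probability(n_items: int, hash_bits: int) -> float:
    """Birthday-bound approximation: chance that at least one pair among n_items collides."""
    pairs = n_items * (n_items - 1) // 2
    # 1 - exp(-pairs / 2^bits); expm1 keeps precision when the result is tiny.
    return -math.expm1(-pairs / 2 ** hash_bits)


if __name__ == "__main__":
    # Hypothetical collection sizes, chosen only to show the scale of the effect.
    for bits, n in [(32, 77_163), (64, 10 ** 9), (256, 10 ** 12)]:
        print(bits, n, pair_collision_probability(bits), any_collision_probability(n, bits))
```

With a 32-bit hash, about 77,000 items are already enough for roughly even odds that some pair collides, even though any one pair collides with probability 2^-32.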
henryfjordan|1 year ago
If you have a 32-bit hash but your inputs are only 16 bits, a well-designed hash never has to collide (and you'll be wasting a ton of space on your hashes!).
Image files can get into the megabytes, though, so collisions must exist for any fixed-size hash, and unless the output hash is large, the chance of actually hitting one is probably not all that low.
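The 16-bit-input case can be checked exhaustively. The sketch below uses CRC-32 from Python's `zlib` as a stand-in 32-bit hash (my choice for illustration, not anything the systems under discussion are known to use), then truncates it to 8 bits to show the opposite situation, where there are more inputs than possible outputs.

```python
import zlib


def crc32_of(value: int, width_bytes: int) -> int:
    """CRC-32 of an integer encoded as a fixed-width big-endian byte string."""
    return zlib.crc32(value.to_bytes(width_bytes, "big"))


all_16bit_inputs = range(2 ** 16)

# 65,536 inputs into a 32-bit output space: every input can get its own hash.
full_hashes = {crc32_of(v, 2) for v in all_16bit_inputs}

# The same inputs squeezed into 8 bits: at most 256 distinct outputs,
# so collisions are unavoidable (pigeonhole principle).
truncated_hashes = {crc32_of(v, 2) & 0xFF for v in all_16bit_inputs}

if __name__ == "__main__":
    print(len(full_hashes), "distinct 32-bit hashes for", len(all_16bit_inputs), "inputs")
    print(len(truncated_hashes), "distinct 8-bit hashes for", len(all_16bit_inputs), "inputs")
```

CRC-32 happens to be collision-free on inputs this short; an arbitrary 32-bit hash need not be, so the "no collisions" half is a property of the function chosen, not of the sizes alone.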
gorjusborg|1 year ago
No way to know without knowledge of the 'proprietary hashing technology'. Theoretically though, a hash can have infinitely many inputs that produce the same output.
Mismatching hash values from the same hashing algorithm can prove mismatching inputs, but matching hash values don't ensure matching inputs.
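A small sketch of that asymmetry, assuming a deliberately weak hash so a collision is easy to find: SHA-256 truncated to 16 bits. `tiny_hash`, `find_collision`, and `same_content` are hypothetical helper names, not part of any real matching system.

```python
import hashlib


def tiny_hash(data: bytes) -> bytes:
    """SHA-256 truncated to 16 bits, weak on purpose so collisions are cheap to find."""
    return hashlib.sha256(data).digest()[:2]


def find_collision() -> tuple[bytes, bytes]:
    """Return two different inputs with the same tiny_hash."""
    seen: dict[bytes, bytes] = {}
    # 2^16 + 1 distinct inputs into 2^16 possible outputs must collide.
    for i in range(2 ** 16 + 1):
        data = str(i).encode()
        digest = tiny_hash(data)
        if digest in seen:
            return seen[digest], data
        seen[digest] = data
    raise AssertionError("unreachable: pigeonhole guarantees a collision")


def same_content(a: bytes, b: bytes) -> bool:
    """Differing hashes prove a mismatch; equal hashes still need a real comparison."""
    if tiny_hash(a) != tiny_hash(b):
        return False
    return a == b


if __name__ == "__main__":
    a, b = find_collision()
    print(a, b, tiny_hash(a) == tiny_hash(b), same_content(a, b))
```

The final byte-for-byte comparison in `same_content` plays the role of the verification step: the hash match only narrows things down.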
> I wonder if (or why) human verification is actually necessary here
It's not about frequency, it's about criticality of getting it right. If you are going to make a negatively life-altering report on someone, you'd better make sure the accusation is legitimate.
cool_dude85|1 year ago
Most anyone would agree that the hash match should probably form probable cause for a warrant, allowing a judge to sign off on the police searching (i.e., viewing) the image. So, if it's a collision, the cops get a warrant and open up your Linux ISO or cat meme, and it's all good. Probably the ideal case is that they get a warrant to search the specific image, and are only able to obtain a warrant to search your home and effects, etc. if the image does appear to be CSAM.
At issue here is the fact that no such warrant was obtained.
nokcha|1 year ago
See also:
https://en.wikipedia.org/wiki/Collision_resistance
https://en.wikipedia.org/wiki/Preimage_attack
int_19h|1 year ago
Instead, systems like these use perceptual hashing, in which similar inputs produce similar hashes, so that one can test for likeness. Those have much higher collision rates, and are also much easier to deliberately generate collisions for.
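For a feel of how a perceptual hash behaves, here is a minimal sketch of "average hash", one of the simplest published schemes. It is not the proprietary algorithm under discussion, and the 8x8 grayscale grids below are synthetic stand-ins for real, downscaled images.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the image's mean, else 0."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a small distance means 'perceptually similar'."""
    return bin(a ^ b).count("1")


# Synthetic 8x8 grayscale gradient (values 0..126).
image = [[16 * x + 2 * y for x in range(8)] for y in range(8)]
# Same picture, uniformly brighter: different bytes, same visual structure.
brighter = [[p + 10 for p in row] for row in image]
# Photographic negative: a structurally different picture.
inverted = [[255 - p for p in row] for row in image]

if __name__ == "__main__":
    base = average_hash(image)
    print("brighter:", hamming_distance(base, average_hash(brighter)))
    print("inverted:", hamming_distance(base, average_hash(inverted)))
```

The brightened copy differs in every pixel yet hashes identically, which is the intended behavior and also exactly why such hashes collide more often, and can be steered into colliding, compared with cryptographic ones.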