When I was in the advertising business, one of the core products I created was a brand-safety product: basically, preventing advertisers from advertising on dodgy sites.
I messed around with algorithms that detected nudity (because big-brand advertisers don't want their ads showing up on porn sites). One of the more interesting and simple-to-use ones was actually a simple averaging of the images across multiple samples. That one was easy to implement and gave relatively good results.
In the end, though, I didn't use it, because text clustering algorithms worked better at classifying content.
"The training set for the skin filter consisted of 1,182,608 manually labeled skin pixels and 10,471,553 manually labeled non-skin pixels while the testing set consisted of 2,303,824 manually labeled skin pixels and 24,285,952 manually labeled non-skin pixels."
This could only be a very rough first pass at detection. Someone in a bathing suit can be showing a lot of skin without being fully nude.
And social context plays a large role. For instance, distinguishing between a fat male's nipples and a small-chested female's nipples would be impossible without analyzing a lot more than skin color.
Seems like a not-very-scalable approach to the problem. I would think that if you wanted to capture all nudity (including monochromatic or illustrated), you would instead come at the problem from the angle of titillation. You could even round up images that aren't necessarily human-based (fruit arranged provocatively, for instance).
How do these nudity detection APIs work? Is there crowdsourcing going on under the hood? Or are they using some clustering algorithm to detect a range of skin colors, so that if, say, 90% of the image matches, it's nude?
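For what it's worth, the "range of skin color, if 90% then nude" idea in the question can be sketched in a few lines. This is only an illustration of that guess, not how any particular API works. The per-pixel test is a commonly cited RGB skin heuristic, the function names are made up for this example, and the 0.9 threshold is just the figure from the question.

```python
def is_skin_rgb(r, g, b):
    """Rough RGB skin heuristic (an assumption, not any API's actual rule)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_ratio(pixels):
    """Fraction of (r, g, b) pixels that pass the skin heuristic."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_skin_rgb(*p) for p in pixels) / len(pixels)


def looks_nude(pixels, threshold=0.9):
    """Flag an image when the skin fraction reaches the threshold."""
    return skin_ratio(pixels) >= threshold


# Nine skin-toned pixels and one pure blue pixel.
sample = [(220, 170, 140)] * 9 + [(0, 0, 255)]
print(skin_ratio(sample), looks_nude(sample))
```

A rule like this would flag a close-up of a face just as readily as anything else, which is the kind of false positive other comments here point at.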
Protip: If you film pornography in black and white, there is no such thing as "skin color."
If you remap the palette for a desaturated image, so that everyone's skin is green or magenta, are there any fewer penises penetrating vaginas in the image? If it's a horse and a fully clothed person's mouth, where is the algorithm for that?
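The black-and-white point is easy to check numerically. Below is a minimal sketch using only Python's standard `colorsys` module. The sample skin tone, the luma weights used for desaturation, and the channel swap standing in for a "remapped palette" are all assumptions picked for illustration.

```python
import colorsys


def desaturate(pixel):
    """Convert an (r, g, b) pixel to gray using common luma weights."""
    r, g, b = pixel
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y)


def hue_and_saturation(pixel):
    """Return (hue in degrees, saturation in 0..1) for an (r, g, b) pixel."""
    r, g, b = (c / 255 for c in pixel)
    h, s, _ = colorsys.rgb_to_hsv(r, g, b)
    return h * 360, s


skin_tone = (220, 170, 140)                                 # hypothetical sample pixel
gray = desaturate(skin_tone)
remapped = (skin_tone[1], skin_tone[0], skin_tone[2])       # swap red and green

for name, px in [("original", skin_tone), ("gray", gray), ("remapped", remapped)]:
    hue, sat = hue_and_saturation(px)
    print(f"{name:9s} {px} hue={hue:.1f} sat={sat:.2f}")
```

The gray pixel has zero saturation and the remapped one lands in the green range of hues, so a detector keyed on skin hue sees nothing in either case even though the image content is unchanged.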
[+] [-] chewxy|12 years ago|reply
[+] [-] nathancahill|12 years ago|reply
That's a lot of pixels to manually label.
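To put the quoted counts in perspective, here is a quick bit of arithmetic on them. The numbers come straight from the quoted description of the skin filter's training and testing sets; the variable and function names are made up for this example.

```python
# Pixel counts as quoted from the skin-filter description.
train_skin, train_non_skin = 1_182_608, 10_471_553
test_skin, test_non_skin = 2_303_824, 24_285_952


def skin_fraction(skin, non_skin):
    """Share of labeled pixels that are skin."""
    return skin / (skin + non_skin)


train_total = train_skin + train_non_skin
test_total = test_skin + test_non_skin

print(f"training pixels: {train_total:,} ({skin_fraction(train_skin, train_non_skin):.1%} skin)")
print(f"testing pixels:  {test_total:,} ({skin_fraction(test_skin, test_non_skin):.1%} skin)")
print(f"labeled overall: {train_total + test_total:,}")
```

That is tens of millions of labeled pixels, with skin making up only around a tenth of each set. Presumably the labeling was done by marking regions rather than clicking individual pixels, though the quote doesn't say.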
[+] [-] fryguy|12 years ago|reply
[+] [-] tachyonbeam|12 years ago|reply
[+] [-] adam-f|12 years ago|reply
http://i.imgur.com/sb6Iw.jpg
[+] [-] dclowd9901|12 years ago|reply
[+] [-] benched|12 years ago|reply
[+] [-] ivansavz|12 years ago|reply
[+] [-] saganus|12 years ago|reply
[+] [-] ganduG|12 years ago|reply
[+] [-] jplur|12 years ago|reply
[+] [-] plg|12 years ago|reply
1. attempt to post the image to Google+
2. if the post is there, there is no nudity;
[+] [-] gillier|12 years ago|reply
[deleted]
[+] [-] nader|12 years ago|reply
[+] [-] notastartup|12 years ago|reply
[+] [-] cmelbye|12 years ago|reply
[+] [-] saurik|12 years ago|reply
[+] [-] organelle|12 years ago|reply
Think about it.