If it doesn't understand meaning and context, it can't work.
It will over-report on video of baby's first bath, beach volleyball, and high school wrestling; it will under-report on many of the kinkier or more subtle forms of porn.
Humans get this stuff wrong all the time simply by having different cultural contexts; there's no way the 2022 state of the art is up to the challenge.
Fully agree - this seems like a project rather limited in scope. And that's OK; the authors even acknowledge it.
It's not an absolute judge of SFW-ness, nor does it claim to be. I can see it as a tool to skim a large set of videos for further human review.
Besides, beach volleyball very much is NSFW depending on the situation.
I want to believe it's a matter of adding human labor to the pool of knowledge in the AI engine(s): "this is, this is not" - now go do the learning thing.
But when it comes to online video checking, it's another tool in the arsenal; it's the first line of defense. First check: does the signature match a previously known and confirmed video marked as porn or otherwise unacceptable? Second check: does the fuzzier AI thing consider it porn with high probability? Third check: things the AI marks as 'not sure' or incorrectly marks as 'porn', along with things humans flag up manually, get checked and judged by humans.
And of course it'll be flexible, because the definition of porn - or what is 'not acceptable' - will vary by country and culture, and these companies want to be active everywhere.
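Sketched in Python, that three-stage pipeline might look like the following. Everything here is illustrative - the hash set, thresholds, and function names are made up, and a real system would use perceptual hashing (so near-duplicates still match) rather than an exact SHA-256 lookup:

```python
import hashlib

# Illustrative stand-in for a database of previously confirmed videos.
# This entry is the SHA-256 of the bytes b"test".
KNOWN_BAD_HASHES = {
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"
}

def signature_check(video_bytes: bytes) -> bool:
    """First check: exact match against known, confirmed videos."""
    return hashlib.sha256(video_bytes).hexdigest() in KNOWN_BAD_HASHES

def moderate(video_bytes: bytes, model_score: float) -> str:
    """Route a video through the three checks described above."""
    if signature_check(video_bytes):          # 1. signature match
        return "blocked"
    if model_score >= 0.95:                   # 2. fuzzy AI, high probability
        return "blocked"
    if model_score >= 0.5:                    # 3. model unsure -> human review
        return "human_review"
    return "allowed"
```

Videos landing in `human_review` (plus anything users flag manually) would then feed back into the known-signature database, which is what makes the first, cheapest check grow more useful over time.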
except Europe, unless they are given tax breaks and data, amirite
Why does everything have to be perfect? Humans get this wrong all the time as you say, and we still do it - right?
Seems like we already acknowledge good-enough practices and are okay with them. Let's not demand perfection, in the same way we shouldn't assume perfection of ~~AI~~ ML.
It could potentially revolutionize the opposite application, those sites where someone manually finds and catalogs all the time points at which movies contain nudity so people can see their favorite celebs in the buff. False positives are probably worth the automation there.
yeah but this is true for profanity filters as well, and yet useful profanity filters still exist. Is it 100% accurate? No. But it's probably still useful.
- Babies (https://github.com/wingman-jr-addon/wingman_jr/issues/22)
- Beach volleyball (but this definitely has SFW and NSFW variants, based on a somewhat subjective line)
- Athletes in general. The model particularly thought some American football players were NSFW for a long time.
- Swimming
- Yoga - again, mostly SFW with some NSFW here, but it still struggles
- Wrestling was a tough one for sure
- Pokemon
While indeed tough, I've seen definite progress. So it's not just a matter of tech, but also of considering the human element - the state of the art may not be up to the challenge of perfection, but it is definitely up to a point of true utility for some use cases. I'm happy about that.
As a note, it uses an EfficientNet Lite L0 backbone - I'm a bit limited in what type of scanning I can perform in a sufficiently speedy manner.
I also agree on the context for sure - one reason I haven't tried switching to an object detection method (and that I don't rely heavily on truly random crops) is that the focus of the image is highly important for the NSFW-ness in some cases. True, two images may contain the same content ... but one is far worse than the other. The nature of CNNs still has some of this location-invariance baked in, but I don't want to exacerbate it.
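One simple way to lean into that focus sensitivity - purely a sketch of the idea, not wingman-jr's actual code - is to blend a center-crop score with the full-frame score, on the assumption that the subject the camera frames tends to sit near the center:

```python
def center_weighted_score(image, score_fn, center_weight=0.7):
    """Blend a center-crop score with the full-frame score.

    `image` is a 2D grid of pixels and `score_fn` is any classifier
    stub returning a score in [0, 1]; both are illustrative stand-ins.
    """
    h, w = len(image), len(image[0])
    # Take the middle half of the frame in both dimensions.
    crop = [row[w // 4: 3 * w // 4] for row in image[h // 4: 3 * h // 4]]
    return center_weight * score_fn(crop) + (1 - center_weight) * score_fn(image)
```

The weighting is a knob: at `center_weight=1.0` this degenerates into hard center-cropping, which would throw away edge content entirely.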
One challenge I think the OP may run into here that may also not be immediately obvious is that accuracy on image stills does not translate that well to video. I have basic video support in my addon, and while I knew there would be some differences, I was surprised at how many discrepancies there really are. As two examples:
- Images in video are often blurrier. In true still images, there is a somewhat higher prior involved with amateur NSFW content and blurriness. This can be a source of false positives.
- The opposite of the note above about focus. Taking stills from moving images yields many transitory frames that seem inappropriate on their own, because it looks as if the camera is focusing on something when in reality it is just panning - obvious to a human, less so to a model trained on stills.
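A cheap mitigation for those transitory pan frames - again a sketch of the general idea, not the addon's actual logic - is to require a sustained run of high per-frame scores before flagging a video, so a single spiking frame is ignored:

```python
def flag_video(frame_scores, threshold=0.8, min_run=5):
    """Flag only when `min_run` consecutive frame scores reach `threshold`.

    A lone panning frame that momentarily scores high is ignored;
    genuinely inappropriate footage tends to score high for a stretch.
    """
    run = 0
    for score in frame_scores:
        run = run + 1 if score >= threshold else 0
        if run >= min_run:
            return True
    return False
```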
At any rate, given how well your list of edge cases coincided with failures I've grappled with, I'd be interested to see how well you think my addon stacks up for still images when set to stay in "normal" mode. I'd love to hear any feedback you have via GitHub so I can make it better.
If this was for a product, the first thing I would have done was drop half the tools in the stack for their inefficiency! For example, the GUI shell is using Node Webkit (which I prefer to Electron, but it's essentially the same thing). That in itself is quite bad, but it's not the worst approach to building a desktop app, as Microsoft and Slack have already proven.
But the .exe's you mentioned are also quite large.
The code is very transparent though.
I doubt you'd have been the only one to note how clunky this all is!
Interesting subject and lots of praise to be had if the model can be made accurate, but seems it's not really there yet. I wish you luck in getting there!
Here's a blog post where I explain my motivation for creating the program - https://raskie.com/post/practical-ai-autodetecting-nsfw. There are some hyperparameters to tune, but you're right, it's extremely flaky. Part of that might be down to my work (I personally wanted the app to be prudish enough to flag the Baywatch intro), but the off-the-shelf model I used for this does produce some strange results. I kind of wanted to use this app as a conversation piece to discuss AI with people who aren't entirely technical, or with technical people who wonder what AI can and can't do.
I would suppose that depends on the workplace and the area (and time) where you live. It is definitely "somewhat" sexualised content in my opinion (a close-up of a woman's body stripping down), even though there is no nipple to see. But sure, lots of normal advertising sells with sex. So is the red line when a nipple is shown? That would classify biological and medical content as NSFW.
My point is, human morals are very different. Humans fight about what content is NSFW. A computer model can never be "accurate" in that sense.
There's no such thing as a perfect model. The acknowledgement is welcome.
Per the Baywatch example - there's clearly a grey area here, and in my opinion the model is working as intended. False positives for NSFW are better than false negatives.
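Preferring false positives over false negatives boils down to where you put the decision threshold: lowering it raises recall (fewer missed NSFW items) at the cost of precision. A toy illustration with made-up scores and labels:

```python
def precision_recall(scores, labels, threshold):
    """Compute precision/recall when flagging items with score >= threshold.

    `labels` uses 1 for truly-NSFW items. Purely illustrative code.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall
```

On a made-up batch, dropping the threshold catches every NSFW item while flagging a few extra safe ones - exactly the trade being argued for here.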
Clearly incorrect by any reasonable definition unless you work at a swimwear company or a few bits of the media.
I mean, you or I might think Alexandra Paul in a one-piece is an image approaching the beauty of classical sculpture, but quite a lot of classical sculpture is NSFW in most of the world, and the brow is a bit lower where Baywatch is concerned.
I would consider the Baywatch intro NSFW. Perhaps you don't consider it so because it's Baywatch, but if it was a random video of closeups of people in bathing costumes I bet you would. One thing I find really interesting with AI/Machine Learning is how it can cause you to re-evaluate your own biases. I say the AI is right here and you are wrong.
I wonder if it would be possible not only to single out scenes in a movie, but also to use IMDb data to help detect which actor/actress is in said scenes, for categorization. Could be an extremely useful tool.
That's an interesting idea. There are off-the-shelf tools that can recognise famous people already (Amazon Rekognition, for example). The problem, in connection with this hack, is not wanting to call a third-party web service.
Wouldn't it be easier to use facial recognition for that? In fact I recall years ago Google Play Movies already did that, where you could pause the movie and it would tell you the name of the faces on screen at that moment.
Or do you mean if you're looking for NSFW scenes of a particular actor?
Shameless plug: at PixLab, we offer a similar model, available as a REST API endpoint: https://pixlab.io/cmd?id=nsfw. The NSFW endpoint lets you detect bloody & adult content. This can help developers automate things such as filtering users' image uploads. A tutorial on using the API is available at https://dev.to/unqlite_db/filter-image-uploads-according-to-....
The solution can also be deployed on-premises for real-time, local video analysis without leaving the deployment environment: https://pixlab.io/on-premises.
Seems like this might be the use case for the OP's project: an overly sensitive filter that allows you to send only positive alerts to an external API for further validation.
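That cascade - a deliberately trigger-happy local model whose positive hits alone get forwarded for expensive validation - could be sketched like so; `local_score` and `remote_check` are hypothetical caller-supplied callables, not PixLab's actual API:

```python
def cascade_filter(frames, local_score, remote_check, local_threshold=0.3):
    """Return the frames confirmed NSFW by the remote check.

    The low `local_threshold` makes the first stage overly sensitive;
    the remote call only ever sees frames the local model flagged.
    """
    confirmed = []
    for frame in frames:
        if local_score(frame) >= local_threshold:   # cheap local pass
            if remote_check(frame):                 # expensive remote validation
                confirmed.append(frame)
    return confirmed
```

The point of the design is cost: most frames never leave the machine, and the remote API's bill scales with the local model's false-positive rate rather than with total footage.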
Surprised there is no free tier here but I’m not experienced in this space.
capableweb|4 years ago:
> it determines the Baywatch intro to be somewhat NSFW at a glance
As nothing in the Baywatch intro is NSFW (https://www.youtube.com/watch?v=O0nqwgu_Us4), it seems the model is broadly failing here.
heeen2|4 years ago:
A debatable claim. There are several scenes in the intro I would not want to have paused fullscreen in a work environment.
sschueller|4 years ago:
I just don't understand the fact that our modern society thinks nudity and sex are bad, but violence, torture and death are OK.
I would much rather have my kid see the Pamela Anderson sex tape than someone being murdered or even waterboarded.