If it doesn't understand meaning and context, it can't work.
It will over-report on video of baby's first bath, beach volleyball, and high school wrestling; it will under-report on many of the kinkier or more subtle forms of porn.
Humans get this stuff wrong all the time simply by having different cultural contexts; there's no way the 2022 state of the art is up to the challenge.
Fully agree - this seems like a project rather limited in scope. And that's OK; the authors even acknowledge it.
It's not an absolute judge of SFW-ness, nor does it claim to be. I can see it as a tool to skim a large set of videos for further human review.
Besides, beach volleyball very much is NSFW depending on the situation.
I want to believe it's a matter of adding human labor to the pool of knowledge in the AI engine(s): "this is, this is not" - now go do the learning thing.
But when it comes to online video checking, it's another tool in the arsenal; it's the first line of defense. First check: does the signature match a previously known and confirmed video marked as porn or otherwise unacceptable? Second check: does the fuzzier AI thing consider it porn with high probability? Third check: things the AI marks as 'not sure' or incorrectly marks as 'porn', along with things humans flag up manually, get checked and judged by humans.
And of course it'll be flexible, because the definition of porn - or what is 'not acceptable' - will vary by country and culture, and these companies want to be active everywhere.
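Sketched in Python, that three-stage pipeline might look like the following. Everything here is illustrative - the hash set, thresholds, and function names are made up, and a real system would use perceptual hashing (so near-duplicates still match) rather than an exact SHA-256 lookup:

```python
import hashlib

# Illustrative stand-in for a database of previously confirmed videos.
# This entry is the SHA-256 of the bytes b"test".
KNOWN_BAD_HASHES = {
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"
}

def signature_check(video_bytes: bytes) -> bool:
    """First check: exact match against known, confirmed videos."""
    return hashlib.sha256(video_bytes).hexdigest() in KNOWN_BAD_HASHES

def moderate(video_bytes: bytes, model_score: float) -> str:
    """Route a video through the three checks described above."""
    if signature_check(video_bytes):          # 1. signature match
        return "blocked"
    if model_score >= 0.95:                   # 2. fuzzy AI, high probability
        return "blocked"
    if model_score >= 0.5:                    # 3. model unsure -> human review
        return "human_review"
    return "allowed"
```

Videos landing in `human_review` (plus anything users flag manually) would then feed back into the known-signature database, which is what makes the first, cheapest check grow more useful over time.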
except Europe, unless they are given tax breaks and data, amirite
Why does everything have to be perfect? Humans get this wrong all the time as you say, and we still do it - right?
Seems like we already acknowledge good-enough practices and are okay with them. Let's not demand perfection, in the same way we shouldn't assume perfection of ~~AI~~ ML.
It could potentially revolutionize the opposite application, those sites where someone manually finds and catalogs all the time points at which movies contain nudity so people can see their favorite celebs in the buff. False positives are probably worth the automation there.
yeah but this is true for profanity filters as well, and yet useful profanity filters still exist. Is it 100% accurate? No. But it's probably still useful.
- Babies (https://github.com/wingman-jr-addon/wingman_jr/issues/22)
- Beach volleyball (but this definitely has SFW and NSFW variants, based on a somewhat subjective line)
- Athletes in general. The model particularly thought some American football players were NSFW for a long time.
- Swimming
- Yoga - again, mostly SFW with some NSFW here, but it still struggles
- Wrestling was a tough one for sure
- Pokemon
While indeed tough, I've seen definite progress. So it's not just a matter of tech, but also of considering the human element - the state of the art may not be up to the challenge of perfection, but it is definitely up to a point of true utility for some use cases. I'm happy about that.
As a note, it uses an EfficientNet Lite L0 backbone - I'm a bit limited in what type of scanning I can perform in a sufficiently speedy manner.
I also agree on the context for sure - one reason I haven't tried switching to an object detection method (and that I don't rely heavily on truly random crops) is that the focus of the image is highly important for the NSFW-ness in some cases. True, two images may contain the same content ... but one is far worse than the other. The nature of CNNs still has some of this location-invariance baked in, but I don't want to exacerbate it.
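One simple way to lean into that focus sensitivity - purely a sketch of the idea, not wingman-jr's actual code - is to blend a center-crop score with the full-frame score, on the assumption that the subject the camera frames tends to sit near the center:

```python
def center_weighted_score(image, score_fn, center_weight=0.7):
    """Blend a center-crop score with the full-frame score.

    `image` is a 2D grid of pixels and `score_fn` is any classifier
    stub returning a score in [0, 1]; both are illustrative stand-ins.
    """
    h, w = len(image), len(image[0])
    # Take the middle half of the frame in both dimensions.
    crop = [row[w // 4: 3 * w // 4] for row in image[h // 4: 3 * h // 4]]
    return center_weight * score_fn(crop) + (1 - center_weight) * score_fn(image)
```

The weighting is a knob: at `center_weight=1.0` this degenerates into hard center-cropping, which would throw away edge content entirely.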
One challenge I think the OP may run into here that may also not be immediately obvious is that accuracy on image stills does not translate that well to video. I have basic video support in my addon, and while I knew there would be some differences, I was surprised at how many discrepancies there really are. As two examples:
- Images in video are often blurrier. In true still images, there is a somewhat higher prior involved with amateur NSFW content and blurriness. This can be a source of false positives.
- The opposite of the note above about focus. Taking stills from moving images yields many transitory frames that seem inappropriate on their own, because it looks as if the camera is focusing on something when in reality it is just panning - obvious to a human, less so to a model trained on stills.
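A cheap mitigation for those transitory pan frames - again a sketch of the general idea, not the addon's actual logic - is to require a sustained run of high per-frame scores before flagging a video, so a single spiking frame is ignored:

```python
def flag_video(frame_scores, threshold=0.8, min_run=5):
    """Flag only when `min_run` consecutive frame scores reach `threshold`.

    A lone panning frame that momentarily scores high is ignored;
    genuinely inappropriate footage tends to score high for a stretch.
    """
    run = 0
    for score in frame_scores:
        run = run + 1 if score >= threshold else 0
        if run >= min_run:
            return True
    return False
```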
At any rate, given how well your list of edge cases coincided with failures I've grappled with, I'd be interested to see how well you think my addon stacks up for still images when set to stay in "normal" mode. I'd love to hear any feedback you have via GitHub so I can make it better.
If this was for a product, the first thing I would have done was drop half the tools in the stack for their inefficiency! For example, the GUI shell is using Node Webkit (which I prefer to Electron, but it's essentially the same thing). That in itself is quite bad, but it's not the worst approach to building a desktop app, as Microsoft and Slack have already proven.
But the .exe's you mentioned are also quite large.
The code is very transparent though.
I doubt you'd have been the only one to note how clunky this all is!
Interesting subject and lots of praise to be had if the model can be made accurate, but seems it's not really there yet. I wish you luck in getting there!
Here's a blog post where I explain my motivation for creating the program - https://raskie.com/post/practical-ai-autodetecting-nsfw. There are some hyperparameters to tune, but you're right, it's extremely flaky. Part of that might be down to my work (I personally wanted the app to be prudish enough to flag the Baywatch intro), but the off-the-shelf model I used for this does produce some strange results. I kind of wanted to use this app as a conversation piece to discuss AI with people who aren't entirely technical, or with technical people who wonder what AI can and can't do.
I would suppose that depends on the workplace and the area (and time) where you live. It is definitely "somewhat" sexualised content in my opinion (a close-up of a woman's body stripping down), even though there is no nipple to see. But sure, lots of normal advertising sells with sex. So is the red line when a nipple is shown? That would classify biological and medical content as NSFW.
My point is, human morals are very different. Humans fight about what content is NSFW. A computer model can never be "accurate" in that sense.
There's no such thing as a perfect model. The acknowledgement is welcome.
Per the Baywatch example - there's clearly a grey area here, and in my opinion the model is working as intended. False positives for NSFW are better than false negatives.
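Preferring false positives over false negatives boils down to where you put the decision threshold: lowering it raises recall (fewer missed NSFW items) at the cost of precision. A toy illustration with made-up scores and labels:

```python
def precision_recall(scores, labels, threshold):
    """Compute precision/recall when flagging items with score >= threshold.

    `labels` uses 1 for truly-NSFW items. Purely illustrative code.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall
```

On a made-up batch, dropping the threshold catches every NSFW item while flagging a few extra safe ones - exactly the trade being argued for here.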
Clearly incorrect by any reasonable definition unless you work at a swimwear company or a few bits of the media.
I mean, you or I might think Alexandra Paul in a one-piece is an image approaching the beauty of classical sculpture, but quite a lot of classical sculpture is NSFW in most of the world, and the brow is a bit lower where Baywatch is concerned.
I would consider the Baywatch intro NSFW. Perhaps you don't consider it so because it's Baywatch, but if it was a random video of closeups of people in bathing costumes I bet you would. One thing I find really interesting with AI/Machine Learning is how it can cause you to re-evaluate your own biases. I say the AI is right here and you are wrong.
I wonder if it would be possible not only to single out scenes in a movie, but also to use IMDb data to help detect which actor/actress is in said scenes, for categorization. Could be an extremely useful tool.
That's an interesting idea. There are off-the-shelf tools that can recognise famous people already (Amazon Rekognition, for example). The problem, in connection with this hack, is not wanting to call a third-party web service.
Wouldn't it be easier to use facial recognition for that? In fact I recall years ago Google Play Movies already did that, where you could pause the movie and it would tell you the name of the faces on screen at that moment.
Or do you mean if you're looking for NSFW scenes of a particular actor?
Shameless plug: at PixLab, we offer a similar model, available as a REST API endpoint: https://pixlab.io/cmd?id=nsfw. The NSFW endpoint lets you detect bloody & adult content. This can help developers automate things such as filtering users' image uploads. A tutorial on using the API is available at https://dev.to/unqlite_db/filter-image-uploads-according-to-....
The solution can also be deployed on-premises for real-time, local video analysis without leaving the deployment environment: https://pixlab.io/on-premises.
Seems like this might be the use case for the OP's project: an overly sensitive filter that allows you to send only positive alerts to an external API for further validation.
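That cascade - a deliberately trigger-happy local model whose positive hits alone get forwarded for expensive validation - could be sketched like so; `local_score` and `remote_check` are hypothetical caller-supplied callables, not PixLab's actual API:

```python
def cascade_filter(frames, local_score, remote_check, local_threshold=0.3):
    """Return the frames confirmed NSFW by the remote check.

    The low `local_threshold` makes the first stage overly sensitive;
    the remote call only ever sees frames the local model flagged.
    """
    confirmed = []
    for frame in frames:
        if local_score(frame) >= local_threshold:   # cheap local pass
            if remote_check(frame):                 # expensive remote validation
                confirmed.append(frame)
    return confirmed
```

The point of the design is cost: most frames never leave the machine, and the remote API's bill scales with the local model's false-positive rate rather than with total footage.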
Surprised there is no free tier here but I’m not experienced in this space.
capableweb|4 years ago:
> it determines the Baywatch intro to be somewhat NSFW at a glance
As nothing in the Baywatch intro is NSFW (https://www.youtube.com/watch?v=O0nqwgu_Us4), it seems the model is broadly failing here.
heeen2|4 years ago:
A debatable claim. There are several scenes in the intro I would not want to have paused fullscreen in a work environment.
sschueller|4 years ago:
I just don't understand the fact that our modern society thinks nudity and sex are bad, but violence, torture and death are OK.
I would much rather have my kid see the Pamela Anderson sex tape than someone being murdered or even waterboarded.