_zephyr | 3 years ago
It's been quite a journey since then! I developed the bulk of the code but have had a bit of community help as well. There is now a small but steady user base, and I've learned so much.
* NSFW content detection is a hard task, I believe much harder than general image recognition. Whereas many think of image recognition as "solved", I would argue that NSFW content detection is far from it.
* Having your own dedicated toolset to assist in creating and managing the dataset is invaluable. I was surprised to see that as a single individual it was quite feasible to create a fully-curated dataset in the hundreds of thousands.
* Servers differ from laptops/desktops more than one might think. I had a good time setting up an old Supermicro with K80s on a budget.
* For this problem at least, even several hundred thousand images were still not enough to see an advantage in training from scratch vs. tuning a pre-trained model.
* Fun ways to slice up GIFs at a "transport" layer to allow for good-enough filtering on frames.
* Character encoding detection is a hard problem, and the existing filtering APIs don't do a good job of helping the developer down the right path.
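To make the character encoding point concrete, here is a minimal sketch of one way to guess an encoding before decoding filtered bytes. It is not taken from the addon: `sniffCharset` is a hypothetical helper, the check order (byte-order mark, then `Content-Type` header, then a `<meta>` tag) only roughly follows what browsers do, and the `windows-1252` fallback is an assumption.

```typescript
// Hypothetical helper (not the addon's code): guess the character encoding
// of an HTML response from its Content-Type header and its first bytes.
function sniffCharset(contentType: string | null, head: Uint8Array): string {
  // 1. A byte-order mark, when present, is the strongest signal.
  if (head.length >= 3 && head[0] === 0xef && head[1] === 0xbb && head[2] === 0xbf) {
    return "utf-8";
  }
  if (head.length >= 2 && head[0] === 0xfe && head[1] === 0xff) return "utf-16be";
  if (head.length >= 2 && head[0] === 0xff && head[1] === 0xfe) return "utf-16le";

  // 2. The charset parameter of the Content-Type header, if any.
  const fromHeader = /charset\s*=\s*"?([^\s;"]+)/i.exec(contentType || "");
  if (fromHeader) return fromHeader[1].toLowerCase();

  // 3. A <meta ... charset=...> declaration near the top of the document.
  //    Mapping each byte to one character is enough to find ASCII markup.
  let ascii = "";
  const limit = Math.min(head.length, 1024);
  for (let i = 0; i < limit; i++) ascii += String.fromCharCode(head[i]);
  const fromMeta = /<meta[^>]*charset\s*=\s*["']?([^\s"'>\/;]+)/i.exec(ascii);
  if (fromMeta) return fromMeta[1].toLowerCase();

  // 4. Assumed fallback; a real implementation may pick something else.
  return "windows-1252";
}
```

The returned label could then be passed to `new TextDecoder(label)`. One catch worth knowing: `TextEncoder` only produces UTF-8, so text decoded from another encoding cannot simply be re-encoded the same way after modification.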
So with development being relatively stable for quite some time, why share now? Well, I've kept the addon fairly "primordial" in the sense that I haven't tried to cater too heavily to narrow use cases yet. Based on user feedback, three general use cases seem to be represented (there are others, but this is what I've heard so far):
* Adults casually enabling it for daily browsing. Think things like browsing stock photo sites, Google Images, etc.
* Adults struggling with pornography and looking for tools to help.
* Adults looking for an extra safety net when their kids browse the web.
I'm contemplating adding more specific feature sets in one or more of these areas, but thought that it might be a good time to put it out there and get some perspectives. The tech is far from perfect, but it seems good enough to be helpful for some users.
It's also my hope that there are some things to share here that the HN crew might find of interest. (Although I can assure you the UX is not currently one of them!)
* The addon itself (https://addons.mozilla.org/en-US/firefox/addon/wingman-jr-fi... and https://github.com/wingman-jr-addon/wingman_jr) - maybe it's not your thing, but if you're like me, there's a good chance somebody in your family might find it useful.
* The model (https://github.com/wingman-jr-addon/model) - I've tried to make a competitive model for its size, such that enterprising individuals can try this as an alternative to paying for API calls. It does not use NSFW.js as a base. Both .h5 and TF.js models are provided. Maybe it'll be good enough for your use case?
* Real world examples of the webRequest.StreamFilter API. In particular, the bit about character encoding is probably worth a short read if you're thinking of using this API yourself. See https://github.com/wingman-jr-addon/wingman_jr/blob/79a1a882...
* Examples of image-based logging for Firefox.
* Some fun GIF parsing stuff!
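For a taste of the GIF parsing, here is a minimal sketch of finding frame boundaries in raw GIF bytes and wrapping one frame as a standalone GIF. It is an illustration under my own assumptions, not the addon's code: `findGifFrames`, `frameAsGif`, and `GifFrameSpan` are hypothetical names, and incomplete trailing frames are simply ignored.

```typescript
// Hypothetical sketch (not the addon's code): locate frames in GIF bytes.
interface GifFrameSpan { start: number; end: number } // byte range [start, end)

// Skip a chain of data sub-blocks (size byte + payload) ending in a 0 byte.
function skipSubBlocks(b: Uint8Array, pos: number): number {
  while (pos < b.length && b[pos] !== 0) pos += 1 + b[pos];
  return pos + 1; // step over the terminator (past the end if truncated)
}

// Size in bytes of a color table described by a packed-fields byte.
function colorTableBytes(packed: number): number {
  return (packed & 0x80) !== 0 ? 3 * (1 << ((packed & 0x07) + 1)) : 0;
}

function findGifFrames(b: Uint8Array): { headerEnd: number; frames: GifFrameSpan[] } {
  const sig = String.fromCharCode(b[0], b[1], b[2], b[3], b[4], b[5]);
  if (sig !== "GIF87a" && sig !== "GIF89a") throw new Error("not a GIF");
  // 6-byte header + 7-byte logical screen descriptor + optional global table.
  let pos = 13 + colorTableBytes(b[10]);
  const headerEnd = pos;
  const frames: GifFrameSpan[] = [];
  let frameStart = pos; // extensions before an image are counted with it
  while (pos < b.length) {
    const kind = b[pos];
    if (kind === 0x21) {
      // Extension: introducer, label byte, then sub-blocks.
      pos = skipSubBlocks(b, pos + 2);
    } else if (kind === 0x2c) {
      // Image: 10-byte descriptor, optional local table, LZW size, sub-blocks.
      let next = pos + 10 + colorTableBytes(b[pos + 9]);
      next = skipSubBlocks(b, next + 1);
      if (next > b.length) break; // frame not fully received yet
      frames.push({ start: frameStart, end: next });
      pos = next;
      frameStart = next;
    } else {
      break; // 0x3B trailer, or something unexpected
    }
  }
  return { headerEnd, frames };
}

// Wrap one frame with the original header so it can be decoded on its own.
function frameAsGif(b: Uint8Array, headerEnd: number, f: GifFrameSpan): Uint8Array {
  const out = new Uint8Array(headerEnd + (f.end - f.start) + 1);
  out.set(b.subarray(0, headerEnd), 0);
  out.set(b.subarray(f.start, f.end), headerEnd);
  out[out.length - 1] = 0x3b; // GIF trailer
  return out;
}
```

Because the scan never decompresses pixel data, it can run on bytes as they arrive. A frame extracted this way may not look exactly like the animated result, since GIF frames can rely on earlier frames through disposal and transparency, which is why this is only "good enough" for filtering.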
Thanks!