Benjamin_Dobell | 3 months ago

For background removal (at least for my niche use case of removing backgrounds from kids' drawings: https://breaka.club/blog/why-were-building-clubs-for-kids), I think BiRefNet v2 is still working slightly better.

SAM3 seems to trace the images less precisely. It'll discard places where kids have drawn outside the lines a bit, which is okay, but it also seems to struggle around sharp corners and includes a bit of the white page that I'd like cut out.

Of course, SAM3 is significantly more powerful in that it does much more than simply cut out images. It seems to be able to identify what these kids' drawings represent. That's very impressive, since AI models are typically trained on photos and adult illustrations and tend to struggle with children's drawings. So I could perhaps still use this for identifying content, giving kids more freedom to draw what they like, but then, unprompted, attach appropriate behavior to their drawings in-game.

warangal|3 months ago

I know it may not be what you are looking for, but most such models generate multi-scale image features through an image encoder, and those can be fine-tuned very easily for a particular task, like polygon prediction for your use case. I understand the main benefit of promptable models is to reduce or remove this kind of work in the first place, but it could be worth it, and much more accurate, if you have a specific high-load task!

florians|3 months ago

Curious about background removal with BiRefNet. Would you consider it the best model currently available? What other options exist that are popular but not as good?

Benjamin_Dobell|3 months ago

I'm far from an expert in this area. I've also tried Bria RMBG 1.4, Bria RMBG 2.0, older BiRefNet versions, and I think one other whose name I've forgotten. The fact that I'm removing backgrounds that are predominantly white (a sheet of paper) in the first place probably changes things significantly. So it's hard to extrapolate my results to general background removal.

BiRefNet 2 seems to do a much better job of correctly removing background that sits inside the content's outline. So like hands on hips, that region that's fully enclosed but you want removed. It's not just that, though. Some other models will remove this, but they'll be overly aggressive and also remove white areas where kids haven't coloured in perfectly, or the intentionally blank whites of eyes, for example.

I'm putting these images in a game world once they're cut out, so if things are too transparent, they look very odd.
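
To make the trade-off above concrete, here's a rough sketch of the two obvious non-ML heuristics for a white-paper background. This is only an illustration, not what BiRefNet or any of the other models do. The function names, the nested-list grayscale format, and the 240 cutoff are all assumptions made up for the example.

```python
from collections import deque

WHITE_THRESHOLD = 240  # assumed cutoff for "paper white" on a 0-255 grayscale


def threshold_mask(gray):
    """Naive heuristic: keep a pixel only if it is darker than paper white."""
    return [[px < WHITE_THRESHOLD for px in row] for row in gray]


def border_flood_mask(gray):
    """Remove only the near-white pixels that connect to the image border."""
    h, w = len(gray), len(gray[0])
    keep = [[True] * w for _ in range(h)]
    queue = deque()
    # Seed the flood fill with every near-white pixel on the border.
    for y in range(h):
        for x in range(w):
            on_border = y in (0, h - 1) or x in (0, w - 1)
            if on_border and gray[y][x] >= WHITE_THRESHOLD:
                keep[y][x] = False
                queue.append((y, x))
    # Spread through 4-connected near-white neighbours.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and keep[ny][nx]
                    and gray[ny][nx] >= WHITE_THRESHOLD):
                keep[ny][nx] = False
                queue.append((ny, nx))
    return keep


# A dark ring with a white centre: think "white of an eye" or,
# just as plausibly, the gap between an arm and the body.
ring = [
    [255, 255, 255, 255, 255],
    [255,   0,   0,   0, 255],
    [255,   0, 255,   0, 255],
    [255,   0,   0,   0, 255],
    [255, 255, 255, 255, 255],
]

print(threshold_mask(ring)[2][2])     # False: the enclosed white is cut out
print(border_flood_mask(ring)[2][2])  # True: the enclosed white is kept
```

Thresholding cuts out every enclosed white, so the whites of eyes go transparent. The border flood fill keeps every enclosed white, so the hands-on-hips gap stays opaque. Neither can tell the two cases apart from pixel values alone, which is the part a learned model has to get right.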