One thing it couldn't do is a transparent background. The model just draws the checkerboard pattern as the background instead of producing real alpha-channel transparency. You can even see artifacts in the pattern.
The training data is presumably full of examples of people using the pattern to indicate transparency (and explaining that they do so, like the input for 50!), and has far fewer examples of people actually creating such images (if the training data even preserves the alpha channel in the first place).
I think a bigger problem is the "artifacts" you describe (worse than that sounds to me).
Yeah, mangled checkerboard patterns are common when the model is prompted to "remove" the background. You can work around it by generating multiple images in which only the background color varies (e.g. black and white) and reconstructing the alpha channel from their difference. This works because the model generally prefers to just copy and paste the rest of the image when nothing else in the prompt overrides that preference.
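For anyone who wants to try it, here is a rough per-pixel sketch of that reconstruction. It assumes ordinary "over" compositing, 8-bit RGB values, and a foreground that really is identical in the black-background and white-background renders, which the model won't always give you. `recover_pixel` is a made-up name for illustration, not something from any library.

```python
def recover_pixel(on_black, on_white):
    """Recover a straight-alpha (r, g, b, a) pixel from two 8-bit RGB renders.

    on_black: the pixel as rendered over a pure black background
    on_white: the same pixel as rendered over a pure white background
    """
    # Over black: C_b = alpha * F.  Over white: C_w = alpha * F + 255 * (1 - alpha).
    # So C_w - C_b = 255 * (1 - alpha) in every channel; average the three
    # channels to damp small differences between the two renders.
    diff = sum(w - b for b, w in zip(on_black, on_white)) / 3.0
    alpha = min(1.0, max(0.0, 1.0 - diff / 255.0))
    if alpha == 0.0:
        # Fully transparent: the foreground colour is undefined, so use black.
        return (0, 0, 0, 0)
    # The black render is the premultiplied colour; divide by alpha to undo that.
    rgb = tuple(min(255, round(b / alpha)) for b in on_black)
    return rgb + (round(alpha * 255),)
```

Run it over every pixel pair of the two images and write the result out as RGBA. If the two renders differ in the foreground, the per-channel differences will disagree and the recovered alpha will be noisy there.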
“Just do more manual work and waste even more energy so you can take yet another manual step and finally get what you wanted.” A real time-saver, that.
zahlman|5 months ago
lifthrasiir|5 months ago
filoeleven|5 months ago