convexfunction | 3 years ago
So, take a Stable Diffusion checkpoint (call it "A") that was only lightly trained on some subset of an artist's work, then fine-tune it on the full corpus of that artist's work to the point where it's still coherent/"good" and just shy of actually memorizing the fine-tuning data (call the resulting model "B"). Now define model "C" as 2A - B (i.e. A + (A - B), where A - B is the artist's task vector multiplied by -1). Can you still produce qualitatively similar images with model C? Whether with the exact same prompt, with the same prompt minus "in the style of Kinkade" (which doesn't mean as much once Kinkade's task vector has been subtracted), or with any prompt whatsoever?
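The C = 2A - B construction is just parameter-wise arithmetic over the two checkpoints. A minimal sketch, using plain dicts of scalar "parameters" as a stand-in (real Stable Diffusion checkpoints are torch state_dicts of tensors, and the `scale` knob here is my own addition for tuning how hard you subtract the task vector):

```python
def subtract_task_vector(a, b, scale=1.0):
    """Given base checkpoint A and fine-tuned checkpoint B (dicts mapping
    parameter name -> value), return C = A + scale * (A - B), i.e. A with
    the task vector (B - A) negated. scale=1.0 gives C = 2A - B."""
    assert a.keys() == b.keys(), "checkpoints must share the same parameters"
    return {name: a[name] + scale * (a[name] - b[name]) for name in a}

# Toy example: pretend B was fine-tuned on the artist's corpus, so its
# weights drifted from A's. The per-parameter arithmetic is identical
# for real tensors.
A = {"w": 1.0, "bias": 0.5}
B = {"w": 1.4, "bias": 0.3}
C = subtract_task_vector(A, B)
```

In practice you'd also want `scale` < 1.0 in the loop: subtracting the full task vector can degrade the model on everything else, so you'd sweep the scale and pick the largest value that still generates coherent images.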
Lots of issues with this as laid out -- it's definitely not quite the same as "forgetting" Kinkade from the training data, and "any prompt whatsoever" introduces tons of leeway, and most good AI-assisted art is not just an unmodified single text-to-image output anyway -- but it might be a promising direction to explore.
(Strongly disagree with the "copyright laundry" characterization, by the way.)