picozeta | 3 years ago
In my experience (CNN-based image segmentation), proven architectures (e.g. U-Net) performed similarly with or without fine-tuning from existing models (mostly pretrained on ImageNet, Cityscapes, etc.) IF the target domain was rather different.
At least in the field of image segmentation, there is not much point in fine-tuning an off-the-shelf model on, say, medical imagery.
So maybe it's the same for the Stable Diffusion model. I don't see how knowledge about the relationship between a prompt and images depicting that prompt should help the model map a prompt to a spectrogram of it.