GusRuss89 | 4 years ago
I won't claim to have a good knowledge of the inner workings, but basically CLIP scores how well the image matches your prompt, and that score is used as the loss function for the GAN.
On the Colab notebooks you can specify multiple prompts with different weights, in which case I assume it combines them into one weighted loss that it's trying to optimise.
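Roughly, I'd imagine the weighted multi-prompt loss looks something like this (just a sketch; `clip_similarity` here is a stand-in returning fake scores, not a real CLIP call, and all the names are my own guesses rather than the notebook's actual code):

```python
def clip_similarity(image, prompt):
    # Stand-in: a real implementation would embed the image and the
    # prompt with CLIP and return their cosine similarity.
    fake_scores = {"a sunny beach": 0.8, "oil painting": 0.3}
    return fake_scores.get(prompt, 0.0)

def multi_prompt_loss(image, weighted_prompts):
    # Higher similarity means a better match, so negate the weighted
    # sum to get a loss the optimizer can minimize. The weights let
    # one prompt matter more than another.
    return -sum(w * clip_similarity(image, p) for p, w in weighted_prompts)

loss = multi_prompt_loss("img", [("a sunny beach", 1.0), ("oil painting", 0.5)])
```

In the real pipeline the gradient of that loss would flow back into the GAN's latent vector, nudging the image toward the prompts on each step.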
I'm more of a web app guy than a deep tech guy. Some of what I wrote can probably be corrected by someone with a better knowledge of the ML.