top | item 40309787

(no title)

shenberg | 1 year ago

The CLIP plot (Fig. 2) is damning, however some of the generative models show flat responses in Fig. 3 (e.g. Adobe GigaGAN, DALL-E-mini). While those are on the one hand technically linear relationships, but are also exactly what we'd want: image generation aesthetic score that doesn't care about concept frequency. Maybe the issue is with the contrastive training target used in CLIP?

discuss

No comments yet.