(no title)
maalber | 11 months ago
If you are into visual GenAI, you have probably already seen many examples of quite incredible outputs from OpenAI's new model. Rather than rely on individual examples, we decided to run a large-scale evaluation based on 200k human responses across 13k image pairings.
Unfortunately, that also meant we had to generate a large number of new images, and since OpenAI has not yet opened up API access, we had to do it manually through the UI :(
The benchmark evaluates the model on coherence, prompt alignment, and overall aesthetic preference. On the first two especially, OpenAI's new model is very far ahead of the competition.
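For readers wondering how pairwise human votes like these can be turned into a per-criterion score, here is a minimal sketch. It is not necessarily how this benchmark computes its results: the tuple layout, the placeholder model names, and the `win_rates` helper are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def win_rates(votes):
    """Compute the share of comparisons each model wins, per criterion.

    `votes` is an iterable of (criterion, model_a, model_b, winner) tuples.
    This layout is an assumption for illustration, not the benchmark's format.
    """
    wins = defaultdict(int)
    total = defaultdict(int)
    for criterion, model_a, model_b, winner in votes:
        if winner not in (model_a, model_b):
            raise ValueError(f"winner {winner!r} is not one of the compared models")
        # Both models took part in this comparison.
        total[(criterion, model_a)] += 1
        total[(criterion, model_b)] += 1
        # Only the preferred model gets the win.
        wins[(criterion, winner)] += 1
    return {key: wins[key] / total[key] for key in total}


# Toy votes with placeholder model names (not real benchmark data).
votes = [
    ("coherence", "model_x", "model_y", "model_x"),
    ("coherence", "model_x", "model_y", "model_x"),
    ("coherence", "model_x", "model_y", "model_y"),
    ("alignment", "model_x", "model_y", "model_y"),
]
rates = win_rates(votes)
for (criterion, model), rate in sorted(rates.items()):
    print(f"{criterion:10s} {model:8s} {rate:.2f}")
```

A plain win rate is the simplest aggregate; with many models and uneven numbers of comparisons per pair, a rating model such as Elo or Bradley-Terry may be a better fit.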
Check out the detailed results and the collected data, which is openly available on Hugging Face!
Let me know if you have questions or feedback!