top | item 45499848

xvv | 4 months ago

This model did suspiciously well on the pelican test. Simon has mentioned something along the lines of AI companies gaming his benchmark, and that he uses other random absurd prompts to thwart that.

simonw | 4 months ago

Dedicated image generation models like this one and Midjourney and DALL-E and Nano Banana and Stable Diffusion have all been able to generate really good pelicans riding bicycles for a couple of years now, but only as JPG/PNG/WEBP raster images.

The real pelican test is about whether a text-producing model can spit out SVG code that renders a good vector illustration, which is a much harder (and sillier) task.
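
To make the distinction concrete: a text model's answer is markup, not pixels, so the first thing anyone can check is whether the reply even parses as SVG. Below is a rough Python sketch of that check, using only the standard library. The `sample_reply` string is a hand-written placeholder (not real model output) and `summarize_svg` is a hypothetical helper name; whether the drawing actually looks like a pelican on a bicycle still needs a human to render it and look.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hand-written placeholder standing in for a model's reply (an assumption,
# NOT real model output): two wheels, a frame line, and a blob for the bird.
sample_reply = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 120">
  <circle cx="60" cy="90" r="20" fill="none" stroke="black"/>
  <circle cx="140" cy="90" r="20" fill="none" stroke="black"/>
  <path d="M60 90 L100 60 L140 90" fill="none" stroke="black"/>
  <ellipse cx="100" cy="45" rx="18" ry="12" fill="white" stroke="black"/>
</svg>"""


def summarize_svg(text):
    """Parse text as XML and count elements by tag name.

    Raises ValueError if the text is not well-formed XML or the root
    element is not <svg> in the SVG namespace.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError as exc:
        raise ValueError(f"not well-formed XML: {exc}") from exc
    if root.tag != f"{{{SVG_NS}}}svg":
        raise ValueError(f"root element is {root.tag!r}, not an SVG root")
    counts = {}
    for el in root.iter():  # includes the root <svg> element itself
        name = el.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
        counts[name] = counts.get(name, 0) + 1
    return counts


print(summarize_svg(sample_reply))
```

This only tells you the output is structurally valid SVG and which shapes it uses. It says nothing about quality, which is the part the test is really about.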

xvv | 4 months ago

Oh shoot, good catch. No more HN comments after midnight!