

Yeah. The new challenge seems easier to solve, since it basically hand-holds the LLM toward what the result should look like.

I think a more challenging, well, challenge, would be to offer an even more absurd scenario and see how the model handles it.

Example: generate an svg of a pelican and a mongoose eating popcorn inside a pyramid-shaped vehicle flying around Jupiter. Result: https://imgur.com/a/TBGYChc


simonw|3 months ago

I like the hand-holding because it's a better test of how well models can follow more detailed instructions.

I was inspired by Max Woolf's nano banana test prompts: https://minimaxir.com/2025/11/nano-banana-prompts/

ahmedfromtunis|3 months ago

That's a valid point, but I'd argue the new test would then be interesting to pair with the original one, not to replace it.

Do you think it would be reasonable to include both in future reviews, at least for the sake of backward compatibility (and comparability)?

ethmarks|3 months ago

Which model did you use in the example result? It looks fantastic.