While neat, and no doubt impressive, it still utterly fails on prompts that should be completely reasonable to any sane human being/artist.

Take something like "A cat dancing atop a cow, with utters that are made out of ar-15s that shoot lazer-beam confetti". A vivid image should spring to mind, and no doubt I could imagine an artist having a lot of fun drawing such a scene... Alas, what the model spits out is pure unusable garbage.

TulliusCicero|3 years ago

More complex/weirder prompts aren't going to work yet, no.

What will probably happen with these models is that for more advanced stuff, you may end up using the "inpainting" that Dall-E already has going, where you can sort of mix and match and combine images. That way you could have the cat, for example, rendered separately, thereby simplifying each individual prompt.

topynate|3 years ago

The referent of "utters" (sic) is ambiguous, so I can imagine a model having more difficulty with it than usual. Regardless, the current SOTA does need more specific and sometimes repetitive prompting than a human artist would, but it's surprising how much better results you can get from SOTA models with a bit of experience at prompt engineering.

bluejellybean|3 years ago

This is, in part, what I'm trying to point out: it's an obvious typo given the context, and something that you or I would be able to pick up on, yet it completely breaks the model (it spit out a bunch of weird confetti cats for me). Perhaps I'm being a little harsh, but if it requires word-perfect tuning and prompt engineering, that says something about the 'stupidity' of these models. It's a neat trick, but to call it anything in the realm of artificial intelligence is a bit of a joke.

h2odragon|3 years ago

ITYM "udders"

also try "teats"