top | item 43918542

(no title)

eminence32 | 9 months ago

This seems neat, I guess. But whenever I try tools like this, I often run into the limits of what I can describe in words. I might try something like "Add some clutter to the desk, including stacks of paper and notebooks" but when it doesn't quite look like what I want, I'm not sure what else to do except try slightly different wordings until the output happens to land on what I want.

I'm sure part of this is a lack of imagination on my part about how to describe the vague image in my own head. But I guess I have a lot of doubts about using a conversational interface for this kind of stuff

discuss

order

monster_truck|9 months ago

Chucking images at any model that supports image input and asking it to describe specific areas/things 'in extreme detail' is a decent way to get an idea of what its expecting vs what you want.

thornewolf|9 months ago

+1 to this flow. I use the exact same phrase "in extreme detail" as well haha. Additionally, I ask the model to describe what prompt it might write to produce some edit itself.

crooked-v|9 months ago

I just tried a couple of cases that ChatGPT is bad at (reproducing certain scenes/setpieces from classic tabletop RPG adventures, like the weird pyramid from classic D&D B4 The Lost City), and Gemini fails in just about the same way of getting architectural proportions and scenery details wrong even when given simple, broad rules about them. Adding more detail seems kind of pointless when it can't even get basics like "creature X is about as tall as the building around it" or "the pyramid is surrounded by ruined buildings" right.

BoorishBears|9 months ago

What's an example of a prompt you tried and it failed on?

qoez|9 months ago

Maybe that's how the future will unfold. There will be subtle things AI fails to learn, and there will be differences in skills in how good people are at making AI do things, which will be a new skill in itself and will end up being determining difference in pay in the future.

gowld|9 months ago

This is "Prompt Engineering"

xbmcuser|9 months ago

ask Gemini to word your thoughts better then use those to do the image editing.

metalrain|9 months ago

Exactly, describing more complex compositions, lighting, image enchancements/filters there is so many things you know how it looks but to describe it such that LLM gets it and will reproduce it is pretty difficult.

Sometimes sketching it could be helpful, but more abstract technical thing like LUTs, feels still out of reach.

betterThanTexas|9 months ago

> I'm sure part of this is a lack of imagination on my part about how to describe the vague image in my own head.

This is more related to our ability to articulate than is easy to demonstrate, in my experience. I can certainly produce images in my head I have difficulty reproducing well and consistently via linguistic description.

SketchySeaBeast|9 months ago

It's almost as if being able to create art accurate to our mental vision requires practice and skill, be it the ability to create an image or to write it and invoke an imagine in others.

Nevermark|9 months ago

Perhaps describe the types and styles of work associated with the desk, to create a coherent character to the clutter

bufferoverflow|9 months ago

In that scenario, if you can't describe what you want with words, a human designer can't read your mind either.

Hasnep|9 months ago

No, but a good designer will be able to help you put what you want into words.

zoogeny|9 months ago

I would politely suggest you work at getting better at this since it would be a pretty important skill in a world where a lot of creative work is done by AI.

As some have mentioned, LLMs are treasure troves of information for learning how to prompt the LLM. One thing to get over is a fear of embarrassment in what you say to the LLM. Just write a stream of consciousness to the LLM about what you want and ask it to generate a prompt based on that. "I have an image that I am trying to get an image LLM to add some clutter to. But when I ask it to do it, like I say add some stack of paper and notebooks, but it doesn't look like I want because they are neat stacks of paper. What I want is a desk that kind of looks like it has been worked at for a while by a typical office worker, like at the end of the day with a half empty coffee cup and .... ". Just ramble away and then ask the LLM to give you the best prompt. And if it doesn't work, literally go back to the same message chain and say "I tried that prompt and it was [better|worse] than before because ...".

This is one of those opportunities where life is giving you an option: give up or learn. Choose wisely.