Yes, the prompt is composed of 7 different layers, each grouping together coherent visual and temporal responsibilities. Depending on the scene, I usually only change 3-5 layers, while the base layers stay the same, so the scenes all appear within the same story universe and the same style. If something feels off, or feels like it needs improvement, I adjust one layer at a time to see the effect on the entire story, but also at the individual scene level. Over time I have built up quite a few 7-layer style profiles that work well and that I can cast onto different story universes. Keep in mind this is heavy experimentation; there may well be a much easier way to do this, but I am seeing success with it. https://edwin.genego.io/blog/lpa-studio - at any point I may throw this all out and start over, depending on how well my understanding of it all develops.

Bounding boxes: I actually send an image with a red box drawn around where the requested change is needed, and 8 out of 10 times it works well. When it doesn't, I use Claude to refine the prompt: the Claude API call I make can see the image and the prompt, as well as understanding the layering system. This is one of the 3 ways I edit; in another, I just send the prompt to Claude without it looking at the image. Right now this all feels like dial-up, with a minimum of $0.035 per image generation ($0.0001 if I just use a LoRA though) and a minimum of 12-14 seconds of waiting on each edit/generation.
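A minimal sketch of how a layered prompt system like the one described above could be composed. The layer names and base profile here are my own invention for illustration; the author doesn't specify their actual 7 layers. The idea is just that a scene overrides a few layers while the base profile keeps the universe and style consistent:

```python
# Hypothetical sketch of a 7-layer prompt profile.
# Layer names are invented for illustration, not the author's actual layers.
BASE_STYLE = {
    "universe": "neo-noir city, persistent rain",   # stays fixed across scenes
    "style": "cinematic, 35mm film grain",
    "palette": "teal shadows, sodium-orange highlights",
    "camera": "eye-level, 50mm lens",
    "lighting": "low-key, practical light sources",
    "subject": "",                                  # scene-specific
    "action": "",                                   # scene-specific
}

LAYER_ORDER = ["universe", "style", "palette", "camera",
               "lighting", "subject", "action"]

def compose_prompt(scene_overrides: dict) -> str:
    """Merge scene-specific layers over the base profile and join
    the non-empty layers into a single generation prompt."""
    layers = {**BASE_STYLE, **scene_overrides}
    return ", ".join(layers[name] for name in LAYER_ORDER if layers[name])

# A scene typically swaps only 3-5 layers; the untouched base layers
# keep every scene in the same story universe and visual style.
scene_prompt = compose_prompt({
    "subject": "a courier in a wet trench coat",
    "action": "crossing an empty intersection",
    "camera": "high angle, slow push-in",
})
```

Debugging a scene then amounts to changing one layer at a time and regenerating, which matches the one-layer-after-the-other experimentation described above.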
yard2010|3 months ago
Who would have thought that we would reach this uncharted territory with so many opportunities for pioneering and innovation? Back in 2019 it felt like there was nothing new under the sun; today it feels like there is a whole new world under the sun for us to explore!
Genego|3 months ago