
Concept sliders: LoRA adaptors for precise control in diffusion models

153 points | billconan | 2 years ago | sliders.baulab.info

57 comments

[+] barnabyjones|2 years ago|reply

So I suppose the main difference with using LoRAs normally is that less of the image is changed? Because it seems most of this could be done with inpainting and increasing LoRA strength, but I admit it would be difficult to keep facial features the same the way they seem to have done. What I notice is missing from these examples is actions/poses, so I wonder if it's good for that or we still need openpose/controlnet.

[+] dragonwriter|2 years ago|reply

> So I suppose the main difference with using LoRAs normally is that less of the image is changed?

The distinct thing about a concept slider LoRA isn't how much of the image is changed (which varies widely between LoRAs of this type), but what the weight does: rather than just setting how strongly the image tends toward representing a fixed concept, the weight at which the LoRA is applied chooses which concept on a continuum the image tends toward.

> What I notice is missing from these examples is actions/poses, so I wonder if it's good for that or we still need openpose/controlnet.

You can do actions/poses via LoRA, but the control you get is qualitatively different from what you get with any of the controlnets, so it's good to have both tools available. I haven't seen concept slider LoRA specifically being used for poses, though conceptually doing something like seated<->standing or standing<->running as a concept slider would make some sense.

[+] yorwba|2 years ago|reply

These are LoRAs being used normally, and the level of modification is controlled by varying LoRA strength.

The main difference is in the training procedure, where they try to only target specific attributes while leaving others unchanged.

You could probably use this to create LoRAs for a specific pose, but if you want to try out many different poses and freely adjust them, the more flexible controlnet approach is likely to be more comfortable.
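To make the "strength" knob discussed above concrete, here is a minimal pure-Python sketch, assuming the standard LoRA formulation in which a low-rank product is added to a frozen weight matrix and multiplied by a single scale. The names `matmul`, `apply_lora`, `random_matrix`, `base_weight`, `down`, `up` and `scale` are hypothetical, and the matrices are random stand-ins, not weights from any real slider.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(base_weight, down, up, scale):
    """Return base_weight + scale * (up @ down), the usual LoRA merge."""
    delta = matmul(up, down)  # low-rank update, same shape as base_weight
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base_weight, delta)]

def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of Gaussian noise (illustrative data only)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_out, d_in, rank = 4, 4, 1
base_weight = random_matrix(d_out, d_in, rng)  # frozen model weight (stand-in)
down = random_matrix(rank, d_in, rng)          # rank x d_in factor
up = random_matrix(d_out, rank, rng)           # d_out x rank factor

# Negative, zero and positive scales: zero leaves the model untouched,
# and opposite signs push the weights in opposite directions.
for scale in (-1.0, 0.0, 1.0):
    merged = apply_lora(base_weight, down, up, scale)
    print(scale, merged[0][0] - base_weight[0][0])
```

In this picture a slider is still an ordinary LoRA. What makes it a slider is that training aims for the scale, including negative values, to move the output along one attribute.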

[+] neural_thing|2 years ago|reply

Ooh, I've been waiting for this. Exciting! Can't wait to try it out in a tool.

[+] simbolit|2 years ago|reply

Been using these for some weeks: https://civitai.com/search/models?sortBy=models_v5&query=Sli...

[+] verdverm|2 years ago|reply

I wonder if this can also work for text generation?

[+] edkennedy|2 years ago|reply

There are so many overlaps between diffusion and LLMs, I wonder about this kind of thing all the time!

[+] unknown|2 years ago|reply

[deleted]

[+] mpaepper|2 years ago|reply

Can you also apply the sliders to existing images that were not generated with SD?

[+] dragonwriter|2 years ago|reply

You can only apply a LoRA (concept slider or otherwise) to the model (well, model “family”) it is based on, not to an image directly; it's a model modifier that adjusts model behavior, not an image filter.

You can use SD with image as an input in place of or in addition to a prompt, and you could apply one or more LoRA (concept slider or otherwise) to the model when doing so.

[+] z3phyr|2 years ago|reply

I always confuse the subject with the radio technology LoRaWAN

[+] devsda|2 years ago|reply

Had the same thought. It might already be known to those keeping track of AI tech, but I wish the article mentioned what LoRA meant at least once.

For those in the same boat, it apparently means Low-Rank Adaptation (of Large Language Models).
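As a rough illustration of why the "low rank" part matters, the sketch below counts trainable parameters for a full update of one weight matrix versus a rank-r factorization of that update. The layer size and rank are made-up example values and the function names are hypothetical.

```python
def full_update_params(d_out, d_in):
    """Parameters needed to store a full d_out x d_in weight update."""
    return d_out * d_in

def lora_params(d_out, d_in, rank):
    """Parameters for two factors: (d_out x rank) and (rank x d_in)."""
    return rank * (d_out + d_in)

# Example values only; real layers and ranks vary by model and adapter.
d_out, d_in, rank = 768, 768, 4
full = full_update_params(d_out, d_in)
low = lora_params(d_out, d_in, rank)
print(full, low, full // low)  # prints: 589824 6144 96
```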

[+] unknown|2 years ago|reply

[deleted]

[+] itpcc|2 years ago|reply

Wait, how is an IoT radio thing related to AI stuff?

[+] stavros|2 years ago|reply

That's LoRa, not LoRA.

[+] UberFly|2 years ago|reply

"Artists spend significant time crafting prompts and finding seeds to generate a desired image with text-to-image models."

I wish there was a different description of those who generate art via AI than "artists". When ChatGPT writes me a story based on my prompts I'm still not an "author". Sorry, I know it's a tired argument already but it'll never stop bugging me.

[+] PeterStuer|2 years ago|reply

First of all, being labeled an 'artist' has little relation to the tools used to create 'art'. 'Art' by 'artists' does include the Sistine Chapel and the Matthew Passion, but also throwing a bucket of paint at a canvas, wrapping a bridge, having ice melt and dragging a urinal into a museum.

That does not make you an artist whenever you spill ketchup on your shirt, wrap a Christmas gift or pour yourself a scotch on the rocks.

Second, while you can indeed just type a prompt into Midjourney and some images will be generated, there are 'artists' using diffusion models to meticulously craft 'art' using hundreds of iterations combining dozens of models and LoRAs with inpainting, outpainting, recomposing and fusing, drawing and more to come to a unique piece of 'art'.

Third, didn't we have the same thing before? Are DJs musicians? Some most definitely are, creating new music by adapting, restructuring, remixing and extending existing tracks, while others use turntables as a real instrument. Others just put on some records at a wedding.

It's about intention and context, and yes, creativity and skill, not about the tools used.

[+] dataangel|2 years ago|reply

Artist is still going to be the right word with these kinds of tools because you're still adjusting the result to get the desired look. Even if it were just collage it would still be art.

[+] fzzzy|2 years ago|reply

I wish there was a different description of those who generate art via Word Processor than "artists". If they aren't impressing upon a cuneiform tablet I am not interested.

[+] minimaxir|2 years ago|reply

AI art and its modern tooling ecosystem are at a point now where it unironically takes more effort to generate an intentionally innovative AI image than to draw it from scratch.

Just look at ComfyUI: https://github.com/comfyanonymous/ComfyUI

[+] flessner|2 years ago|reply

This discussion might be heavy on semantics, but I don't think you are wrong for bringing this up.

AI can be used as a tool to empower authors and artists to create more and/or better works, but it can also be used by a lazy student to finish their 2 page writing assignment in seconds.

The only thing AI will clearly do over the next few years is raise the bar. Beginner art or copywriting is already outclassed, and assignments will assume access to AI tools, similar to how many assume internet access nowadays.

Nevertheless, I truly believe that there will always be space for human art... just look at chess: with engines far better than humans, what are the tournaments about? The players.

[+] artninja1988|2 years ago|reply

Do you consider photographers artists? I really don't see why we should be gatekeeping the term either way

[+] lbeltrame|2 years ago|reply

While I won't ever call myself an artist, there are people who do not just click and get an image. My workflow involves getting the right prompt out, then manual touch up of the images (sometimes drastically so) using Krita and a Wacom tablet, or composing scenes with 3D mannequins from CLIP STUDIO PAINT and then feeding them to ControlNet (up to several CNs at once), generating and touching up again. I would argue there's a definite creative process here.

[+] iteygib|2 years ago|reply

I completely agree with you. I think AI art is a good tool, but referring to the users as "artists" is confusing to actual practice. You aren't creating the content, but rather are dictating it like a patron or client. You are just telling an engine what you hope to see, in the same way someone might tell you what to paint while watching over your shoulder.

It's also not a tired argument. You are just posting in a place that is very pro tech and not art-oriented. If you posted this in an art forum you'd get the opposite reaction.

[+] dragonwriter|2 years ago|reply

> I wish there was a different description of those who generate art via AI than "artists"

Typically, if you generate art, regardless of the tool or medium or form, you are an artist.

There may also be a more specific term for artists working with/in a specific tool, medium, or form, but the general term still applies.