(no title)
drschwabe | 2 years ago
Was getting good results with just SD and occasional masking, but it would take hours and hours to home in on and composite a complex scene with specific requirements & shapes (with most of the work spent curating the best outputs and then blending them into a scene with Gimp/Inkscape).
Masking is unintuitive compared to the scribble, which gets a similar effect with no need to paint masks (which is disruptive to the natural process of 'drawing' IMO); instead you just make a general black-and-white outline of your scene. Simply dial the conditioning strength up or down to have it follow that outline more tightly or more loosely.
You can also use Gimp's Threshold or Inkscape's Trace Bitmap tool to get a decent black-and-white outline from an existing bitmap to expedite the scribble procedure.
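That thresholding step can be sketched in a few lines with Pillow; this is a minimal approximation of what Gimp's Threshold tool does (the function name and the default threshold of 128 are my own choices, not from the comment):

```python
from PIL import Image

def scribble_from_bitmap(img: Image.Image, threshold: int = 128) -> Image.Image:
    """Reduce a bitmap to a pure black-and-white outline, roughly like
    Gimp's Threshold tool, for use as a ControlNet scribble input."""
    grey = img.convert("L")  # collapse to greyscale first
    # Map every pixel to pure black (0) or pure white (255)
    return grey.point(lambda p: 255 if p >= threshold else 0)
```

You would tune the threshold per image; Inkscape's Trace Bitmap does the same job when you also want an editable vector outline.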
drschwabe|2 years ago
Reminds me of Fireworks, which Adobe killed off (after putting out a decent update or two, to be fair), and which used PNGs for layers and metadata, à la the PSD format.
But it's more analogous to a 3D modelling suite like Blender or Maya, with the theoretical feature that you could take a rendered output image, drag and drop it back into the 3D viewport, and have it instantly restore the exact render settings you used. That would be handy!
l33tman|2 years ago
This is super powerful for visualizing, for example, the renovation of an apartment room or a house exterior.
gkeechin|2 years ago
In this post, we just seek to showcase the fastest way to do it - and how augmentation may potentially help vary the position!
drschwabe|2 years ago
Note that all of the images in those Comfy tutorials (except for images of the UI itself) can be drag-and-dropped into ComfyUI, and you'll get the entire node layout, which you can use to understand how it works.
Another good resource is civit.ai; specifically, look for images that have ComfyUI metadata embedded. I made a feature request that they create a tag for uploaders to flag ComfyUI PNGs, but I'm not sure if they've added that yet. Or peruse Reddit or Discord for people sharing PNGs with Comfy embeds.
Trying out different models (also available from Civitai) is a good way to get an understanding of how swapping out models affects performance and results. I've been abusing AbsoluteReality (v1.81) + the More Details LoRA because it's just so damn fast and the results are great for almost any requirement I throw at it. AI moves so fast, but I don't even bother updating the models anymore; there is so much untapped potential in the models we already have that there's more payoff in mastering other techniques, like the depth-map ControlNet.
I would say that, above all, extensive familiarity with an image editor like Photoshop, Gimp, or Krita will get you the most mileage, particularly if you have specific needs beyond just fun and concepting. AI art makes artists better; people who struggle with image editing will struggle to get the most out of this new tech, just as people who struggle with code will have issues maintaining what Copilot or ChatGPT spits out (versus a coder who will refactor and fine-tune before integrating it into the rest of their application).