Show HN: New AI edits images based on text instructions
1098 points | bryced | 3 years ago | github.com
Here are some examples of transformations it can make:
Golden Gate Bridge: https://raw.githubusercontent.com/brycedrennan/imaginAIry/ma...
Girl with a Pearl Earring: https://raw.githubusercontent.com/brycedrennan/imaginAIry/ma...
I integrated this new InstructPix2Pix model into imaginAIry (a Python library) so it's easy for Python developers to use.
[+] [-] sandworm101|3 years ago|reply
[+] [-] jagaerglad|3 years ago|reply
[+] [-] ricardobeat|3 years ago|reply
[+] [-] taberiand|3 years ago|reply
[+] [-] andrijeski|3 years ago|reply
[+] [-] PaulMest|3 years ago|reply
[+] [-] bryced|3 years ago|reply
[+] [-] nicbou|3 years ago|reply
[+] [-] bryced|3 years ago|reply
`aimg edit assets/girl_with_a_pearl_earring.jpg "make it pop" --prompt-strength 40 --gif`
https://user-images.githubusercontent.com/1217531/213912442-...
[+] [-] prox|3 years ago|reply
https://www.youtube.com/watch?v=GOwi3x92teo
;)
[+] [-] awestroke|3 years ago|reply
"Add lens flare"
"Increase saturation"
"Add sparkles and gleam"
[+] [-] TekMol|3 years ago|reply
[+] [-] tamrix|3 years ago|reply
[+] [-] perfrom1|3 years ago|reply
>> aimg edit input.jpg "make it pop" --prompt-strength 25
[+] [-] Gravyness|3 years ago|reply
Edit: Just noticed it is the same thing but wrapped, never mind. Pretty cool project!
[+] [-] bryced|3 years ago|reply
[+] [-] iuiz|3 years ago|reply
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. tensorflow 2.9.2 requires protobuf<3.20,>=3.9.2, but you have protobuf 3.20.3 which is incompatible. tensorboard 2.9.1 requires protobuf<3.20,>=3.9.2, but you have protobuf 3.20.3 which is incompatible.
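For anyone puzzled by that message: the installed protobuf 3.20.3 falls outside the `<3.20,>=3.9.2` range that tensorflow and tensorboard declare. A minimal sketch of the comparison, assuming plain dotted numeric versions (this is a simplification, not pip's real resolver or full PEP 440 handling; `parse` and `satisfies` are hypothetical helper names, and 3.19.6 is just an illustrative in-range version):

```python
def parse(version):
    """Turn a dotted numeric version like '3.20.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies(installed, lower, upper):
    """True if lower <= installed < upper, comparing component by component."""
    return parse(lower) <= parse(installed) < parse(upper)


# The case from the error message: 3.20.3 is not below 3.20.
print(satisfies("3.20.3", "3.9.2", "3.20"))  # False
# A 3.19.x release sits inside the declared range.
print(satisfies("3.19.6", "3.9.2", "3.20"))  # True
```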
[+] [-] cbeach|3 years ago|reply
---
First time I've used "colab" - looks great. Out of interest, who pays for the compute used by this?
Is it freely offered by Google? Or is it charged to my Google API account when I use it? Or your account? It wasn't clear in the UI.
[+] [-] Tenoke|3 years ago|reply
[+] [-] Damirakyan|3 years ago|reply
[+] [-] Daub|3 years ago|reply
‘Decrease high-frequency features of background.’
‘Increase intra-contrast of middle ground to foreground.’
‘Increase global saturation contrast.’
‘Increase hue spread of greens.’
[+] [-] CyanBird|3 years ago|reply
Hopefully in a couple of years when things have matured more there will be more models capable of handling said requests
The most precise models are actually anime models because the users have got high standards for telling the machine what they expect of it and the databases are quite well annotated (booru tags)
[+] [-] GordonS|3 years ago|reply
[+] [-] b33j0r|3 years ago|reply
I want to be productive in this comment… but the crypto/CUDA nexus of GPU work is simply not rational. Why are we still here?
You want to work in this field? Step 1. Buy an NVIDIA gpu. Step 2. CUDA. Step 3. Haha good luck, not available to purchase.
This situation is so crazy. My crappiest computer is way better at AI, just because I did an intel/nvidia build.
I don’t hate NVIDIA for innovating. The stagnation and risk of monopoly setting us back for unnecessary generations makes me a bit miffed.
So. To attempt to be productive here, what am I not seeing?
[+] [-] ColonelPhantom|3 years ago|reply
The setup.py file seems to indicate that PyTorch is used, which I think can also run on AMD GPUs, provided you are on Linux.
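A quick way to see which backend a given PyTorch install actually exposes is sketched below. It assumes a reasonably recent PyTorch; `pick_device` and `describe_backend` are hypothetical helper names, not part of imaginAIry. ROCm builds of PyTorch present AMD GPUs through the same `torch.cuda` API, and `torch.version.hip` is set on those builds.

```python
def pick_device():
    """Return the best available PyTorch device name, falling back to CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch installed at all
    if torch.cuda.is_available():
        # True for NVIDIA CUDA builds and for AMD ROCm builds alike
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # Apple Silicon backend, newer builds only
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def describe_backend():
    """Say which GPU toolkit this PyTorch build was compiled against."""
    try:
        import torch
    except ImportError:
        return "PyTorch not installed"
    if getattr(torch.version, "hip", None):
        return f"ROCm/HIP build ({torch.version.hip})"
    if torch.version.cuda:
        return f"CUDA build ({torch.version.cuda})"
    return "build without CUDA or ROCm support"


if __name__ == "__main__":
    print(pick_device(), "-", describe_backend())
```

Whether a particular AMD card is supported by ROCm is a separate question that this check does not answer.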
[+] [-] smallerfish|3 years ago|reply
[+] [-] bryced|3 years ago|reply
[+] [-] kadoban|3 years ago|reply
[+] [-] cbeach|3 years ago|reply
This stuff is fascinating, and @bryced's imaginAIry project made it accessible to people like me who never had any formal training in machine learning.
[+] [-] singhrac|3 years ago|reply
Note the level of investment that NVIDIA's software team has here: they have a separate WSL-Ubuntu installation method that takes care not to overwrite Windows drivers but installs the CUDA toolkit anyway. I expected this to be a niche, brittle process, but it was very well supported.
[+] [-] CptanPanic|3 years ago|reply
[+] [-] bobmaxup|3 years ago|reply
[+] [-] yieldcrv|3 years ago|reply
I’ll keep you posted how well this works for dating apps
[+] [-] sschueller|3 years ago|reply
This isn't running on a website that is open to everyone or can be easily run by a novice.
Anyone capable of installing and running this is also able to read code and remove such a feature. There is no reason to hide this nor to not document it.
The amount of nudity you get is also highly dependent on which model you use.
[+] [-] social_quotient|3 years ago|reply
I’ve been looking for an easier way to replace the text in these AI-generated images. I found that Facebook is working on it with their TextStyleBrush - https://ai.facebook.com/blog/ai-can-now-emulate-text-style-i... but I have been unable to find anything released or usable yet. Is anyone aware of other efforts?
[+] [-] johndough|3 years ago|reply
[+] [-] TeMPOraL|3 years ago|reply
I'm on mobile so I can't try this myself right now. Can it add a Klingon bird of prey flying under the Golden Gate Bridge, and will the prompt "add a Klingon bird of prey flying under the Golden Gate Bridge" be enough?
[+] [-] wongarsu|3 years ago|reply
1: https://i.imgur.com/gDj2Kn4.png
[+] [-] anigbrowl|3 years ago|reply
/Sighs in Intel iMac
Has anyone managed to get an eGPU running under macOS? I guess I could use Colab but I like the feeling of running things locally.
[+] [-] perfopt|3 years ago|reply
[+] [-] bryced|3 years ago|reply
[+] [-] WesolyKubeczek|3 years ago|reply
[+] [-] c7b|3 years ago|reply
PS: I'm not trying to make a comic book, I'm trying to help a friend solve a far more basic business problem (trying to get clients to pay their bills on time).
[+] [-] bryced|3 years ago|reply
[+] [-] zepearl|3 years ago|reply
Works perfectly for me (Gentoo Linux + NVIDIA RTX 3060 with 12 GiB VRAM). I installed your package last week and it just worked; I've been experimenting with it since then and telling parents & colleagues about it.
The results (especially for people's faces) can vary a lot between ok/scary/great (I still have to understand how the options/parameters work). All in all it's a great package that's easy to handle & use.
In general, if I don't specify an output resolution higher than the default (512x386 or something similar), e.g. with "-w 1024 -h 768", faces get garbled/deformed like something straight out of a Stephen King novel => is this expected?
Cheers :)
[+] [-] karim79|3 years ago|reply
Our "cluster" is running on an ASUS ROG 2080 Ti external GPU in the Razer Core X housing, and that actually works just fine in my flat.
We went through several iterations of how this could work at scale. The initial premise was basically the google homepage, but for images.
That's when we realised that scaling this to serve the planet was going to be a hell of a lot more work. But not really: conceptualising the concurrent compute requirements, as well as the ever-changing landscape and pace of innovation in this field, is absolutely necessary.
The quick fix is to use a message queue (we're using Bull) and make everything asynchronous.
So essentially, we solved the scaling problem using just one GPU. You'll get your requested image, but it goes into a queue and we'll let you know when it's done. With that compute model in place, we can just add more GPUs, and tickets will take less time to serve if the scaling is engineered properly.
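The queue-and-ticket idea can be sketched in a few lines. This is not the Bull setup described above (Bull is a Node.js library); it is a stand-in using Python's standard `queue` and `threading` modules, with `fake_generate`, `submit`, `worker`, and `run_demo` as hypothetical names and a short sleep standing in for the GPU call.

```python
import queue
import threading
import time
import uuid

jobs = queue.Queue()          # pending (ticket, prompt) pairs
results = {}                  # ticket -> finished image name
results_lock = threading.Lock()


def fake_generate(prompt):
    """Placeholder for the slow GPU call; returns a made-up file name."""
    time.sleep(0.01)
    return f"{prompt}.png"


def submit(prompt):
    """Enqueue a request and hand back a ticket id immediately."""
    ticket = uuid.uuid4().hex
    jobs.put((ticket, prompt))
    return ticket


def worker():
    """One worker per GPU: process jobs in order until a None sentinel arrives."""
    while True:
        item = jobs.get()
        try:
            if item is None:
                return
            ticket, prompt = item
            image = fake_generate(prompt)
            with results_lock:
                results[ticket] = image
        finally:
            jobs.task_done()


def run_demo(prompts, num_workers=1):
    """Submit all prompts, wait for the queue to drain, return results in order."""
    threads = [threading.Thread(target=worker, daemon=True) for _ in range(num_workers)]
    for t in threads:
        t.start()
    tickets = [submit(p) for p in prompts]
    jobs.join()                      # block until every submitted job is done
    for _ in threads:
        jobs.put(None)               # tell each worker to stop
    for t in threads:
        t.join()
    with results_lock:
        return [results[t] for t in tickets]


if __name__ == "__main__":
    print(run_demo(["a cat", "a dog"]))
```

Raising `num_workers` corresponds to adding GPUs: the submitting side does not change, tickets just clear faster.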
I'm no expert on GPU/machine learning/GAN stuff, but Stable Diffusion actually prompted me to imagine how to build and scale such a service, and I did so. It is not live yet; the name reserved for it is dreamcreator dot ai, and I can't say when it will go live. Hopefully this year.
[+] [-] dandigangi|3 years ago|reply
[+] [-] dandigangi|3 years ago|reply
[+] [-] sebastiennight|3 years ago|reply
I was thinking of deploying something like that in one of our app features, but I'm scared of making our users look like vampires :-)
Is it your experience that the model struggles more with faces than with other changes?
[+] [-] bryced|3 years ago|reply