
Why Does This Horrifying Woman Keep Appearing in AI-Generated Images?

122 points | pseudolus | 3 years ago | vice.com | reply

112 comments

[+] scambier|3 years ago|reply
The woman appeared once, and they kept combining her image with different prompts, so she stayed in the subsequent images.

https://twitter.com/supercomposite/status/156716492990339481... https://twitter.com/sheslostheplot/status/156737091948789350...

[+] vlunkr|3 years ago|reply
> I was ripping Loab apart, and putting her back together. She is an emergent island in the latent space that we don't know how to locate with text queries. But for the AI, Loab was an equally strong point of convergence as a verbal concept. And really, it was usually stronger!

The nonsense words of someone who really wants to be the creator of the next big creepypasta.

[+] philippejara|3 years ago|reply
Thanks for the links. It boggles the mind that an article about the twitter thread and its images would neither link to the thread nor show more than a couple of the images. All the references in the Vice article point only to other Vice articles by the same writer...

The URL should probably link to the twitter thread in question; the article adds pretty much nothing and drops tens of images plus commentary by the creator.

[+] ddalex|3 years ago|reply
The interesting bit is not that it remains there, but that it seems to generate dark images (gore, violence) regardless of the input.
[+] godelski|3 years ago|reply
It is also important to note that they are specifically using features distant from one another (they are manipulating the latent directly, not using textual prompts to reproduce the face). It really does feel engineered to be creepy. Looking closely at the images, a big reason for the creepiness seems to be the combination of strong masculine and strong feminine features. Both are present at once, and since our brains are very good at detecting human faces, we instantly sense that something is off. Then there are the shadows on the face: the lighting is all wonky, and shadows are cast at weird angles, either darker around the eyes than they should be or the opposite. The same goes for the mouth and chin area. Stable Diffusion and Midjourney aren't very good at human faces (specifically eyes, mouths, and skin textures), so that adds an extra creepy factor, especially combined with the negative weighting.

But you are spot on in noting that Vice is engaging in clickbait and misinformation. The tweets are purposefully being creepy because that is the art style of cryptids. But as your second link clearly shows, people aren't going to randomly find images like this (and let's be real, I wouldn't recognize this as the same person unless you told me they were). Stop engaging in melodrama Vice. You can't do that if you're claiming to report news.

[+] schaefer|3 years ago|reply
Is this just a new AI ghost story, or is there actually something scientifically repeatable here? Can we please get real nerdy about this?

The article broadly describes a series of prompts, but do we have enough information to figure out which AI engine was used, reverse engineer some likely prompts, and try to produce similar results (not exactly the same, as that may not be possible with AI prompts)?

Is it even possible to ask an AI image generator to "produce the opposite" of a prompt?

Is this just an RTFM moment (for me)? Or is "producing the opposite" a misunderstanding of how weights work? I only have experience with Midjourney, but my understanding is that with Midjourney you can weight various prompts as ratios. For example, I can build up a prompt to generate an image that is 2 parts "autumn landscape" and 3 parts "birthday cake". But with ratios, isn't it true that negative weights just discard that prompt? They don't produce "the opposite", right?

[+] schaefer|3 years ago|reply
From Supercomposite's original twitter post, we can see they claim to start off with a negatively weighted prompt of "Brando::-1".

To me, this looks like Midjourney prompts. At the very least, that is valid Midjourney syntax, and it is using weights, but probably not to the effect the author portrays.

For example, if I prompt "/imagine autumn landscape::2 birthday cake::3", I get this image[1].

But now if I tweak the prompt to "/imagine autumn landscape::-0.5 birthday cake::3", I get this image [2].

Critically, there is no trace of "opposite of an autumn landscape" in this image. It's all birthday cake... 100%.

Indeed, this lines up with the Midjourney documentation[3]: a negative weight will try to remove the thing named in the prompt.

BUT: what happens when we have only one prompt, and we use a weight to negate it in Midjourney, i.e. "/imagine Brando::-1"? At least for me, I get this error:

"Invalid parameter

The sum of all of the prompt weights must be positive"

So I'm inclined to conclude that Supercomposite's post is more an act of creative storytelling than an accurate portrayal of their interactions with Midjourney.

But I am still left wondering if there is a true "opposite of" operator in any AI image generator.

[1]: https://cdn.discordapp.com/attachments/1006400576067739749/1...

[2]: https://cdn.discordapp.com/attachments/1006400576067739749/1...

[3]: https://midjourney.gitbook.io/docs/imagine-parameters#prompt...
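The behavior above can be sketched as simple weighted blending of prompt embeddings. To be clear, this is an assumption about the mechanism, not Midjourney's actual (closed-source) implementation: the idea is that weights are normalized by their sum, a negative weight subtracts that prompt's direction rather than producing its "opposite", and a non-positive weight sum has no valid normalization, which matches the error message above.

```python
import numpy as np

# Hypothetical toy embeddings for two prompts; a real system would use a
# learned text encoder, not hand-picked 3-vectors.
landscape = np.array([0.9, 0.1, 0.0])
cake = np.array([0.0, 0.2, 0.9])

def combine(prompts_weights):
    """Blend prompt embeddings by normalized weights, as '::' ratios are
    commonly believed to work. A negative weight pushes that prompt's
    direction out of the mix; it does not invoke a semantic opposite."""
    total = sum(w for _, w in prompts_weights)
    if total <= 0:
        raise ValueError("The sum of all of the prompt weights must be positive")
    return sum(w * e for e, w in prompts_weights) / total

# "autumn landscape::2 birthday cake::3"
mixed = combine([(landscape, 2.0), (cake, 3.0)])

# "autumn landscape::-0.5 birthday cake::3" -- mostly cake, landscape pushed away
mostly_cake = combine([(landscape, -0.5), (cake, 3.0)])

# "Brando::-1" alone fails, consistent with the error Midjourney reports
try:
    combine([(landscape, -1.0)])
except ValueError as e:
    print(e)
```

Under this model, a lone negative prompt is rejected for the same reason as in the experiment above: there is nothing positive left to normalize against.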

[+] cyanydeez|3 years ago|reply
This is all ghost stories. The internet is a schizophrenic, and it is deeply into examining itself.
[+] godelski|3 years ago|reply
> Can we please get real nerdy about this?

So as someone who works in generative modeling I'll give my best guess as to what is done and what is happening. It is a guess because they don't say everything, but there are some hints. Scambier linked these two twitter threads[0][1] which can give us some insight.

> I'll explain negative prompt weights, in case you don't know. With these, instead of creating an image of the text prompt, the AI tries to make the image look as different from the prompt as possible.

What's important here is that the machine doesn't actually know what the opposite is. In fact, I would argue we don't either. What you can do is use Lp distances from a latent representation. This is where things start to make sense. If, at a large distance in that space, we still find faces, it is also unsurprising that we find many different facial characteristics mixed together. These first images look like a strong mixture of masculine and feminine features. That is not something we typically see in reality, and combined with our brains' hyperactive face recognition, it lands us in the uncanny valley.

Next, I don't know if this was done on purpose or not, but there are very clear issues with scene lighting. I can totally believe it wasn't on purpose, because this is something generators are already bad at. So we get shadows cast along the face in unnatural ways, upping the creepiness factor.

Now we need to look at the important features for recognizing faces: eyes, mouth, and nose. You may have noticed that text-to-image generators are typically really bad at these. Generators are also typically bad at facial symmetry (which is why we're trying to get transformers in, though this still isn't working to the degree we would like). In fact, I find it more interesting that these faces are as coherent as they are, given the explanation of how the latent representation was created.

So I think we have good explanations as to why this would turn creepy very fast. Especially given the hype and that the creator is leaning into it. But these are my best guesses. I can't really know without seeing what is done.

But honestly, I am super interested and would like to see these latent representations and play around with them. This could be a good thing to investigate if you are trying to determine how smooth the latent manifold is, which is extremely important if we're going to make deeper content contributions and rely less on our prompt engineering. Maybe I'll have to play with some negative prompts (if I can find the time lol).

[0] https://twitter.com/supercomposite/status/156716228808747008...

[1] https://twitter.com/sheslostheplot/status/156737091948789350...
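One way to make "as different from the prompt as possible" concrete is the guidance-scale arithmetic used in classifier-free guidance for diffusion models: a negative scale steers the sampler directly away from the prompt's direction in embedding space, which is an opposite direction, not a semantic opposite. A toy numpy sketch (the vectors here are random stand-ins, not real model embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a prompt embedding and an unconditional embedding.
text_emb = rng.normal(size=8)
uncond_emb = rng.normal(size=8)

def guided_direction(uncond, cond, scale):
    """Classifier-free-guidance-style update: a positive scale steers
    generation toward the prompt, a negative scale steers it away.
    'Away' is purely geometric; the model has no concept of opposites."""
    return uncond + scale * (cond - uncond)

toward = guided_direction(uncond_emb, text_emb, 7.5)
away = guided_direction(uncond_emb, text_emb, -7.5)

# The two updates are mirror images around the unconditional point.
assert np.allclose(toward + away, 2 * uncond_emb)
```

This would explain why "negative Brando" need not look anti-Brando in any meaningful sense; it is just whatever sits far away along that line in latent space.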

[+] bee_rider|3 years ago|reply
Just to be clear:

> “I can't confirm or deny which model it is for various reasons unfortunately! But I can confirm Loab exists in multiple image-generation AI models,” Supercomposite told Motherboard.

this is almost certainly a creepypasta, which Vice is for some reason reporting as if it were real.

He probably found a creepy picture and then started using it as a seed or something like that.

[+] throwanem|3 years ago|reply
I'm surprised it's taken this long for someone to produce the first creepypasta of the image-generating AI era, if "era" is the right word. I'm not all that surprised Vice reported on it so credulously.
[+] NoToP|3 years ago|reply
I spent longer than I care to admit trying to figure out if "Loab" was some word play that Vice had not picked up on.
[+] plainnoodles|3 years ago|reply
So the "why" is just "idk": it seems the person generating them found her by accident once, and then kept trying to find her again. I'm a fan of a good creepypasta, and honestly a memetic SCP that "lives" in the models of AI sounds pretty tight to me. But this ain't it.

I'll admit I was expecting the "why" (aside from SCP...) to be something like "turns out DALL-E gets live humans and corpses mixed up and in macabre images the live humans get a bit more corpse-y due to this".

[+] bm3719|3 years ago|reply
If you like this, you'll want to see this AI-generated music video if you haven't already:

https://www.teddit.net/r/nextfuckinglevel/comments/x6d3c3/ai...

[+] krageon|3 years ago|reply
I clicked this thinking "how bad could it be?" and closed it as fast as I could. This is nightmare fuel, probably don't click it.
[+] kcplate|3 years ago|reply
Well that shit is gonna haunt me for a while…
[+] kthejoker2|3 years ago|reply
Thanks I hated it, but also it is lowkey amazing.

The fluidity of the generation seems way more "natural" than most CGI.

Definitely the way forward for every horror video game.

[+] xyzzy4747|3 years ago|reply
It’s almost like a cancer got into it and metastasized into all of the frames.
[+] ddalex|3 years ago|reply
is it the same artist / same woman ?
[+] somehnacct3757|3 years ago|reply
I asked DALL-E to draw a self portrait of itself and its family one time, and the results still make my hairs stand up.

DALL-E itself was Clippy-like, blue, and bean shaped with eyes and mouth more expressive than a Zuckerberg VR avatar.

As for the family? There was none. DALL-E rendered itself on a pure black background.

[+] diegoperini|3 years ago|reply
Can you please share the image if that's okay?
[+] kingkawn|3 years ago|reply
The algorithm may be referencing Craiyon, the dall-e lite
[+] adamsmith143|3 years ago|reply
> I asked DALL-E to draw a self portrait of itself and its family one time, and the results still make my hairs stand up.

DALL-E and other image generation models are not conscious, and they aren't even intelligent. Stop anthropomorphizing them; it's not helpful. We will likely face this problem for real in the coming decades, but there's no sense doing so with current models.

[+] NoToP|3 years ago|reply
I asked craiyon for a self portrait. In two of the outputs it was showing a picture of someone holding someone else's portrait in front of their face.
[+] userbinator|3 years ago|reply
They are simply too big, and their results too unpredictable, to reliably prevent harmful results.

That seems like a surprisingly deep philosophical statement on society in general. And IMHO, along the same lines, "reliably prevent harmful results" is not something that should be pursued to exhaustion.

[+] nix0n|3 years ago|reply
The creators of the mainstream AI art generators went to great pains to prevent these tools from being used to generate pornography.

If you want to talk about statements on society in general, I think the fact that nudity is seen as more harmful than gore is a bigger problem than an AI generating either one.

[+] usednet|3 years ago|reply
Loab looks remarkably similar to Billy the Puppet from Saw, which could explain its relation to gore and violence.
[+] ddalex|3 years ago|reply
Excellent identification! It probably happens with every image showing exacerbated cheek rosacea; it triggers connections to blood too.
[+] 4b11b4|3 years ago|reply
I was also thinking the girl from "The Ring".
[+] Octoth0rpe|3 years ago|reply
Maybe DALL-E just.. you know. Has a type.
[+] that_house|3 years ago|reply
Looks a bit like Toni Collette. She’s in a bunch of horror movies?
[+] nerdponx|3 years ago|reply
Were there screenshots of horror movies in the training data? That could explain a lot.
[+] dougmwne|3 years ago|reply
Looks like we discovered some kind of dark nexus in the collective subconscious. Img to text models could probably put words to this which would help us loop back and understand exactly what collective concept we were looking at.

Someone here on HN has said that they use these AI models to explore the latent space of the human imagination. I would say that sounds exactly correct.

[+] adamsmith143|3 years ago|reply
> Someone here on HN has said that they use these AI models to explore the latent space of the human imagination.

Bizarre. It's like saying that you are using the newly invented Bicycle to explore the concept of General Relativity. Woefully insufficient.

[+] fallingfrog|3 years ago|reply
Good lord, I didn't realize that an AI generated image could be so shockingly frightening. To be honest the fact that DALL-E can do the things it does is very worrisome for me. I think we shouldn't go down this technological path.
[+] blastonico|3 years ago|reply
Horrifying woman? I thought it was an Ozzy Osbourne picture.

EDIT: I'm not kidding.

[+] zh3|3 years ago|reply
It would not amaze me if you're spot on. Lots of images scraped off the 'net, especially celeb websites, and quelle surprise.
[+] edent|3 years ago|reply
Crungus is another one which keeps coming up.

Perhaps there is some thing the machines can see, but our psychology protects us from?

[+] giantrobot|3 years ago|reply
See, that is a good creepypasta concept.
[+] bitwize|3 years ago|reply
Clearly we are looking at a Keter-class infohazard.
[+] sidewndr46|3 years ago|reply
Anyone else not find this image disturbing at all? Looks like she is suffering from a severe skin condition. That's about it.
[+] ddalex|3 years ago|reply
I think this just tapped into our collective subconsciousness...
[+] dmarchand90|3 years ago|reply
If you play around with these tools the overwhelming majority of images are deeply disturbing. Melted faces, fused bodies, hands with unusual numbers of fingers. I think this machine produces objects deep in the uncanny valley.