
How we built the Waifu Vending Machine

257 points| gwern | 6 years ago |waifulabs.com | reply

154 comments

[+] weeb_throwaway|6 years ago|reply
I am trying out the app at https://waifulabs.com/ and the art style is kind of one-note. Most of the expressions are the same and the face shape skews towards loli.

I am more into "disgusted anime girl that looks at you like you're trash" type and I couldn't find a waifu (even with their refinement steps).

Really impressed that this is even possible though!

[+] b_tterc_p|6 years ago|reply
I've played with it a few times. It seems like they have you choose features in the space in the order of base -> palette -> art style (loosely) -> pose, but have locked some emotion controlling vector to be happy. Probably a reasonable step for their audience.
[+] Thriptic|6 years ago|reply
Sounds like there needs to be more tsundere in the data set.
[+] gwern|6 years ago|reply
[+] hanniabu|6 years ago|reply
Looks like you're the creator, so as some feedback: it'd be nice to see a simple outline of the steps with "inactive" styling, the step you're currently on in bold (maybe with an arrow in front of it?), and as you go through it each previous step gets a checkmark next to it (and possibly goes back to "inactive" styling).

The steps could be something like this: (1) Select character, (2) Select colors, (3) Select outfit, (4) Select pose

The reason I suggest this is I was a little confused on what the options were, what my future options would be, and how many options are left.

[+] userbinator|6 years ago|reply
Random semi-useful idea: use an SSH public key as input, giving a very memorable image for verification.

I could also see something like this having applications in https://en.wikipedia.org/wiki/Identicon generation.
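
A minimal sketch of how that could work, assuming the generator is a GAN-style model that takes a Gaussian latent vector. Nothing in this thread confirms the site's model works that way; `LATENT_DIM` and `key_to_latent` are made-up names, and the generator itself is not shown. The public key is hashed, and the hash seeds a PRNG that draws the vector, so the same key always maps to the same image input:

```python
import hashlib
import random

LATENT_DIM = 512  # assumed size; the real model's latent dimension is unknown


def key_to_latent(public_key: str, dim: int = LATENT_DIM) -> list[float]:
    """Deterministically map a public key string to a Gaussian latent vector."""
    # Hash the key text so that any small change in the key changes the seed.
    digest = hashlib.sha256(public_key.strip().encode("utf-8")).digest()
    # Seed a private PRNG with the full 256-bit digest.
    rng = random.Random(int.from_bytes(digest, "big"))
    # Draw standard-normal samples, the usual input distribution for a GAN.
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


if __name__ == "__main__":
    z = key_to_latent("ssh-ed25519 AAAA-example-key user@host")  # placeholder key
    print(len(z), z[:3])
```

Whether two similar-looking outputs are easy to tell apart is a separate question that hashing alone does not answer.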

[+] thanatropism|6 years ago|reply
“Anime girls” look all the same to me.

There was a time where I didn’t listen to rock music; Pantera and Green Day would sound the same. Memorable is up to cultural fluency...

[+] fragmede|6 years ago|reply
that's exactly (in ascii, that is) the idea behind randomart, eg

    The key fingerprint is:
    SHA256:s6N0OwlTDKjDez98kZRwUGZbTYaQUArv+EYC6sigFwA ben@eshwil
    The key's randomart image is:
    +---[RSA 2048]----+
    |E   ..o=*o.+o    |
    |.   .oo+oo...    |
    |....  o=..       |
    | o+. o  =        |
    |o .oo ooS.       |
    |* ...+o oo       |
    |oo.. o+o+o       |
    | .   o+o+o       |
    |      .o..       |
    +----[SHA256]-----+
(from https://blog.benjojo.co.uk/post/ssh-randomart-how-does-it-wo... )
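
For anyone curious, here is a rough sketch of the "drunken bishop" walk that, as far as I understand it, produces these images. It is not checked byte for byte against `ssh-keygen` output, and the borders are simplified (no key type or hash name in them). Each fingerprint byte is read as four 2-bit moves, lowest bits first, on a 17x9 grid, and the visit count of each cell picks the character:

```python
import hashlib

SYMBOLS = " .o+=*BOX@%&#/^SE"  # index = visit count; 'S' and 'E' mark start and end
WIDTH, HEIGHT = 17, 9


def randomart(digest: bytes) -> str:
    """Render a fingerprint digest as a randomart-style grid (simplified borders)."""
    field = [[0] * WIDTH for _ in range(HEIGHT)]
    x, y = WIDTH // 2, HEIGHT // 2
    start_x, start_y = x, y
    for byte in digest:
        for _ in range(4):
            # Bit 0 chooses left/right, bit 1 chooses up/down (a diagonal move).
            x += 1 if byte & 0x1 else -1
            y += 1 if byte & 0x2 else -1
            # Walls stop the bishop: clamp to the grid.
            x = max(0, min(WIDTH - 1, x))
            y = max(0, min(HEIGHT - 1, y))
            # Cap the counter so it never reaches the 'S' or 'E' symbols.
            if field[y][x] < len(SYMBOLS) - 3:
                field[y][x] += 1
            byte >>= 2
    field[start_y][start_x] = len(SYMBOLS) - 2  # 'S'
    field[y][x] = len(SYMBOLS) - 1              # 'E' (wins if the walk ends at the start)
    border = "+" + "-" * WIDTH + "+"
    rows = ["|" + "".join(SYMBOLS[n] for n in row) + "|" for row in field]
    return "\n".join([border, *rows, border])


if __name__ == "__main__":
    print(randomart(hashlib.sha256(b"example key material").digest()))
```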
[+] derefr|6 years ago|reply
The one thing I've always wanted from identicons is for the system to use a mixed generator composed of at least 20 different algorithms, that all produce results that are completely different-in-kind (like abstract shapes vs. cartoon monsters vs. anime girls vs. swatched spirographs vs. pixel cities vs. ...), such that it's unlikely that any two anonymous users in a smaller conversation will end up with identicons that you have to inspect to differentiate. (E.g. hopefully, if there are two users with identicons of the same style, then they would have very different base colors, which is impossible if all the identicons are of the same style.)
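
A minimal sketch of the selection part of that idea, assuming the renderers already exist somewhere. The style names below are just placeholders taken from the examples above (five instead of twenty), and `identicon_plan` is a made-up helper. Separate bytes of the hash pick the style, the base hue, and a seed for whatever generator draws the picture:

```python
import hashlib

# Placeholder style names; the renderers themselves are not implemented here.
STYLES = [
    "abstract_shapes",
    "cartoon_monster",
    "anime_girl",
    "spirograph",
    "pixel_city",
]


def identicon_plan(user_id: str) -> dict:
    """Pick a style, base hue, and generator seed from a hash of the user id."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use separate bytes for each choice so style and colour vary independently.
    style = STYLES[digest[0] % len(STYLES)]
    hue = int.from_bytes(digest[1:3], "big") % 360
    seed = int.from_bytes(digest[3:11], "big")
    return {"style": style, "hue": hue, "seed": seed}


if __name__ == "__main__":
    for name in ("alice", "bob", "carol"):
        print(name, identicon_plan(name))
```

With N styles, two random users still share a style about 1/N of the time, which is where the independent hue is meant to help.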
[+] mdorazio|6 years ago|reply
I'm actually really surprised no one did this sooner. I also wish they had posted revenue figures for the two days.

To other people doing this in the future: bring (or order) a fat battery pack with an AC outlet for $100 so you don't have to keep swapping laptops and can use a mobile hotspot all day.

[+] chendragon|6 years ago|reply
It looks like they did have these. In one of their pictures there was a stack of Anker PowerHouse power banks/battery packs.
[+] jdnenej|6 years ago|reply
If you can find a car charger for your laptop, you will get much longer battery life, since you skip the DC-to-AC-to-DC conversion.
[+] jpindar|6 years ago|reply
And never count on someone else's network working properly.
[+] b_tterc_p|6 years ago|reply
I would love to replicate this. It looks like the dataset is open source. https://www.gwern.net/Danbooru2018

I don't have a sense for hardware requirements though. Does anyone have a good idea of how much time and money it would take to train such a model?

[+] kevinfrans|6 years ago|reply
One of the creators here, the team at Sizigi and I are glad to answer any questions!
[+] ve55|6 years ago|reply
Great work! Do you have any thoughts to share on the future of this area? Anything specific with this project, future projects you might work on, or just the idea of profiting from AI-generated art to begin with?
[+] meruru|6 years ago|reply
Thanks for making this, it's awesome.

How much of a plagiarist am I if I make my characters using this and pretend they are original?

[+] meruru|6 years ago|reply
All that drawing and character design practice for nothing I guess.
[+] userbinator|6 years ago|reply
Don't forget that this would not have been possible if it weren't for the human creation of art. Machines cannot really think, but they can leverage and amplify.
[+] jchw|6 years ago|reply
These things are pretty damn impressive, but I would guess like self driving cars, we’re pretty far away from them displacing humans at the same task. It does seem like technology in this vein could be used to help the creative process, though on that note it’s only as good as its data set, which is of course something a human has to handle for now.

Even if robots replace human illustration in short order, it will probably never stop being a fun hobby, and I imagine neural networks were bound to be at least involved in the process at some point. People still draw on paper even though it’s hard to argue against the benefits of modern digital drawing.

[+] imtringued|6 years ago|reply
From what I've seen the end result contains a lot of artifacts. If I were in the market for a poster I'd still want a real artist to use the generated version as a rough sketch and redraw it properly.
[+] sb057|6 years ago|reply
PSA: I get a horrific 5 GB+ memory leak immediately upon opening the tool's page.
[+] FrozenVoid|6 years ago|reply
Neat, but it lacks variety (all the same pose and template) and has too few selection steps. The resolution of the final image is too low. I think the same thing could be done with human images if you made it 7-10 selection steps (to pinpoint finer features).
[+] mc32|6 years ago|reply
Lobster _is_ the neue Comic Sans... although maybe it’s just carrying over the tongue in cheek cheese further.
[+] tofof|6 years ago|reply
The dataset this is built from (https://www.gwern.net/Danbooru2018) is, simply put, copyright-infringing on a gross scale. The vast majority of the images uploaded to 'boorus' completely lack a compatible license or the artist's express consent. The redistribution via torrent of 2.5 TB of some 3 million images only compounds this problem. None of this is ameliorated by the $20 'generosity' of the dataset creator.

As a result, every single artist whose work was included in that dataset has a clear, meaningful claim that each and every 'waifu' sold ($20, if customized, or $5 if random) by Sizigi Studios is an infringing derivative work. Coupled with at least one of the project authors' ready admissions -- in this very comment section -- of scraping image sites himself, I would say that this team is playing with fire. Even in the case that an algorithm's output is somehow found to be 'creative' rather than mechanistic, AND this specific application is found to be in all cases substantially transformative, there's STILL the original massive 2.5 TB of copyright infringement up front to deal with.

All an enterprising lawyer would need to begin is to search the BigQuery metadata for the 'artist' and 'copyright' tags on these images. Note of course that the 'copyright' tag is widely misused on boorus and similar image repositories to refer to the inspiring franchise; 'trademark' would be a much more accurate descriptor.

EDIT: I do not mean to suggest that litigation from the use of the dataset in this ML (as opposed to the original, clearly infringing, download & redistribution) would in any way be an easy, one-sided case --- only that this scenario would represent nearly the worst possible test case imaginable for determining the future legality of ML, short of directly antagonizing the RIAA or MPAA.

[+] thekevan|6 years ago|reply
I don't care about the how, why the hell would you?
[+] rootsudo|6 years ago|reply
This is par for the course for anime subculture today.
[+] meruru|6 years ago|reply
Some things just have to be done.
[+] 9nGQluzmnq3M|6 years ago|reply
The tech is interesting, but the concept and naming are pretty creepy. Waifu, from "wife", is anime slang for female cartoon characters that people get romantically attracted to.

https://www.dictionary.com/e/fictional-characters/waifu/

[+] _28jh|6 years ago|reply
Everyone knows that. It's also literally the point of the machine.
[+] flor1s|6 years ago|reply
I guess it's creepy in a similar way to people shooting other people virtually (in shooter video games), though making a waifu "vending machine" adds another angle of objectification to it.
[+] mixedCase|6 years ago|reply
The usage nowadays is largely tongue in cheek, although it isn't h/a/rd to find places where they take it at face value.
[+] seanmcdirmid|6 years ago|reply
It isn’t really offensive, it’s just 4chanish, which will appeal to certain people and not others.
[+] Fomite|6 years ago|reply
Yeah...technically interesting, creepy AF.
[+] garbre|6 years ago|reply
Frankly, I refuse to believe so many people on HN have the kind of scruples about or lack of exposure to this part of AmerOtaku culture. As such, I believe many of the comments are simply 2nd-degree trolling.

Also, the post title is misspelled. It's "building", not "builing".

[+] hmahncke|6 years ago|reply
Walk up to the station, and you'll be greeted with a quick array of girls. After each step, the booth narrows your choices -- eventually leading to a final screen, where you can "adopt" the girl on the spot.

Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should.

[+] meruru|6 years ago|reply
The world isn't going to end because of war or famine or anything like that, but because humans will be too infatuated with their artificial partners to bother reproducing.
[+] karanlyons|6 years ago|reply
A lot of these look like children. Nowhere in the article is it mentioned that a lot of these look like children. No one in the comments has brought up that a lot of these look like children. A lot of these look like children.