Hey folks, I’ve been working on using control-net to take in a video game level (input as a depth image) and output a beautiful illustration of that level. Play with it here: dimensionhopper.com or read the blog post about what it took to get it to work. Been a super fun project.
I wonder how far off we are from something like the battle school game from the book Ender's Game. That is, an immersive video game that uses player actions, choices, exploration etc. in order to generate not only new content, but entirely new game rules on the fly. It feels like we're getting closer and closer to Ender's holographic terminal with VR interfaces + AI content.
It’s not that hard to get these LLMs to generate new game rules; I’ve made some prototypes that use them to do exactly that. The hard part is getting them to generate rules that are actually fun to play and somewhat balanced. The fact is that it’s hard to know whether something is fun until you actually play it.
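For what it's worth, the balance half can be smoke-tested cheaply: simulate a pile of random playouts under a candidate rule set and look at the win rates. This is a toy sketch with a made-up rule schema, not the prototypes mentioned above:

```python
import random

def simulate_duel(rules_a, rules_b, rng):
    """One random playout: the two sides trade randomized blows until one drops."""
    hp_a, hp_b = rules_a["hp"], rules_b["hp"]
    while True:
        hp_b -= rng.randint(1, rules_a["attack"])
        if hp_b <= 0:
            return "a"
        hp_a -= rng.randint(1, rules_b["attack"])
        if hp_a <= 0:
            return "b"

def win_rate(rules_a, rules_b, trials=2000, seed=0):
    """Crude balance signal: fraction of playouts side A wins.

    Near 0.5 (modulo first-mover advantage) suggests the generated
    rules are roughly balanced; near 0 or 1 flags a degenerate matchup.
    """
    rng = random.Random(seed)
    wins = sum(simulate_duel(rules_a, rules_b, rng) == "a" for _ in range(trials))
    return wins / trials

# Hypothetical LLM-generated unit stats, just for illustration.
knight = {"hp": 20, "attack": 6}
archer = {"hp": 12, "attack": 9}
rate = win_rate(knight, archer)
```

It says nothing about fun, of course, but it can auto-reject the obviously broken rule sets before a human ever plays them.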
We are building something like that, but the images are sadly static per story page, and it has no psychological analysis. I've been interested in adding pupil dilation as a measure of cognitive load while you read, but I think customers would find it too creepy.
While I have folks attention, I want to try training a model to generate monster/creature walk animations. Anyone know of a dataset of walk cycle sprite sheets that I could massage and label to see if I can make that work?
I have an idea for you to try. Instead of training a model to produce subsequent animation frames (which is tough), take a model trained on pixel art sprites in general and pair it with a ControlNet, where the ControlNet input is either a pose model or a higher-res 3D model of a generic dummy character made in Blender. Then generate output frame by frame, keeping the prompt the same but advancing the ControlNet input one frame at a time.
To get it down to small pixeled 'sprite' scale, the right thing may be to actually output 'realistic' character animation frames this way, and then 'de-res' them via img2img into pixel art. The whole pipeline could be automated so that your only inputs are a single set of varied walking/posing/jumping ControlNet poses and the prompts describing the characters. Something like how this posing works: https://www.youtube.com/watch?v=CiG_v61cLxI
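A rough numpy sketch of that 'de-res' step as a naive baseline (the comment above suggests img2img for the real thing): block-average the rendered frame down to sprite resolution, then snap each pixel to a small fixed palette. The palette and sizes here are made up for illustration.

```python
import numpy as np

def downsample(frame, factor):
    """Average each factor x factor block of pixels into one output pixel."""
    h, w, c = frame.shape
    h2, w2 = h // factor, w // factor
    blocks = frame[:h2 * factor, :w2 * factor].reshape(h2, factor, w2, factor, c)
    return blocks.mean(axis=(1, 3))

def snap_to_palette(pixels, palette):
    """Replace each pixel with its nearest palette colour (Euclidean RGB)."""
    flat = pixels.reshape(-1, 1, 3)                 # (N, 1, 3)
    dists = ((flat - palette[None]) ** 2).sum(-1)   # (N, P) squared distances
    return palette[dists.argmin(-1)].reshape(pixels.shape)

# Illustrative 4-colour palette and a stand-in 64x64 'rendered frame'.
palette = np.array([[0, 0, 0], [255, 255, 255], [200, 40, 40], [40, 80, 200]], float)
frame = np.random.default_rng(0).uniform(0, 255, (64, 64, 3))
sprite = snap_to_palette(downsample(frame, 4), palette)   # 16x16 'pixel art'
```

Plain averaging plus nearest-colour quantization loses the deliberate clustering a pixel artist would do, which is presumably where the img2img pass earns its keep.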
There are a lot of sprites to work with. As I'm sure you're aware, there are artists known for making animations, like Pedro Medeiros; spriters-resource.com has material from thousands of games; you can buy pixel art assets from the Unity Asset Store, itch.io, and stock art sites; and you can use DevX Tools Pro to extract assets from hundreds of 2D pixel art Unity games. All told, there are maybe 100,000 to 1,000,000 examples of high-quality pixel art you can scrape. It's also possible that this material already exists in the major crawls and just needs to be labeled better.
A few people have tried training on sprite sheets and emitting them directly, and it did not work.
A few people have been working specifically on walking cycles, and it has a lot of limitations.
In my specific experience with other bespoke pixel art models, if you ask for a "knight," you're going to get a lot of the same looking knight. Fine-tuning will unlearn other concepts that are not represented in your dataset. LORAs have not been observed to work well for pixel art. You can try the Astropixel model, the highest quality in my opinion, for prototyping.
Part of this is that you're really observing how powerful ControlNet, T2I-Adapters and LORAs are, and you may have the expectation that something else you, a layperson, can do will be similarly powerful. Your thing is really cool. But is there some easy trick for animation without doing all this science? No. Those are really big scientific breakthroughs, and with all the attention on video (maybe 100-1,000 academic and industry teams working on it) there still hasn't been something super robust for animation that uses LDMs. The most coherent video is happening with NeRFs, and a layperson isn't going to make that coherent with pixel art. Your best bet is to wait. That said, I'm sure people are going to link to some great hand-processed LDM videos here, and maybe there's a pipeline with hand artwork that a layperson can do today that would work well.
I feel like Jump 'n Bump probably has a special place in the hearts of people who had access to the internet at a particular time. The internet was available, but online multiplayer gaming was still out of reach for many, and in that gap there was an amazing niche of fairly polished indie local multiplayer games. Imagine being told back then, while playing it, what would be possible a few decades later.
Amazing advances this year. Remember the guy who created the 2D platformer that's based on time, what was it called again? He spent around $100k+ just on the art, which I'm pretty sure was a huge expenditure for him. With this software he could have done it virtually for free, without much artistic talent at all.
The sad thing though is that $100k on art wasn't wasted. It allowed an artist to make art and a living. I'm down with the tech; I've even written a typing game that generates Minecraft stories with an LLM and imagines them with Stable Diffusion, for my daughter to learn typing with. But my mom is an artist, and she spent her career starving between sales and commissions. The fact that the models have ingested their work and careers and can now replace them is sad.
On the other hand, I've never been an artist myself, so I've never been able to make my game ideas come true until now. The creative side of the world is much more open to me, in a way my mechanical skills used to prevent.
Artists will continue to make art because it's a compulsion. But I wish we had a world that was less oriented toward rewarding meaningless toil and would at least allow our born artists, writers, and creators the chance to pursue their obsessions to our benefit. Especially as we move toward post-scarcity, I hope we can build a WPA-like entity, perhaps, in a crazy twist, funded by AI.
At the quality of the current output, I think players can still easily differentiate between AI-generated art and hand-created art. Maybe in future versions this will be less noticeable.
As a game dev, I think at this stage AI can be a helpful utility, but it doesn't replace a designer's touch for professional-looking games.
I was going to comment that the contrast between the beautiful illustrations and the red blood that violently explodes when you kill an opponent is pretty funny. But then I looked up the original Jump 'n Bump and it's just as gory, if not more so! Good ol' 90s games.
I'd love to see a write-up on your Hugging Face Diffusers experience: setting that up, what your dev cycle and stack look like, whether you're hosting that server on a GPU cloud instance, and so on. Those kinds of details are very interesting.
I'm on an EC2 instance. Most of the effort was CUDA/PyTorch/pip nonsense. Once stable-diffusion-webui was working, diffusers worked basically out of the box, which was really nice. (The trickiest bit was figuring out that I needed to use their tool to convert my safetensors file, and that the version of Python I was using wasn't working with it for some reason.) The stack is Flask + gunicorn, which was what ChatGPT recommended (lol). I had a websockets version of the progress bar working with flask-socketio on my local machine, but I could never get the server version working correctly through nginx. So eventually I gave up and switched it to polling so I could launch.
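The polling fallback can be sketched without the web framework. This is a minimal, hypothetical version (not the author's actual code): the generation loop writes its progress into a shared store, and each poll request just reads the latest value.

```python
import threading

class ProgressStore:
    """Thread-safe progress percentages keyed by job id."""
    def __init__(self):
        self._lock = threading.Lock()
        self._progress = {}

    def update(self, job_id, step, total_steps):
        with self._lock:
            self._progress[job_id] = int(100 * step / total_steps)

    def poll(self, job_id):
        # What a route like GET /progress/<job_id> would return as JSON;
        # unknown jobs report 0 so the client can poll before the job starts.
        with self._lock:
            return self._progress.get(job_id, 0)

store = ProgressStore()

def run_generation(job_id, total_steps=20):
    # Stand-in for the diffusion loop; a real per-step callback from the
    # pipeline would call store.update() once per denoising step.
    for step in range(1, total_steps + 1):
        store.update(job_id, step, total_steps)

run_generation("level-42")
```

Compared to websockets through nginx, the only moving part is a plain HTTP endpoint the browser hits on a timer, which is much harder to misconfigure.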
Does the 2D data like platforms and hitboxes still match the input entered by the human? If yes, I wouldn't call this using AI for level editing; it seems more like using AI for level artwork generation. Impressive nonetheless, just different.
HN's submission title ("Show HN: Stable Diffusion powered level editor for a 2D game") made me think of the former. Article title ("2D Platformer using Stable Diffusion for live level art creation") was more accurate to me.
I've recently tried using InvokeAI to apply a specific style as a texture mod for the original Max Payne, along with RTX Remix. Instead of making the textures "modern", I was attempting to push them toward a noir rendition, similar to Sin City but less cartoonish. Unfortunately, it was really hard to get InvokeAI to stay within the UV boundaries; detail kept leaking across them and didn't look good when rendered.
I'm also curious if anyone has made a level that worked particularly well/poorly or has a great custom theme (that maybe I should add to the dropdown) :)
The main one is that making the control-net depth input look like something helps a ton. You can create levels that have more 'structure' (large flat platforms, platforms that line up with others) and levels that are more random, and see that the structured ones work way better. I played around a lot with turning the control-net on and off at the beginning and end of generation, which seemed to help when I was playing in the webui. But then I didn't immediately find the API for that in diffusers, and the results I was getting were great, so I didn't keep looking.
Sure, but I'm more interested in things that were just impossible before. You can hire an artist to illustrate a level, or have AI do it cheaper, but you can't have an artist illustrate a level the player made while they wait. I think there are whole play patterns that become possible because the cost, and especially the speed, of creating the art are many orders of magnitude different.
This is good for procedural generated 2D worlds. Think Hollow Knight, but expansive across infinite environments. Just randomly generate the control image and have the LLM generate the theme. Combine that with LLM generated lore and the possibilities are unlimited.
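A minimal sketch of the "randomly generate the control image" half, with made-up names and tile conventions: lay down a floor and a few flat platforms in a tile grid, which could then be rendered as the depth image the control-net conditions on.

```python
import random

def random_level(width=32, height=18, platforms=6, seed=None):
    """Generate a tile grid: 0 = air, 1 = solid. Sizes are illustrative."""
    rng = random.Random(seed)
    grid = [[0] * width for _ in range(height)]
    grid[height - 1] = [1] * width          # solid floor along the bottom
    for _ in range(platforms):
        y = rng.randrange(3, height - 3)    # keep clear of ceiling and floor
        x = rng.randrange(0, width - 6)
        length = rng.randrange(3, 7)        # flat runs, since structure helps
        for dx in range(length):
            grid[y][x + dx] = 1
    return grid

level = random_level(seed=7)
```

Note the bias toward flat multi-tile runs rather than per-tile noise; per the thread above, depth inputs with recognizable structure condition the image model much better than random scatter.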
I have a far simpler (I imagine?) case already in mind, from the recent Cities discussion thread (https://news.ycombinator.com/item?id=36294742):
>> I would've expected at least a not grid-based zoning so that buildings on curves look more natural. All these empty pieces of land in between buildings look really bad and kind of force us to make grid cities. And that is not even an innovation, it was already present in the SimCity series. But some procedurally generated buildings for smooth corners and connecting buildings would be nice.
> Its hard to make assets that would work with every curve. When you see screenshots of nice cities like this, people are using mods to hand place assets with them clipping into each other to make a unified wall of buildings along the curve or corner.
ML generated building configurations for city builder games. Readily adaptable to any shape, and as a bonus can break up excessive repetition a bit. If you want to be ambitious, train a model on real-world aerial photos.
Yeah, in the map editor there is, in fact, a random button that generates a level. I haven't gotten around to making sure the random level is playable (about 1 in 4 have unreachable areas), but that wouldn't be hard to add. (I've been focused on the creative aspect of making your own levels, because right now that part is more fun.)
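A crude version of that playability check might be a flood fill from the spawn point that flags any standable tile the fill never reaches. This sketch (hypothetical, not from the project) ignores jump physics, so it only catches fully walled-off areas, but that is the easy majority of the bad cases.

```python
from collections import deque

def unreachable_standable(grid, spawn):
    """Return standable (x, y) cells a flood fill from spawn cannot reach.

    grid[y][x]: 0 = air, 1 = solid. spawn is an (x, y) air cell.
    """
    h, w = len(grid), len(grid[0])
    seen = {spawn}
    queue = deque([spawn])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < w and 0 <= ny < h and grid[ny][nx] == 0 and (nx, ny) not in seen:
                seen.add((nx, ny))
                queue.append((nx, ny))
    # Standable = an air cell sitting directly on top of a solid cell.
    standable = {(x, y) for y in range(h - 1) for x in range(w)
                 if grid[y][x] == 0 and grid[y + 1][x] == 1}
    return standable - seen
```

A generator could simply re-roll any level where this returns a non-empty set; a stricter check would replace the 4-way moves with the game's actual jump reach.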
Great example of using AI as a tool to make something exceptional
(see also https://github.com/midzer/jumpnbump)
Now we need someone to do other games of that time, place and genre: Tremor 3, C-Dogs, etc.
Personal use, a mod, or a free experiment is one thing, but a shipping game is a different can of worms.
We have the technology to do this right now.