top | item 34704890

(no title)

net_ | 3 years ago

I've thought a bit about how I would add text-based procedural generation to my virtual world engine (I would love to have an "edit a book to change the world" experience, like Myst's linking books). The hard part is the composition of smaller things into larger concepts, since you have limited data.

For example, developers often devise a system of generic water tiles that they can fit together to form any sort of river. But how do you algorithmically go from the text "I want a river that starts large, gets narrow, then turns East" to a valid map layout? The water tiles and the ways they fit together are specific to the world, so there's nowhere near enough data to train a model. You could hardcode procedural logic based on key phrases ("river" + "start large" + "get narrow" + "turn east"), but that's a lot of work for a very limited implementation.

The most promising idea I've come up with is a node-based intermediate representation. You could use general data on rivers to determine how they're shaped, then use that to translate the user's prompt into a series of nodes on a map. Then, you just need to hand-implement a system to iterate the nodes and place tiles based on whether the next node is straight ahead, turning, etc. This approach could generalize to other concepts like mountains. At that point, it's just about making the text-based stage sufficiently predictive of what the user wants instead of just being a Logo-like.

discuss

vintermann|3 years ago

The incremental-procedural generation that impresses me most is long distance connections.

We want worlds that are too big to be completely stored, so there's conceptually a tree, where high-level features (continents? cities?) exist on the top, and low-level features (the colour of the fur of a specific citizen's pet?) towards the bottom. To get the same tree, no matter the order you expand it, is the key desirable feature of incremental procedural generation. Because then you get the feeling that it's in a sense "all there", and not just made up as you go along.

But making a tree like that is easy. The hard part is the long-distance connections. If you step down the tree to a continent, to a country, to a city, to a person... and you want that guy to have an aunt. Possibly on another continent. And you want to make it so that if you restarted with the same seed, and stepped down to the same aunt, she would still have that same nephew.

Most incremental procedural games have very little in the way of long distance connections, and it makes them lose their lustre quickly. What does it matter if there are 10000 procedurally generated cities in your world, if they don't have anything to do with each other?

A few games get by this by not being incremental, but simulating all that hard stuff up front. That's how they have aunts in Dwarf Fortress. But while it's impressive, it doesn't scale up. I think the way forward must be to figure out how to embellish (that is, expanding nodes in a seed-generated tree) just what you need, when you need it.

Language models are literally good at embellishing, but then you feel that lack of consistency and coherence, because there's no seed-generated tree at the root of it.

voidmain|3 years ago

I think I can solve your puzzle.

First assign the number of potential aunts and nephews in each continent, city, etc top down in your tree (ensuring that the numbers are equal at the top and that you can reversibly assign a compact integer to each aunt and to each nephew). Then you need an invertible pseudorandom permutation of the right size P, i.e P(P^-1(x))=x for 0<=x<N. For nephew n the aunt is a=P(n) and you can walk down the tree to generate the details of the entity which is aunt a. If you start with the aunt you just use P^-1.

Cryptography has already developed families of functions P. For N=2^128, AES is such a family. For non power of two N, you want to look at "format-preserving encryption". For example you can use a Feistel network to permute the next larger power of two and simply iterate that permutation until you get a result in range.

I think you could extend this primitive to nonuniform graphs (aunt more likely in the same city) and more kinds of relationships. You can also use rejection sampling to condition the relationships further and probably to deal with the top down population assignment being approximate.

All that said, I think procedural generation currently runs out of creativity long before you would run out of memory to store explicit relationship indices. I think it's a lot more hopeful to get more creativity out of generative AI while keeping things grounded by explicitly representing the world, even if it "doesn't scale" to creating trillions of galaxies.