top | item 30717773

Building games through natural language using OpenAI’s code-DaVinci model

205 points| dahjelle | 4 years ago |andrewmayneblog.wordpress.com

29 comments

kromem|4 years ago

This is one of those things that's so incredible and mind-blowing I really want to share it with friends or family, but WHY it is so impressive is locked behind a high enough sophistication that it would mostly be lost on them.

Having written a script a decade ago about a future in which software issues would be solved not by debugging or programming, but by finding the right way to communicate concepts to AIs, it's wild to see those nuances emerge.

One of the most interesting details in the post is the bit about asking for a function to create an array rather than the array itself.

Another was its existing 'semantic' (even illusory) knowledge of the Matrix rain.

It's going to be wild seeing this develop over the next few years. I'm sure we'll soon be seeing: specialized discriminators acting as code linters (even for human produced code), efforts at having GPT-3 write more modular instructions for Codex from generalized statements, and a recursive refinement as Codex plus the selection process of humans supervising it re-enters the open source dataset which will go on to train future iterations.

The thing it seems so many evaluating the tech right now overlook when predicting its future is the compounding rate of improvement as opposed to the more linear rates common across past technological parallels which relied on limited human resources.

qayxc|4 years ago

I am still greatly disappointed by the insistence on end-to-end black-box models.

In the end, as impressive as these results are, they are fundamentally trending in the wrong direction. All the benefits and certainty (e.g. security, correctness, and reproducibility) provided by theorem provers and model-driven systems are thrown out the window in favour of fast but potentially wrong or insecure results.

The worst part of this development is the psychological aspect - humans have a tendency to rely on machine generated results and view those as superior. The disconnect between working code and correct or secure code respectively is going to widen using this approach.

A glaring example is found in the blog post: the image manipulation example (7.) contains an error that the author failed to even recognise or mention. Instead of turning the uploaded image into a mosaic as intended, the generated code simply creates a fixed-size black-and-white checker-board pattern. This is clearly neither a mosaic nor image manipulation.

It is a very impressive tech demo, but generating actual software that can be trusted and rigorously checked against requirements will end up using a formal description (i.e. programming language, theorems, or modelling akin to UML) anyway.

bitforger|4 years ago

Interesting how you still need to have some intuitive sense of what's going on under the hood. (You can't say "make Zelda," you have to ask for an array of symbols and manipulate them.)

In that sense it feels like this is still programming, but at a higher level of abstraction with a weird fuzzy compiler. Now we can go from natural language -> JavaScript -> assembly etc. rather than just the last two.

Mediocre programmers use APIs, while good programmers know what's behind the curtain and can debug them. I suspect this will stay the same, no matter how many layers of abstraction we add.

ShamelessC|4 years ago

> Mediocre programmers use APIs, while good programmers know what's behind the curtain and can debug them. I suspect this will stay the same, no matter how many layers of abstraction we add.

The skill of both such categories (API developers and developers who use API's) is defined by the ability to know the _least_ amount of complexity needed for a given set of requirements. You may be appealing to some "deeper" sense of what it means to be a programmer, but in terms of what companies are willing to pay - if you get the same job done in a way that is easier to do in the future, you should be rewarded for that, because it saves your own time and the time of anyone who will need to work on that program in the future.

I think this is (only mildly) lacking in nuance. The ability to use AI for this task is surely limited at the moment - and people who know more about programming are certainly more capable of using these systems. As we go forward though, it's important to be able to admit that if an AI can produce a solution faster (and you have easy access to said AI, not a given), then you may be wasting time trying to "roll your own" in pursuit of being a good programmer.

On the other hand, until this AI-assisted experience is democratized, you're correct that it is a good idea to have engineers around who know this stuff from first principles. For now, I'm not terribly concerned that those folks will go away.

michaelbrave|4 years ago

Programming will never go away but those ease of use abstracted layers could bring in a new crowd so to speak. Much like how graphic design became much more common place and somewhat easier to learn once photoshop became common, experts still exist but you also get people in a garage making T-shirts now when before it wasn't much of a thing if that makes sense. An easy to use higher abstracted layer of coding could do similar, and create a new kind of less technical programmer class.

verisimi|4 years ago

Yes.

Could AI generated code mean the death of coding?

I'm wondering if, ultimately, you can get rid of the language as a part that you think about at all. Why not allow the AI to create a language that best suits it? Perhaps this would be hard to read for a human, but who cares? In fact, this would be a good thing for whoever owns the AI.

The issue then will become - as you say - to represent the problem well at a higher level of abstraction. Representation of the problem and knowing what a 'right' answer should be.

trh0awayman|4 years ago

This - along with GPT - are great ways to create originality detectors, something desperately needed.

The generators get all the attention, but we should be finding ways to use these as discriminators, so that we can find innovative and original projects.

I would love to get a list of Github repos or Steam games ranked on originality/chronologically. Things that are innovative within their own time. There are people making fascinating things, but it takes days, weeks, months to comb through the wreckage to find them.

I have no faith that these models will ever write Slaves to Armok 1 or Finnegans Wake or Dead Stars or original works in their own time - but I think detecting them might be within reach, which is far more useful currently (or at least within my lifespan).

I also think that human programming languages look cool for a demo - but ultimately, there should be programming languages that neatly interface with NNs or whatever - rather than pure text manipulation. I'm sure a lot of resources get sucked up into that alone, modeling syntax, etc. There needs to be a programming language that AI would use, probably directly manipulating an AST of sorts (unless I misunderstood this model, and it's already doing that).

Qworg|4 years ago

Original != Good - you'd need a discriminator for "goodness".

DantesKite|4 years ago

I'd be curious to see what the upper limit of this is. Could it for example, be trained to optimize video games? I think of the magic fast inverse square root optimization in Quake that dramatically reduced the cost of calculating angles.[1]

I bet there's all sorts of non-intuitive optimizations one could do in modern video games that are otherwise too tedious for most programmers to perform.

[1] https://en.wikipedia.org/wiki/Fast_inverse_square_root

Cthulhu_|4 years ago

> Could it for example, be trained to optimize video games?

In a sense, it already is; nVidia's DLSS [0] and AMD's FidelityFX are AI-driven technologies that allow games to be rendered at a faster, lower resolution, then using AI / ML technology to upscale it to HD or 4K resolutions without upscaling artifacts; the technology fills in the blanks based on a lower resolution frame. Apparently applying the AI upscaling is faster than rendering at full resolution.

[0] https://www.nvidia.com/nl-nl/geforce/technologies/dlss/

[1] https://www.amd.com/en/technologies/fidelityfx-super-resolut...

avaer|4 years ago

It can't do that from scratch yet; these kinds of optimizations require nontrivial mathematical understanding and informed judgement of trade-offs.

But it is capable of knowing your function is an inverse square root and inserting a known optimized version.

mark_l_watson|4 years ago

Cool article.

While I have been using GPT-3 via OpenAI’s APIs for about a half a year and I very much also appreciate using GitHub’s CoPilot because it saves me time, I wish for much more research into hybrid AI systems that are multi paradigm: deep learning, symbolic AI, new types of RL learning, breakthroughs in scaling conventional search, etc., etc.

There is so much work to get to the point where AI systems can effectively do counter factual reasoning, autonomously develop better models of the world, etc.

Symbolic AI as I learned it in the 1980s and deep learning in the last ten years are all great first steps, but we have a long way to go. Assuming parallel work in AI ethics, I don’t think there are any real limits on how much this technology can improve our lives.

adamgordonbell|4 years ago

I've been playing around with using gpt3 as a research assistant and it can work surprisingly well.

It's tricky to get the prompts right I think and you won't necessarily get novel insights, more like the distilled common wisdom of an area.

You can ask it to pretend to write the response to a subreddit. And you get an approximation of a subreddit filled with the type of experts you want, instantly answering your questions. Although they occasionally just spout non-sense.

cl42|4 years ago

Interesting.

Anyone have ideas as to why this works so well with JavaScript specifically? I tried to include similar commands in Python (i.e., use a prompt that implies Python based on commenting style) and it doesn't even write code, but instead keeps adding new comments.

unknown|4 years ago

[deleted]

ngcc_hk|4 years ago

Read the first page. Seem interesting but not interest in shooting game.

Given that game of life is … can it generate that abd sone if the patterns.

Also can it play go, chess, bridge … etc.

If not is it inherent or just this model.

Not a game developer and hence just question.

spupe|4 years ago

Spooky. Is there any existing tool that can do anything close to this at the moment?

Would have liked for the author to discuss a bit more the time spent optimizing the input, and his success rate.

monkeydust|4 years ago

Yea this. I remember doing some demos recently at work using OpenAI Codex and showcasing how easy it was to write SQL and Python given some natural language requirements.

The bit I didn't really say (I was working a particular angle!) was that I spent a fair bit of time on the prompt design. Changing a word here or there could lead to a drastically different outcome. Over time I got better at learning how to engineer the prompts so the code fulfilled my intention but it was a learning process for sure.

unknown|4 years ago

[deleted]

cleerline|4 years ago

is there anyway I can try this for myself? that is take the instructions and get the in quotes compiler to output the game

NiekvdMaas|4 years ago

It seems to be generated using this: https://beta.openai.com/codex-javascript-sandbox

stavros|4 years ago

I'd imagine so, OpenAI has a playground and API on their site.

unknown|4 years ago

[deleted]