top | item 36343309

anonymoose4 | 2 years ago

I don't think your last paragraph is plausible, actually. Consider what's happening when you give it a prompt. It receives a set of tokens from the prompt and then samples from a distribution, weighted very accurately over its very large training set, to generate the most likely next set of tokens. When you feed back an error in the code it generated, all it's doing is generating a new sample from a new set of input tokens. This time the tokens include the error, which tells it that the output it generated the first time was wrong in some way and it should weight its distribution differently. Usually the error message will contain some of the same tokens as the original lines of code that caused it, which has the effect of pushing those particular lines out of the distribution, and then the model gives you some slightly different code that maybe doesn't fail this time.
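The feedback loop described above can be sketched in a few lines. This is a toy, not any real API: `fake_model` is a stub standing in for an LLM call, and the loop just re-prompts with the error tokens appended.

```python
def fake_model(prompt: str) -> str:
    # Stub standing in for an LLM API call. Once the prompt contains the
    # error tokens (here, "NameError"), the conditional distribution shifts
    # and we get different code back.
    if "NameError" in prompt:
        return "result = 2 + 2"
    return "result = 2 + undefined_var"

def generate_until_it_runs(task: str, max_rounds: int = 5) -> str:
    prompt = task
    for _ in range(max_rounds):
        code = fake_model(prompt)
        try:
            exec(code, {})   # try to run the generated code
            return code      # success: no exception raised
        except Exception as e:
            # Feed the error back: the next prompt includes the error
            # message, so the model samples from a new input distribution.
            prompt = f"{task}\n# previous attempt:\n{code}\n# error: {e!r}"
    raise RuntimeError("no working code within the retry budget")
```

The point of the sketch is that nothing "learns" between rounds; each iteration is just a fresh sample conditioned on a longer prompt.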

It's still going to be subject to the constraints of all ML techniques. Namely, it's really only capable of interpolating across its training set, not extrapolating from it. And it's going to fall on its face whenever the distribution of the input data doesn't mirror the distribution of the training data. If it doesn't have enough close-enough analogs to what you're trying to do in its training set, such that the "right" code is tangibly represented in the distribution, you can feed errors back into it forever and it will never give you working code.
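The interpolation-versus-extrapolation point can be illustrated with the simplest possible "model" (this is a hypothetical toy, not a claim about how LLMs are implemented): a 1-nearest-neighbour lookup trained on y = x² over x in [0, 10]. Inside the training range it interpolates fine; far outside it, it can only repeat the nearest training point.

```python
train_x = list(range(0, 11))           # training inputs: 0..10
train_y = [x * x for x in train_x]     # the true function on the training set

def nn_predict(x: float) -> float:
    # Pure interpolation: return y of the nearest training point.
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]

print(nn_predict(4.6))   # 25: near x=5, a reasonable in-distribution answer
print(nn_predict(100))   # 100: the true value is 10000, but the model can
                         # only echo y at x=10, its nearest training point
```

No amount of re-querying fixes the second case, because the right answer simply isn't represented anywhere in the training distribution.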

This might be controversial, but my belief is that if you find ChatGPT to be really good at doing your work for you, your work likely closely mirrors a lot of code that exists publicly on the internet. Which raises the question: is what you're doing not easily handled by just importing an existing library or something? Your application sounds relatively complex/niche, so maybe you are actually still doing a lot of higher-level engineering and distilling your work down into trivial tasks that the model can handle.

brandall10 | 2 years ago

Regarding the last paragraph, I saw it happen again and again. Sometimes the error wouldn't even be directly obvious: an exception would be thrown that seemed unrelated to the actual issue, only a proximate cause. Of course that could be related to some SO post in the training data, but who knows. There is definitely debate about emergent behavior in GPT-4, and it's not quite clear how some aspects work internally.

While it did create quite a few bugs and a couple of random hallucinations... and sometimes it would update one piece of code without telling me that other parts of the solution were changed too (e.g., changing a protobuf definition without being asked to), it was able to fix everything, even if I needed to give it a bit of help.

I don't think there is anything I'm doing that is truly novel in isolation, and it's not a lot of code (just a few hundred lines), but I doubt there's anything quite like it in totality out there.