top | item 39910424


unit_circle | 1 year ago

A 1/8 chance of fixing a bug at the cost of a careful review and some corrections is not bad.

A 0% -> 12% improvement over two years is not bad either (I'm somewhat arbitrarily picking the release date of ChatGPT as the starting point). If this rate can be kept up for a few years, we will have some extremely useful tooling. The cost can be relatively high as well, since engineering time is currently orders of magnitude more expensive than these tools.
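The break-even logic here can be sketched with a back-of-the-envelope calculation. The numbers below (fix rate, minutes per review) are hypothetical placeholders, not figures from the benchmark:

```python
def expected_review_cost(fix_rate: float, review_minutes: float) -> float:
    """Expected reviewer time spent per successful fix, assuming every
    AI-generated patch gets one careful review and a fraction `fix_rate`
    of patches actually fix the bug."""
    return review_minutes / fix_rate

# Hypothetical: a 12% fix rate and 30 minutes per careful review.
print(expected_review_cost(0.12, 30))  # ≈ 250 minutes of review per landed fix
```

Whether ~4 hours of review time per landed fix beats writing the fix by hand depends entirely on the bug, which is the trade-off the comment is pointing at.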


golergka|1 year ago

It's still abysmal from POV of actually using it in production, but it's a very impressive rate of improvement. Given what happened with LLMs and image generation in the last few years, we can probably assume that these systems will be able to fix most trivial bugs pretty soon.

blharr|1 year ago

I still don't know. I feel like there are many ways where GPT will write some code or fix a bug in a way that makes it significantly harder to debug. Even for relatively simple tasks, it's kind of like machine-generated code that I would not want to touch.

WanderPanda|1 year ago

It is a bit worrisome, but we manage to deal with subpar human code as well. Often the boilerplate generated by ChatGPT is already better than what an inexperienced coder would string together. I'm sure it will not be a free lunch, but the benefits will probably outweigh the downsides.

Interesting scalability questions will arise with respect to security, though, when already unmanageably large code bases grow by another order of magnitude (or two).

stefan_|1 year ago

These "benchmarks" are tuned toward reporting some exciting result; once you look inside, all the "fixes" are trash.

SrslyJosh|1 year ago

If someone submitted 8 PRs and 7 of them were bullshit, I would close anything else they submitted in the future without even bothering to review.