

Kwantuum | 3 years ago

A lot of the comments seem to talk about the inevitable AI event horizon, but unless I'm misreading this article, the results are flat-out bad. Even the 6-billion-parameter model barely scratches a 50% success rate on a tiny problem that is trivial to fix for any human with basic knowledge of programming. Note the log scale of the graph.


hellodanylo|3 years ago

Yeah, I am also struggling to interpret the metrics in this post positively.

The 50% success rate is also the best of 3200 completions. For the best of 1 completion, the success rate is in the low single digits.

I think the lesson here is that these models bring a lot more value when you: 1. have unit tests, 2. can afford the compute/time to let the model try many solutions, 3. have enough isolation to run unverified code.
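
To make those three conditions concrete, here is a minimal sketch of a "sample many, keep the first that passes the tests" loop. Everything in it is an assumption for illustration, not the article's setup: `passes_tests` and `first_passing` are hypothetical helper names, the candidates are hard-coded strings standing in for model completions, and a child process with a timeout is only a weak form of isolation, not a real sandbox.

```python
import subprocess
import sys
from typing import Iterable, Optional


def passes_tests(candidate_src: str, test_src: str, timeout_s: float = 5.0) -> bool:
    """Run a candidate plus its tests in a separate Python process.

    Returns True only if the process exits with code 0. A child process
    with a timeout is weak isolation; truly untrusted code needs a sandbox.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", candidate_src + "\n" + test_src],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def first_passing(
    candidates: Iterable[str], test_src: str, max_tries: int
) -> Optional[str]:
    """Return the first candidate that passes the tests, or None.

    max_tries is the compute budget: how many completions we can afford to check.
    """
    for tries, src in enumerate(candidates):
        if tries >= max_tries:
            break
        if passes_tests(src, test_src):
            return src
    return None


# Toy stand-ins for model completions (hypothetical, hard-coded).
candidates = [
    "def add(a, b):\n    return a - b",  # buggy candidate
    "def add(a, b):\n    return a + b",  # correct candidate
]
tests = "assert add(2, 3) == 5"

print(first_passing(candidates, tests, max_tries=10))
```

With a budget of one try the loop only ever checks the buggy candidate and gives up, which is the gap between best-of-1 and best-of-many in miniature.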

zaidhaan|3 years ago

They do note that the models "tend to do better when prompted with longer code generation tasks".

But yes, the choice of scales for the graph was rather peculiar.

kdnvk|3 years ago

6 billion is by no means large.