The abstract says:

> ...we present metrics from our large-scale deployment of CodeCompose that shows its impact on Meta's internal code authoring experience over a 15-day time window, where 4.5 million suggestions were made by CodeCompose. Quantitative metrics reveal that (i) CodeCompose has an acceptance rate of 22% across several languages, and (ii) 8% of the code typed by users of CodeCompose is through accepting code suggestions from CodeCompose. Qualitative feedback indicates an overwhelming 91.5% positive reception for CodeCompose.

In other terms, out of 4.5 million suggestions about 80% were off, yet there is 91% positive reception. That's 3.6 million rejected suggestions that potentially distracted programmers from doing their work. Yet users are happy. Is there a contradiction in these figures?
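For anyone who wants to check the arithmetic, here is a quick sketch. It assumes the 22% acceptance rate applies uniformly to all 4.5 million suggestions, which the abstract does not spell out; the rejected count is derived here, not reported by the paper.

```python
# Figures quoted from the abstract. Assumption: the acceptance rate
# applies to every one of the 4.5 million suggestions.
total_suggestions = 4_500_000
acceptance_rate = 0.22

accepted = round(total_suggestions * acceptance_rate)
rejected = total_suggestions - accepted  # derived, not a reported figure

print(f"accepted: {accepted:,}")
print(f"rejected: {rejected:,}")
print(f"rejected share: {rejected / total_suggestions:.0%}")
```

Under that assumption the rejected share is 78%, so "about 80%" and "3.6 million" are slight round-ups of roughly 3.5 million.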
alan-stark|2 years ago
idiotsecant|2 years ago
moonchrome|2 years ago
- anticipate when the suggestions are likely to be useless and not even bother
- scan the proposals to see if they are what you want in cases it's useful
It's a boilerplate generator and you're happy when it saves you tedious mental effort.
unknown|2 years ago
[deleted]
pydry|2 years ago
On the other hand, the person trying to track down a subtle bug afterwards might be a little less happy at having to wade through oceans of boilerplate.
fnordpiglet|2 years ago
Using GitHub Copilot daily, I find its suggestions are often nonsense but interesting to see regardless. Often for boilerplate it’s spot on and it saves me dozens of lines of typing. But it also suggests stuff on every keystroke, much of which I just type through, similar to IntelliSense. Assuming Meta’s code thingy is better, I would find myself in that 91%, as I’m already there with what’s available to the general public.
My only gripe, fwiw, with Copilot in VS Code is that it interferes with IntelliSense. Often I want to see the code completion from both, but Copilot jumps in before IntelliSense, and then IntelliSense, which I use as an inline API reference, never renders. Sometimes it’s so frustrating I have to turn off Copilot. But Copilot is generally useful enough that I re-enable it once I’ve understood the API stuff I’m unsure of. There’s some escape-backspace-period dance I can do that sometimes lets IntelliSense win. I’ve not dug deeply enough into the VS Code configuration to know if there’s some parameter to tweak the race condition. I’d note that when IntelliSense renders first, Copilot still renders its suggestions, but the other way around doesn’t work.
rychco|2 years ago
In this case it gave me 3 suggestions but I only accepted 1. I could see this taking 5-10 suggestions for an LLM to get right when it’s not something as straightforward as a function name. It’s still very useful despite this low acceptance rate.
pavlov|2 years ago
layer8|2 years ago
Though I guess the success rates when using Stack Overflow aren’t too dissimilar.
visarga|2 years ago
cloudking|2 years ago
tablatom|2 years ago
I am well aware that others are having a different experience with it.
skybrian|2 years ago
> The final model was calibrated for a target precision of 50%. That is, we tuned the model and the suggestions filtering, so that 50% of suggested edits on our evaluation dataset are correct. In general, increasing the target precision reduces the number of shown suggested edits, and decreasing the target precision leads to more incorrect suggested edits. Incorrect suggested edits take the developers time and reduce the developers’ trust in the feature. We found that a target precision of 50% provides a good balance.
Also, it seems like if the suggestions are too good then they’ll be blindly trusted and if they’re too bad they’ll be ignored?
Where to set the balance likely depends on the UI. For a web search, how many results do you click on?
[1] https://ai.googleblog.com/2023/05/resolving-code-review-comm...
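To make the precision calibration quoted above concrete, here is a minimal sketch of picking a confidence cutoff for a target precision. This is not the method from the quoted post, just one plausible way to do it; `pick_threshold`, the score/correctness pairs, and `eval_set` are all hypothetical.

```python
from typing import List, Optional, Tuple


def pick_threshold(
    scored: List[Tuple[float, bool]], target_precision: float
) -> Optional[float]:
    """Return the lowest score cutoff whose shown suggestions meet the target.

    `scored` holds (model confidence, was the suggestion correct) pairs from
    an evaluation set. Suggestions with score >= cutoff count as "shown".
    """
    # Highest confidence first, so each prefix is what a given cutoff shows.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    best = None
    correct = 0
    for i, (score, is_correct) in enumerate(ranked):
        correct += is_correct
        shown = i + 1
        # Only evaluate once every suggestion sharing this score is counted.
        last_of_score = shown == len(ranked) or ranked[i + 1][0] != score
        if last_of_score and correct / shown >= target_precision:
            best = score  # a lower cutoff shows more suggestions
    return best


# Hypothetical evaluation data: (confidence, correct?)
eval_set = [
    (0.9, True), (0.8, True), (0.7, False), (0.6, True),
    (0.5, False), (0.4, False), (0.3, False),
]

print(pick_threshold(eval_set, 0.50))
print(pick_threshold(eval_set, 0.75))
```

Raising the target moves the cutoff up and shows fewer suggestions; lowering it shows more suggestions, with more of them wrong. That is the trade-off the quote describes.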
seanmcdirmid|2 years ago
afro88|2 years ago
AI code completion (like GitHub Copilot) is like this. Still a time saver overall, even with a low acceptance rate.
YetAnotherNick|2 years ago
Mountain_Skies|2 years ago
joshuamorton|2 years ago
meling|2 years ago
anigbrowl|2 years ago
6510|2 years ago