
ilidur | 1 year ago

Review: the document starts strong with a methodology and numbers. It covers 3 approaches: Copilot code assistance, Llama3 fine-tuning on their codebase, and RAG on documentation. The first one is the only one supported by numbers, with 27% of code suggested being accepted by developers. Although they set up a control group they fail to relate the LLM findings to it.

Fine-tuning is suggested to improve tasks like tooling upgrades, but no concrete numbers are offered.

Lastly, RAG on documentation. The RAG system uses a simple system prompt to improve uncertain responses. They're tracking meetings and support requests but don't show any results. They mention frustration with nonsensical answers, which they address with a reinforcement-learning-from-human-feedback (RLHF) technique, but again no numbers are offered.
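For readers unfamiliar with the "simple system prompt" idea the article apparently relies on, here's a minimal sketch of what that typically looks like in a RAG pipeline: an instruction telling the model to answer only from retrieved documentation and to hedge when the context doesn't support an answer. All names and wording below are my assumptions, not the article's actual implementation.

```python
# Hypothetical sketch of a RAG prompt with an "admit uncertainty" instruction.
# The prompt text and function names are illustrative assumptions.

SYSTEM_PROMPT = (
    "Answer using ONLY the documentation excerpts provided. "
    "If the excerpts do not contain the answer, reply exactly: "
    "'I could not find this in the documentation.'"
)

def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble the final prompt from the system instruction,
    retrieved documentation chunks, and the user's question."""
    context = "\n\n".join(
        f"[doc {i + 1}] {chunk}" for i, chunk in enumerate(retrieved_chunks)
    )
    return f"{SYSTEM_PROMPT}\n\nDocumentation:\n{context}\n\nQuestion: {question}"

prompt = build_rag_prompt(
    "How do I rotate the API key?",
    ["To rotate an API key, call POST /keys/rotate with the old key."],
)
```

The point of the fallback phrase is that it's easy to count: you can grep responses for it and measure how often the system declines rather than hallucinates, which is exactly the kind of number the article doesn't report.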

Overall, a simple overview of what they tried, but the strong methodological start isn't reflected in the numbers reported later on.
