Hey everyone! We've been helping engineering teams fine-tune custom code LLMs on their internal repos, for different tasks across the software development lifecycle (SDLC).
We wrote a blog post about how we're doing it for code completions. We essentially fine-tune the model as if it were a developer building the repo from a blank slate, one diff at a time. Instead of treating codebases as static, raw lists of files, we treat them as time series of diffs on graphs of code objects (functions, classes, etc.).
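To make that framing a bit more concrete, here is a minimal sketch of the idea. It is an illustration only, not the pipeline from the blog post. It assumes Python sources and a history given as a list of whole-file snapshots (strings), and the names `DiffExample`, `code_objects`, `reference_edges`, and `examples_from_history` are hypothetical.

```python
import ast
import difflib
from dataclasses import dataclass

# Node types treated as "code objects" in this sketch (an assumption).
OBJECT_NODES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


@dataclass
class DiffExample:
    step: int          # position in the history, starting at 1
    object_name: str   # function or class the diff applies to
    before: str        # object source before the change ("" if new)
    after: str         # object source after the change ("" if deleted)
    diff: str          # unified diff from before to after


def code_objects(source: str) -> dict[str, str]:
    """Map each top-level function/class name to its source text."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, OBJECT_NODES)
    }


def reference_edges(source: str) -> set[tuple[str, str]]:
    """Edges (user, used) between top-level objects that reference each other."""
    tree = ast.parse(source)
    defs = [node for node in tree.body if isinstance(node, OBJECT_NODES)]
    names = {node.name for node in defs}
    edges = set()
    for node in defs:
        for child in ast.walk(node):
            if isinstance(child, ast.Name) and child.id in names and child.id != node.name:
                edges.add((node.name, child.id))
    return edges


def examples_from_history(snapshots: list[str]) -> list[DiffExample]:
    """Replay snapshots from a blank slate, emitting one example per changed object."""
    examples = []
    previous: dict[str, str] = {}  # blank slate: no code objects yet
    for step, source in enumerate(snapshots, start=1):
        current = code_objects(source)
        for name in sorted(current.keys() | previous.keys()):
            before = previous.get(name, "")
            after = current.get(name, "")
            if before == after:
                continue  # object untouched at this step
            diff = "\n".join(
                difflib.unified_diff(
                    before.splitlines(),
                    after.splitlines(),
                    fromfile=f"a/{name}",
                    tofile=f"b/{name}",
                    lineterm="",
                )
            )
            examples.append(DiffExample(step, name, before, after, diff))
        previous = current
    return examples


history = [
    "def add(a, b):\n    return a + b\n",
    "def add(a, b):\n    return a + b\n\ndef double(x):\n    return add(x, x)\n",
]
examples = examples_from_history(history)
edges = reference_edges(history[-1])
```

Each `DiffExample` pairs an object's earlier state with the diff that produced its next state, and `reference_edges` gives a crude version of the graph between objects. A real setup would presumably read commits from version control and handle more than one file and language.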
The results are very encouraging.
Would love to answer questions and hear any cool ideas y'all might have!
kumama | 1 year ago