saurabh20n | 3 years ago
A few questions that might help an enterprise customer: How big is your base model? Where did you find more datasets (even just a hint would be sufficient)? Are you using SantaCoder [3]? Anything you can say about your fine-tuning that makes it special? Totally on board with you that HumanEval/MBPP are not great benchmarks for real-world use, but do you have a suggested alternative that would help me see the value?
The calculus for an enterprise customer might be: "We could fine-tune a 6B model on our internal code and internal benchmarks (say with a month of work, a few thousand dollars in compute, and 2 people on task), but I'd rather buy an off-the-shelf solution like codecomplete.ai. They give us XYZ benefits." Articulate the XYZ for a technical decision maker, who will be your target audience.
* [1] https://huggingface.co/datasets/bigcode/the-stack
lumax15 | 3 years ago
I will expand a bit on fine-tuning. It's really hard to get this right, and the iteration speed is slow. Of course these companies can build their own, but we want to save them a lot of headache.
So far, we haven't found any off-the-shelf open source base model that works well enough for code completions. We've had to augment models with a huge amount of data to reach our current performance, and we ran into a lot of pain along the way.
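To give a flavor of where iteration time goes: even the data-prep step of a fine-tuning pipeline (before any GPU time) takes real engineering. A typical pipeline concatenates tokenized source files and slices the stream into fixed-length training sequences. A minimal sketch of that packing step, using a toy whitespace tokenizer as a stand-in for the model's real BPE tokenizer (the function name and separator token here are illustrative, not from any specific codebase):

```python
def pack_sequences(documents, tokenize, seq_len, sep_token="<|endoftext|>"):
    """Concatenate tokenized documents, separated by an end-of-text token,
    then slice the stream into fixed-length training sequences.
    The trailing remainder shorter than seq_len is dropped."""
    stream = []
    for doc in documents:
        stream.extend(tokenize(doc))
        stream.append(sep_token)
    return [stream[i:i + seq_len]
            for i in range(0, len(stream) - seq_len + 1, seq_len)]

# Toy example: whitespace "tokenizer" over two tiny source files.
docs = ["def add(a, b): return a + b", "print('hello')"]
chunks = pack_sequences(docs, str.split, seq_len=4)
# Each chunk is exactly seq_len tokens; the 2-token remainder is dropped.
```

Every choice in even this small step (separator token, whether to drop or pad the remainder, sequence length) interacts with downstream completion quality, which is part of why iteration is slow.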