

pyduan | 2 years ago

Do you have an estimate for typical token usage for a developer who'd use it as part of their workflow? I'd imagine the costs can rack up fairly quickly if you're not careful.


fgfm | 2 years ago

We'll do our best to report it consistently, since it can indeed influence developers' financial decisions, especially if they go through paid third-party LLM APIs. In our early experiments, we've seen about 200-250 tokens per request (roughly one request per autocompletion), of which about 40-50 tokens are generated.

Two things we're doing about this:

- right now our API response contains more than autocompletion requires, so there is room to trim it. And since we focus on team alignment, the goal is a higher suggestion acceptance rate than alternatives, which in the end means fewer calls and lower token consumption.

- since we're working on fully migrating to hostable OSS models of reasonable size, the financial side of token consumption should mostly drop out of the picture, leaving latency as the main concern.
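
For anyone who wants to turn the per-request figures above into a rough budget, here is a minimal sketch. Only the token ranges (200-250 total, 40-50 generated) come from our early experiments. The requests-per-day count and both prices are made-up placeholders, so plug in your own usage and your provider's actual rates.

```python
def estimate_cost(requests, total_tokens, generated_tokens,
                  input_price_per_million, output_price_per_million):
    """Rough cost (in the price's currency) for a number of completion requests.

    total_tokens and generated_tokens are per request; the remainder of
    total_tokens is treated as prompt (input) tokens.
    """
    prompt_tokens = total_tokens - generated_tokens
    input_cost = requests * prompt_tokens / 1_000_000 * input_price_per_million
    output_cost = requests * generated_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical assumptions, NOT measured values:
    requests_per_day = 1_000   # placeholder usage for one developer
    input_price = 1.0          # placeholder price per million prompt tokens
    output_price = 2.0         # placeholder price per million generated tokens

    # Low and high ends of the per-request ranges quoted above.
    low = estimate_cost(requests_per_day, 200, 40, input_price, output_price)
    high = estimate_cost(requests_per_day, 250, 50, input_price, output_price)
    print(f"Estimated daily cost: {low:.2f} to {high:.2f}")
```

Prompt and generated tokens are priced separately because many paid APIs charge different rates for each. With self-hosted models the same token counts matter mostly for latency rather than cost.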