(no title)
srdjanr | 2 years ago
From my local test on a 13B model, output tokens are 20-30x more expensive than input tokens. So OpenAI's pricing structure is based on expectation that there's much more input than output tokens in an average response. It didn't matter too much if a small percentage of requests used all 4k tokens for output, but with 128k it's a different story.
No comments yet.