The price of a token doesn't necessarily reflect the true cost of running a model.
After Claude Opus 4 was released, the price of OpenAI's o3 tokens was slashed practically overnight.[0] If you think this happened because inference costs went down, I have a bridge to sell you.
simonw|5 months ago
Generally, I'm skeptical of the idea that any of the major providers are selling inference at a loss. Obviously they're losing money when you include the cost of research and training, but every indication I've seen is that they're not keen to sell $1 for 80 cents.
If you want a hint at the real costs of inference, look to the companies that sell access to hosted open source models. They don't have any research costs to cover, so their priority is to serve as inexpensively as possible while still turning a profit.
Or take a good open weight model and price out what it would cost to serve at scale. Here's someone who tried that recently: https://martinalderson.com/posts/are-openai-and-anthropic-re...
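For a rough feel of what "pricing it out" involves, here is a minimal back-of-the-envelope sketch. It is not the method from the linked post. The function name, the GPU rental price, the throughput, and the utilization figure are all made-up placeholders, so swap in numbers you have actually measured or been quoted.

```python
def cost_per_million_tokens(gpu_hourly_usd, num_gpus, tokens_per_second, utilization=1.0):
    """Estimate serving cost in USD per one million tokens.

    gpu_hourly_usd:    rental price of a single GPU per hour (assumed figure)
    num_gpus:          GPUs needed to host one replica of the model
    tokens_per_second: aggregate throughput of that replica across all
                       concurrent requests (assumed figure)
    utilization:       fraction of time the replica is serving real traffic
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")

    cluster_cost_per_hour = gpu_hourly_usd * num_gpus
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return cluster_cost_per_hour / tokens_per_hour * 1_000_000


# Hypothetical inputs: 8 GPUs at $2.25/hour each, 10,000 tokens/s aggregate,
# busy half the time.
estimate = cost_per_million_tokens(2.25, 8, 10_000, utilization=0.5)
print(f"~${estimate:.2f} per million tokens")
```

The estimate only covers hardware rental. It ignores the difference between input and output token costs, staffing, networking, and margin, so treat it as a floor to compare against list prices rather than a real cost figure.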