po | 2 months ago
If:
- users want the best/smartest LLM,
- the best inference performance comes from spending more and more tokens (deep thinking), and
- pricing is per token,

then the inference providers/hyperscalers will capture all of the margin available to app makers (and then apparently hand it to Nvidia). It is a bad business to be in, and not viable for OpenAI at its valuation.
littlestymaar | 2 months ago
I think they have all become good enough that most people will stick with whatever they are used to (especially in terms of tone/“personality”, plus the memory shared between conversations).