dejv | 2 months ago
Well, let's see how all the economics will play out. LLMs might be really useful, but as far as I can see none of the AI companies are making money on inference alone. We might be hitting a plateau in capabilities while money is being raised on the vision of this being a godlike tech that will change the world completely. Sooner or later the costs will have to meet reality.
Aurornis|2 months ago
The numbers aren’t public, but from what companies have indicated it seems inference itself would be profitable if you could exclude all of the R&D and training costs.
But this debate about startups losing money happens endlessly with every new startup cycle. Everyone forgets that losing money is an expected operating mode for a high growth startup. The models and hardware continue to improve. There is so much investment money accelerating this process that we have plenty of runway to continue improving before companies have to switch to full profit focus mode.
But even if we ignore that fact and assume they had to switch to profit mode tomorrow, LLM plans are currently so cheap that even a doubling or tripling isn’t going to be a problem. So what if the monthly plans start at $40 instead of $20 and the high usage plans go from $200 to $400 or even $600? The people using these for jobs that pay $10K or more per month can absorb that.
That’s not going to happen, though. If all model progress stopped right now the companies would still be capturing cheaper compute as data center buildouts were completed and next generation compute hardware was released.
I see these predictions as the current equivalent of all of the predictions that Uber was going to collapse when the VC money ran out. Instead, Uber quietly settled into steady operation, prices went up a little bit, and people still use Uber a lot. Uber did this without the constant hardware and model improvements that LLM companies benefit from.
mtone|2 months ago
LLMs have a short shelf-life. They don't know anything past the day they're trained. It's possible to feed them a bit of updated data or fine-tune them on it, but their world knowledge and views are firmly stuck in the past. It's not just news - they'll also trip up on new syntax introduced in the latest version of a programming language.
They could save on R&D but I expect training costs will be recurring regardless of advancements in capability.
Workaccount2|2 months ago
I'm not gonna dig out the math again, but if AI usage follows the popularity path of cell phone usage (which seems to be the case), then the trillions invested pay for themselves in 5-7 years. Not bad at all.
blks|2 months ago
iLoveOncall|2 months ago
ImprobableTruth|2 months ago
daveguy|2 months ago
mNovak|2 months ago
20k|2 months ago
Having good quality dev tools is non-negotiable, and I have a feeling that a lot of people are going to find out the hard way that reliability, and the tool not being owned by a profit-seeking company, is the #1 thing you want in your environment.
NitpickLawyer|2 months ago
This is the point people missed about why GPT5 was such an important launch (quality of models and vibes aside). It brought the model sizes (and hence inference cost) to more sustainable numbers. Compared to previous SotA (GPT4 at launch, or o1/3 series), GPT5 is 8x-12x cheaper! I feel that a lot of people never re-calibrated their views on inference.
And there's also another place where you can verify your take on inference - the 3rd party providers that offer "open" models. They have 0 incentive to subsidise prices, because people that use them often don't even know who serves them, so there's 0 brand recognition (say when using models via openrouter).
These 3rd party providers have all converged towards a price point per billion parameters. And you can check those prices, and get an idea of what would be profitable and at what sizes. Models like dsv3.2 are really really cheap to serve, for what they provide (at least gpt5-mini equivalent I'd say).
So yes, labs could totally become profitable with inference alone. But they don't want that, because there's an argument to be made that the best will "keep it all". I hope, for our sake as consumers, that it isn't the case. And so far this year it seems that it's not the case. We've had all 4 big labs one-up each other several times, and they're keeping each other honest. And that's good for us. We get frontier level offerings at $10-25/MTok (Opus, gpt5.2, gemini3pro, grok4), and we get highly capable yet extremely cheap models at $1.5-3/MTok (gemini3-flash, gpt-minis, grok-fast, etc).
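To put those per-MTok numbers in monthly terms, here's a rough sketch. The price ranges are the ones quoted above; the 50M tokens/month workload, the tier names and the `cost_range` helper are made up for illustration, not anything a provider publishes.

```python
# Price ranges in USD per million tokens (MTok), taken from the comment above.
PRICE_PER_MTOK = {
    "frontier": (10.0, 25.0),  # Opus, gpt5.2, gemini3pro, grok4
    "cheap": (1.5, 3.0),       # gemini3-flash, gpt-minis, grok-fast
}


def cost_range(tokens: int, tier: str) -> tuple[float, float]:
    """Return the (low, high) USD cost of `tokens` tokens at a given tier."""
    low, high = PRICE_PER_MTOK[tier]
    mtok = tokens / 1_000_000
    return (low * mtok, high * mtok)


# Hypothetical workload: 50 million tokens per month (an assumption).
monthly_tokens = 50_000_000
for tier in PRICE_PER_MTOK:
    low, high = cost_range(monthly_tokens, tier)
    print(f"{tier}: ${low:,.2f} to ${high:,.2f} per month")
```

At that assumed volume the frontier tier comes out to $500-1,250 a month and the cheap tier to $75-150. This ignores the input/output price split and caching discounts, so treat it as an order-of-magnitude check only.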
unknown|2 months ago
[deleted]
nl|2 months ago