vidarh | 3 days ago
For some tasks that matters. But for a lot of tasks, "good enough but cheaper" will win out.
I'm sure there will be a market for whichever company has the best model, but just like most companies don't hire many PhDs, most companies won't feel a need for the highest-end models above a certain capability level either.
E.g. with the release of Sonnet 4.6, I switched a lot of my processes from Opus to Sonnet, because Sonnet 4.6 is good enough, and it means I can do more for less.
But I'm also experimenting with Kimi, Qwen, DeepSeek, and others for a number of tasks, including fine-grained switching and interleaving: have a cheap but dumb model filter data, or take over when a sub-task is simple enough, so the smart model does less work.
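A minimal sketch of that routing idea. The model names, pricing figures, and the `call_model` helper are all illustrative stand-ins, not a real API; in practice the "is this simple?" check would itself be a call to the cheap model rather than the length heuristic used here.

```python
# Hypothetical routing sketch: a cheap model handles easy cases so the
# expensive model only sees the hard ones. Everything here is a stand-in.

PRICE_PER_MTOK = {"cheap": 0.40, "smart": 25.00}  # illustrative $/1M output tokens

def call_model(model: str, prompt: str) -> str:
    # Replace with a real API client call; here we just echo for demonstration.
    return f"[{model}] {prompt}"

def is_simple(task: str) -> bool:
    # Stand-in for a classification call to the cheap model itself;
    # a crude length heuristic keeps this sketch self-contained.
    return len(task) < 80

def route(task: str) -> str:
    model = "cheap" if is_simple(task) else "smart"
    return call_model(model, task)
```

The payoff is that the per-token price gap between tiers is large enough that even an imperfect triage step, which occasionally misroutes a hard task, can cut the bill substantially.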
intrasight | 3 days ago
For models that run on general-purpose AI hardware, I don't know why the vendors would waste that resource on old models.
vidarh | 3 days ago
In terms of price, I can get 1M output tokens from DeepSeek for 40 cents vs. 25 dollars for Opus, and there are a number of models near the 1-2 dollar mark that are increasingly viable for a larger set of applications.
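The arithmetic behind that comparison, using only the output-token figures quoted above (real bills also include input tokens, and the exact prices vary by provider and over time):

```python
# Back-of-envelope cost ratio from the quoted output-token prices.
deepseek_per_mtok = 0.40   # USD per 1M output tokens (quoted above)
opus_per_mtok = 25.00      # USD per 1M output tokens (quoted above)

ratio = opus_per_mtok / deepseek_per_mtok
print(ratio)  # 62.5 -- roughly 60x cheaper per output token
```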
Providers will keep running those cheaper models as long as there's demand.
generallyjosh | 2 days ago
And, depending on effort settings, larger models do more "thinking", i.e., use more rounds of inference to generate longer internal chains of thought.
Both are very good reasons to prefer a smaller model, if the small model is good enough for the task.