byefruit | 10 months ago
This implies it's not a hybrid model that can just skip reasoning steps if requested.
Anyone know what else they might be doing?
Reasoning means contexts will be longer (to hold the thinking tokens), and a longer context does increase inference cost, but it's not going to be 6x.
Or is it just market pricing?
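A rough back-of-the-envelope sketch of the claim above, with entirely made-up token counts and a toy cost model (not Google's actual numbers): if decode cost scales roughly with tokens generated times average context length, a modest thinking budget raises cost by a few x, not automatically 6x.

```python
# Toy cost model: decode cost ~ (tokens generated) * (average context length
# while generating them). All numbers are illustrative, not real pricing data.

def relative_cost(prompt_tokens, thinking_tokens, output_tokens):
    """Approximate decode cost; context grows as tokens are emitted."""
    decoded = thinking_tokens + output_tokens
    avg_context = prompt_tokens + decoded / 2
    return decoded * avg_context

no_thinking = relative_cost(1000, 0, 500)
with_thinking = relative_cost(1000, 1000, 500)

print(f"cost ratio: {with_thinking / no_thinking:.1f}x")  # → cost ratio: 4.2x
```

Under these (invented) numbers the cost roughly quadruples rather than growing 6x, which is the gap the comment is pointing at.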
vineyardmike | 10 months ago
It's smart because it gives them room to drop prices later and compete once other companies actually get to a similar quality.
jsnell | 10 months ago
It clearly is, since most of the post is dedicated to the tunability (both manual and automatic) of the reasoning budget.
I don't know what they're doing with this pricing, and the blog post does not do a good job explaining.
Could it be that they're not counting thinking tokens as output tokens (since you don't get access to the full thinking trace anyway), and this is basically amortizing the thinking-token spend over the actual output tokens? That doesn't make sense either, because then the user would have no incentive to use anything except a 0 or max thinking budget.
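One way to make the amortization guess concrete (a sketch with invented numbers, not Gemini's actual billing): if hidden thinking tokens were recovered by inflating the price of the visible output, the multiplier would just be the ratio of total generated tokens to visible ones.

```python
# Hypothetical amortized billing: recover the cost of unreturned thinking
# tokens through the per-token price of visible output. Numbers are made up.

base_price_per_m = 0.60   # $/1M output tokens without thinking (illustrative)
visible_output = 400      # tokens the user actually receives
hidden_thinking = 2000    # tokens spent thinking, not returned

multiplier = (visible_output + hidden_thinking) / visible_output
effective_price = base_price_per_m * multiplier
print(f"{multiplier:.0f}x -> ${effective_price:.2f}/1M visible tokens")
# → 6x -> $3.60/1M visible tokens
```

With a 5:1 hidden-to-visible ratio the multiplier lands on 6x, but as the comment notes, flat amortized pricing would remove any incentive to pick an intermediate thinking budget.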