(no title)
acituan | 2 months ago
User has two knobs called the thinking level and the model. So we know there are definitely per call knobs. Who can tell if thinking-high actually has a server side fork into eg thinking-high-sports-mode versus thinking-high-eco-mode for example. Or if there were two slightly different instantiations of pro models, one with cheaper inference due to whatever hyperparameter versus full on expensive inference. There are infinite ways to implement this. Zero ways to be proven by the end user.
No comments yet.