top | item 43168558 (no title) zora_goron | 1 year ago Does anyone know how this “user decides how much compute” is implemented architecturally? I assume it’s the same underlying model, so what factor pushes the model to <think> for longer or shorter? Just a prompt-time modification or something else? discuss order hn newest No comments yet.
No comments yet.