top | item 47169669

euclaise | 4 days ago

Maybe RL? Just like similar corrections in reasoning traces. You can train non-'thinking' models the same way (though if you're naive about it then you might end up with responses that are similarly rambly), and I'd expect it to have been
