mmmore | 1 year ago
The comment was likely making the point that there's no explicit search. In o1, the model has learned how to search within its own context. Presumably they do this by running RL over long reasoning strings (internal monologues).