top | item 42567458


mmmore | 1 year ago

The comment was likely making the point that there's no explicit search. In o1, the model has learned how to search within its own context. Presumably they train this by running RL over long reasoning strings/internal monologues.
