DevelopingElk | 3 months ago

RL before LLMs could very much learn new behaviors. Take AlphaGo, for example. RL agents can also learn to drive in simulated environments. RL in LLMs doesn't learn the same way, so it can't create its own behaviors.
