item 46103288 | DevelopingElk | 3 months ago

RL before LLMs can very much learn new behaviors; take a look at AlphaGo for that. It can also learn to drive in simulated environments. RL in LLMs does not learn the same way, so it can't create its own behaviors.