Ferrus91 | 6 months ago

This is using test-time ("thinking") compute and reinforcement learning. I think it's going to plateau even faster than the initial LLM scaling did, though.
