
2026, Year of Reinforcement Learning?

10 points | namnnumbr | 6 months ago | aimlbling-about.ninerealmlabs.com

2 comments

I think the next big thing will actually be test-time training. It will require another unbelievable increase in compute, but it will produce an even bigger jump than the one thinking models provided.

Some food for thought: if you think AGI should dynamically learn and get better at arbitrary skills on the fly, then LLMs + SGD is already a sort of slow-moving AGI.

thtgrisdjdjdh|6 months ago

Works only for verifiable rewards, since humans (thankfully) don't have a good theory of knowledge (epistemology).

There's only so far that these agents can go.

unknown|6 months ago

[deleted]