

highfrequency | 3 months ago

Great respect for Ilya, but I don’t see an explicit argument why scaling RL in tons of domains wouldn’t work.


never_inline|3 months ago

I think scaling RL for all common domains has already been done to death by the big labs.

kubb|3 months ago

Not sure why they care about his opinion and discard yours.

They’re just as valid and well informed.

anthonypasq|3 months ago

Doesn't RL by definition not generalize? That's Ilya's entire criticism of the current paradigm.