item 41527645

deisteve | 1 year ago
This is why I became skeptical of OpenAI's claims. If they shared the CoT, the grift won't work. It's just RL.

falcor84|1 year ago
I can't help but feel that saying "it's just RL" is like someone at the start of the 20th century saying "it's just electricity", as if understanding the underlying mechanism is the same as understanding the applications it can enable.

dartos|1 year ago
Tbf RL is pretty incredible. I trained a model to play a novel video game using only screenshots and a score using RL and I discovered how not to lose