anomaloustho | 4 months ago
I wrote elsewhere but I’m more interpreting this distinction as “RL in real-time” vs “RL beforehand”.

stevenpetryk | 4 months ago
This is referred to as “online reinforcement learning” and is already something done by, for example, Cursor for their tab prediction model. https://cursor.com/blog/tab-rl

  tinodb | 4 months ago
  Not sure that’s the same. They just very frequently retrain and “deploy a new model”.

munchler | 4 months ago
I agree with this description, but I'm not sure we really want our AI agents evolving in real time as they gain experience. Having a static model that is thoroughly tested before deployment seems much safer.

  mbesto | 4 months ago
  > Having a static model that is thoroughly tested before deployment seems much safer.
  While that might be true, it fundamentally means it's never going to replicate a human or provide superintelligence.