charleshn | 5 months ago
But generally the idea is that you need some notion of reward, verifiers, etc.
It works really well for maths, algorithms, and many other things, actually.
See also this very short essay/introduction: https://www.jasonwei.net/blog/asymmetry-of-verification-and-...
That's why we have IMO gold-level models now, and I'm pretty confident we'll have superhuman models for mathematics, algorithms, etc. before long.
Domains that are very hard to verify (think theoretical physics, for example) are another story.
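To make the "reward from a verifier" idea concrete, here is a minimal sketch. It uses subset sum as a stand-in task, which is my own choice for illustration and not something from the linked essay. Finding a solution can take exponential time in general, but checking a proposed one is a single linear pass. The function names `verify_subset_sum` and `reward` are hypothetical, and the binary reward is just one simple way such a signal could be defined.

```python
from typing import Sequence


def verify_subset_sum(numbers: Sequence[int], target: int, indices: Sequence[int]) -> bool:
    """Cheap check of a proposed answer: distinct, in-range indices whose values sum to target."""
    if len(set(indices)) != len(indices):
        return False  # the same element was used twice
    if any(i < 0 or i >= len(numbers) for i in indices):
        return False  # index outside the list
    return sum(numbers[i] for i in indices) == target


def reward(numbers: Sequence[int], target: int, indices: Sequence[int]) -> float:
    """Hypothetical binary reward: 1.0 if the verifier accepts the answer, else 0.0."""
    return 1.0 if verify_subset_sum(numbers, target, indices) else 0.0


numbers = [3, 34, 4, 12, 5, 2]
print(reward(numbers, 9, [2, 4]))  # proposed subset is 4 + 5
print(reward(numbers, 9, [0, 1]))  # proposed subset is 3 + 34
```

The point is only that the checker never needs to know how the answer was produced. That is what makes domains like this convenient to train on, and what is missing when no cheap verifier exists.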
skeezyboy | 5 months ago
I don't think you're getting the point he's making.