For context, the point of the Superalignment team was to work on a problem known as scalable oversight: the problem of aligning models in a way that holds up as models become more capable [1]. The motivation is that current alignment techniques (like RLHF) have limitations that are expected to worsen as models are scaled up [2].
This is to say, the objective of the Superalignment team was precisely to work on techniques that would work for models which don't yet exist. They are of course aware that they don't yet have superintelligence.
agucova|1 year ago
[1]: This paper by Anthropic is a good introduction to the problem: https://arxiv.org/abs/2211.03540
[2]: See, for example, Jan Leike's talk on this: https://www.youtube.com/watch?v=BtnvfVc8z8o
consumer451|1 year ago