woko | 4 years ago
That is basically a "divide and conquer" approach for simultaneously learning a complex task *and* allowing humans to evaluate it.
> To test scalable alignment techniques, we trained a model to summarize entire books. [...] This work is part of our ongoing research into aligning advanced AI systems, which is key to our mission. As we train our models to do increasingly complex tasks, making informed evaluations of the models’ outputs will become increasingly difficult for humans. [...] Our progress on book summarization is the first large-scale empirical work on scaling alignment techniques.
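The "divide and conquer" idea can be sketched in a few lines: split a long text into chunks, summarize each chunk, then summarize the concatenation of the chunk summaries, recursing until the text fits. This is a minimal illustration of the control flow only; `summarize_leaf` is a hypothetical stand-in for the trained summarization model (here it just truncates so the sketch runs on its own), and the chunk sizes are arbitrary.

```python
def summarize_leaf(text: str, limit: int = 60) -> str:
    # Placeholder: a real system would call a trained summarizer here.
    return text[:limit]

def summarize(text: str, chunk_size: int = 200) -> str:
    # Base case: short enough to summarize directly.
    if len(text) <= chunk_size:
        return summarize_leaf(text)
    # Divide: split into fixed-size chunks and summarize each one.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    partial = " ".join(summarize(c, chunk_size) for c in chunks)
    # Conquer: recursively summarize the concatenated chunk summaries.
    # Each chunk shrinks under summarize_leaf, so this terminates.
    return summarize(partial, chunk_size)
```

Each intermediate summary is short enough for a human to read, which is what makes the outputs at every level of the recursion individually checkable.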