sinenomine | 7 months ago

The NLL loss and the large-batch training regime inherently bias the model toward learning a “modal” representation of the world, and RLHF additionally collapses entropy, especially as it is applied at most leading labs.
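
To make the entropy point concrete, here is a toy sketch, not a description of any lab's pipeline. It assumes the textbook KL-regularized objective, whose optimum reweights a reference distribution as p(x) ∝ p_ref(x) · exp(r(x) / β). The four-outcome distribution, the reward values, and β are all made up for illustration, and the rewards are deliberately chosen to favor outcomes that are already likely.

```python
import math


def entropy(p):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def tilt(p_ref, reward, beta):
    """Reweight p_ref by exp(reward / beta) and renormalize.

    This is the closed-form optimum of reward maximization with a
    KL penalty of strength beta against p_ref (assumed setup).
    """
    weights = [q * math.exp(r / beta) for q, r in zip(p_ref, reward)]
    z = sum(weights)
    return [w / z for w in weights]


# Hypothetical numbers: a reference distribution over four outcomes and a
# reward that prefers the outcomes the reference already finds likely.
p_ref = [0.4, 0.3, 0.2, 0.1]
reward = [1.0, 0.5, 0.0, -0.5]

p_tuned = tilt(p_ref, reward, beta=0.5)
h_ref = entropy(p_ref)
h_tuned = entropy(p_tuned)

print(f"entropy before: {h_ref:.3f} nats")
print(f"entropy after:  {h_tuned:.3f} nats")
```

In this setup the tuned distribution piles more mass on the top outcome and its entropy drops. A smaller β (a weaker KL penalty) sharpens it further. A reward that favored rare outcomes could instead raise entropy, so this only illustrates the mechanism and says nothing about how strong the effect is in practice.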
