top | item 42835737


aero142 | 1 year ago

Are there any successful models that weren't trained with RLHF, or that don't rely on a system that uses RLHF? I'm curious if this could be done without a fine-tuning step that wouldn't meaningfully bias this.
