pksebben | 2 days ago

Guidance and alignment are usually handled by RLHF (reinforcement learning from human feedback), which actually updates the model's weights so that it becomes near-impossible for the model to have certain kinds of 'thoughts'. The behavior is baked into the weights themselves, so it's not something you can just extract or turn off.
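To make the "baked into the weights" point concrete, here is a toy sketch. It is not real RLHF (that trains a neural network against a learned reward model, typically with PPO or similar); it assumes a made-up three-option "policy" whose only state is a list of logits, and applies the exact policy-gradient update for a hand-written reward. The names `rlhf_like_update`, `rewards`, and the learning rate are all hypothetical, chosen just for illustration.

```python
import math

def softmax(logits):
    """Turn raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rlhf_like_update(logits, rewards, lr=0.5, steps=200):
    """Toy stand-in for RLHF: gradient ascent on expected reward.

    Assumption: the 'model' is just a list of logits, one per 'thought'.
    The gradient of expected reward w.r.t. logit i is p_i * (r_i - E[r]).
    """
    logits = list(logits)
    for _ in range(steps):
        probs = softmax(logits)
        baseline = sum(p * r for p, r in zip(probs, rewards))
        logits = [w + lr * p * (r - baseline)
                  for w, p, r in zip(logits, probs, rewards)]
    return logits

# Three possible 'thoughts'; the reward penalizes the third one.
weights_before = [0.0, 0.0, 0.0]
rewards = [1.0, 1.0, -1.0]

weights_after = rlhf_like_update(weights_before, rewards)
p_before = softmax(weights_before)
p_after = softmax(weights_after)

print("before:", [round(p, 4) for p in p_before])
print("after: ", [round(p, 4) for p in p_after])
```

After training, the penalized option's probability is driven close to zero, and the only thing that changed is the weights. There is no separate filter or flag sitting next to the model that could be pulled out or switched off, which is the point being made above.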
