Lerc | 20 hours ago
To get the predicted disastrous effects, you need to be doing function optimisation without regard to the meaning of the function parameters. Yes, models can still game the system at inference time, but in much the same way as a human might: it requires awareness that you are going against the intent of some rule.