(no title)
ivanbelenky | 4 months ago
There seems to be 2 types
- Specification failure: signal is bad-ish, a completely broken behavior --> local optimal points achieved for policies that phenomenologically do not represent what was expected/desired to cover --> signaling an improvable reward signal definition
- Domain constraint failure: signal is still good and optimization is "legitimate", but you are prompted with the question "do I need to constraint my domain of solutions?"
- finding a bug that reduces time to completion of a game in a speedrun setting would be a new acceptable baseline, because there are no rules to finishing the game earlier
- shooting amphetamines on a 100m run would probably minimize time, but other factors will make people consider disallowing such practices.
Eisenstein|4 months ago
This is of course inevitable if the goal cannot be directly measured but is composed of many constantly moving variables such as education or public health.
This doesn't mean we shouldn't bother having such goals, it just means we have to be diligent at pivoting the incentives when it becomes evident that secondary effects are being produced at the expense of the desired effect.
godelski|4 months ago
I agree with you, this doesn't mean we shouldn't bother with goals. They are fantastic tools. But they are guides. The better aligned our proxy measurement is with the intended measurement then the less we have to interpret our results. We have to think less, spending less energy. But even poorly defined goals can be helpful, as they get refined as we progress in them. We've all done this since we were kids and we do this to this day. All long term goals are updated as we progress in them. It's not like we just state a goal and then hop on the railroad to success.
It's like writing tests for code. Tests don't prove that your code is bug free (can't write a test for a bug you don't know about: unknown unknown). But tests are still helpful because they help evidence the code is bug free and constrain the domain in which bugs can live. It's also why TDD is naive, because tests aren't proof and you have to continue to think beyond the tests.
[0] https://news.ycombinator.com/item?id=45555551