I think a more individualistic definition of alignment could say that an AI a person is directing doesn't do anything that person does not desire. This definition removes the "foundational philosophy of what is good" problem, but it does leave the "lunatic wants to destroy the world with the AI's help" problem. Tricky times ahead.