foo3a9c4 | 2 years ago
Okay. So we agree that (A) powerful systems can best weaker systems in ways that are unexpected to the weaker system, and (B) it is possible that AGI poses an existential risk to humanity.
> The negation of the claim "AGI poses an existential risk to humanity" is "AGI doesn't necessarily pose an existential risk to humanity".
It seems to me that the negation of your first claim is just "AGI doesn't pose an existential risk to humanity". Is "necessarily" doing some important work in your second claim?
>> https://wiki.aiimpacts.org/doku.php?id=arguments_for_ai_risk...
> The argument here works just as much for single-minded humans, so it's quite moot.
I don't understand why the argument being applicable to humans would make it moot. Please explain.
>> https://aiadventures.net/summaries/agi-ruin-list-of-lethalit...
> This seems to agree with my previously stated positions. It does try to establish a canonical argument, as you say, but then it goes on to explain why they don't think it's persuasive.
Is there a particular premise or inferential step in the blog's argument that you believe to be mistaken? (I've copied the argument below.)
P1: The current trajectory of AI research will lead to superhuman AGI.
P2: Superhuman AGI will be capable of escaping any human efforts to control it.
P3: Superhuman AGI will be misaligned by default, i.e. it will likely adopt values and/or set long-term goals that will lead to extinction-level outcomes, meaning outcomes that are as bad as human extinction.
P4: We do not know how to align superhuman AGI, i.e. reliably imbue it with values or define long-term goals that ensure it does not ultimately lead to an extinction-level outcome, without some amount of trial & error (which is how nearly all scientific research works).
C1 (from P2 + P3): Since superhuman AGI will be able to escape human control and will be misaligned by default, any survivable path to alignment cannot involve trial & error, because the first failed try will result in an extinction-level outcome.
C2 (from P4 + C1): We will not survive superhuman AGI, because our survival would require alignment, and we have no survivable path to alignment: the only path we know of involves trial & error, which is not survivable.
C3 (from P1 + C2): Therefore the current trajectory of AI research, which will produce superhuman AGI, leads to an outcome where we do not survive.
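To make the inferential structure explicit, here is a minimal propositional sketch in Lean 4 (hypothetical names, my own rendering rather than the blog's) that treats each premise and each bridging step as an assumption and chains them to C3:

    -- Hypothetical sketch: each premise (P1-P4) and each bridging
    -- implication (step1-step3) is assumed; C3 follows by chaining.
    theorem agi_ruin_sketch
        (P1 P2 P3 P4 C1 C2 C3 : Prop)
        (hP1 : P1) (hP2 : P2) (hP3 : P3) (hP4 : P4)
        (step1 : P2 → P3 → C1)   -- C1 from P2 and P3
        (step2 : P4 → C1 → C2)   -- C2 from P4 and C1
        (step3 : P1 → C2 → C3)   -- C3 from P1 and C2
        : C3 :=
      step3 hP1 (step2 hP4 (step1 hP2 hP3))

The point is only that, granted the premises and the bridging implications, the conclusion follows; so rejecting C3 means rejecting at least one specific premise or step, which is what I'm asking you to identify.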