reissbaker|2 years ago

Yes, I am looking for an argument that justifies governments banning LLM development, which would require showing that existential risk is likely. Many things are possible; it is possible Christianity is real and everyone who doesn't accept Jesus will be tormented for eternity, and if you multiply that small chance by the enormity of the torment, etc. etc. I'm definitely looking for arguments that this is likely, not for arguments that ask the interlocutor to disprove "x is possible."

The Nitter link didn't appear to provide much along those lines. There were a few arguments that it was possible, which the Nitter OP admits are "very weak"; other than that, there's a link to a wiki page making claims like "Finding goals that aren’t extinction-level bad and are relatively useful appears to be hard", when in observable reality asking ChatGPT to maximize paperclip production does not in fact lead to ChatGPT attempting to turn all life on Earth into paperclips (nor does asking the open source LLMs result in that behavior out of the box either). Instead it leads to the LLMs making fairly reasonable proposals that understand the context of the goal ("maximize paperclips to make money, but don't kill everyone," where the latter doesn't actually need to be said for the LLM to understand the goal).

foo3a9c4|2 years ago

> in observable reality asking ChatGPT to maximize paperclip production does not in fact lead to ChatGPT attempting to turn all life on Earth into paperclips (nor does asking the open source LLMs result in that behavior out of the box either)

I agree with you that current publicly available LLMs do not pose an existential risk to humanity. On the other hand, I believe there is a better-than-10% chance that the cutting-edge LLMs of 2044 will be very powerful.

Do you believe (A) that LLMs are unlikely to become powerful in the short term, and/or (B) that if LLMs become powerful, then they are likely to be safe even without a significant and concerted alignment effort?

IMO, even if LLMs are extremely unlikely to become powerful in the short term, I still might be better off if LLM development is banned, i.e.:

  P1: Humans are close to developing powerful non-LLM AI systems.
  P2: Humans are not close to developing techniques for safely using powerful AI systems.
  P3: If governments ban AI development, then the speed of AI capabilities development will be significantly reduced.
  P4: It is a waste of scarce expertise and political capital to focus on making an LLM carve-out in AI regulation legislation.
  C: If it is extremely unlikely that LLMs will become powerful in the near future, then I am made much better off if governments ban all AI capabilities research (including LLMs).

simiones|2 years ago

The first links are spiffy little metaphors, but they apply just as well to "God could smite all of humanity, even if you don't understand how". They're not making any argument, just assumptions. In particular, they accidentally show how an AI can be superhumanly capable at certain tasks (chess), but easily defeated by humans at others (anything else, in the case of Stockfish).

The argument starts with a hypothetical ("there is a possible artificial agent"), and it fails to be scary: there are (apparently) already humans that can kill 70% of humanity, and yet most of humanity is still alive. So an AGI that could also do it is not implicitly scarier.

The final twitter thread is basically a thread of people saying "no, there is no canonical, well-formulated argument for AGI catastrophe", so I'm not sure why you shared it.

foo3a9c4|2 years ago

> The first links are spiffy little metaphors, but they apply just as well to "God could smite all of humanity, even if you don't understand how". They're not making any argument, just assumptions. In particular, they accidentally show how an AI can be superhumanly capable at certain tasks (chess), but easily defeated by humans at others (anything else, in the case of Stockfish).

As I understand it, Yud is actually providing a counterexample to a premise that other people are using to argue that humans will probably not be disempowered by AI systems. The relevant argument looks like this:

  P1: If intelligent system A cannot give a detailed account of how it would be bested by a more intelligent system B, then A will not be bested by B.
  P2: Humans (so far) cannot give a detailed account of how a more intelligent AI system would best them.
  C: So, humans will not be bested by a more intelligent AI system.

Yud is using the unskilled chess player and Magnus as a counterexample to P1.
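
To make the logical point concrete, here's a minimal Lean 4 sketch of my own (not something from Yud's post or the linked thread; `Agent`, `detailedAccount`, and `bested` are hypothetical placeholder names). It shows that the argument form is valid, but that a single counterexample to the universally quantified P1 leaves C without support:

  -- Sketch only: `Agent`, `detailedAccount`, and `bested` are hypothetical
  -- placeholders for "intelligent system", "can give a detailed account of
  -- how it would be bested by", and "is bested by".
  section
  variable {Agent : Type} (detailedAccount bested : Agent → Agent → Prop)
  -- The argument itself is valid: the universal P1 plus P2 yields C.
  example (humans ai : Agent)
      (p1 : ∀ a b : Agent, ¬ detailedAccount a b → ¬ bested a b)  -- P1
      (p2 : ¬ detailedAccount humans ai)                           -- P2
      : ¬ bested humans ai :=                                      -- C
    p1 humans ai p2
  -- But one counterexample refutes P1: the novice cannot explain how Magnus
  -- would beat them, yet gets bested anyway, so P1 is false.
  example (novice magnus : Agent)
      (noAccount : ¬ detailedAccount novice magnus)
      (bestedAnyway : bested novice magnus)
      : ¬ (∀ a b : Agent, ¬ detailedAccount a b → ¬ bested a b) :=
    fun p1 => p1 novice magnus noAccount bestedAnyway
  end

The second example is just the chess case: refuting P1 doesn't show that humans will be bested, only that this particular argument fails to establish that they won't be.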

> The argument starts with a hypothetical ("there is a possible artificial agent"), and it fails to be scary: there are (apparently) already humans that can kill 70% of humanity, and yet most of humanity is still alive. So an AGI that could also do it is not implicitly scarier.

Right, it's only an argument for the possibility of AGI catastrophe. It doesn't make any move to convince you that the scenario is likely. And it sounds like you already accept that the scenario is possible, so shrug.

> The final twitter thread is basically a thread of people saying "no, there is no canonical, well-formulated argument for AGI catastrophe", so I'm not sure why you shared it.

Maybe there is no canonical argument, but the thread definitely features arguments for likely AI catastrophe:

  https://wiki.aiimpacts.org/doku.php?id=arguments_for_ai_risk:is_ai_an_existential_threat_to_humanity:will_malign_ai_agents_control_the_future:argument_for_ai_x-risk_from_competent_malign_agents:start
  https://arxiv.org/abs/2206.13353
  https://aiadventures.net/summaries/agi-ruin-list-of-lethalities.html