item 39222645

foo3a9c4 | 2 years ago

Thank you for the thoughtful reply.

> Premise 3 is where the problem is, of course.

I don't believe premise 3 is exactly a problem, but I do believe it is a non-trivial challenge to determine whether it is true.

> We have no idea how to build AGI. We know LLMs won't be it.

> Even if we create AGI, we have no indication it is possible to build an orders-of-magnitude more "intelligent" thing. This is predicated entirely on the notion that if you can do it at scale, you get more, and there's no evidence thinking more makes for more intelligence.

> Even if that were possible and we build an ASI, it's not at all clear this would lead to existential catastrophe. An ASI is presumably smart enough to see it's about to end the world as we know it, and knows where its power supply comes from.

> This leaves us with an xrisk probability so close to zero it's virtually indistinguishable from zero. The only way to make it mean anything is "let's multiply it with infinity" - "it will end humanity, and my own survival is endangered".

It looks to me like you are making the following argument:

  Premise G1. Humans do not currently know how to build AGI.
  Premise G2. It might be impossible to build ASI.
  Premise G3. It is unclear how likely an ASI is to cause an existential catastrophe.
  Conclusion. There is not a significant chance of catastrophe from ASI.

I believe that argument addresses an important point (the chance of AI catastrophe), and I think it is a pretty good argument. But the original premise 3 says, "If ASI is built before alignment is understood, then there is a significant chance of existential catastrophe," so AFAICT your argument doesn't substantively address it. (I.e., your argument's conclusion doesn't tell me anything about whether premise 3 is true.)

I apologize if I have misunderstood your point.

> Alignment is a tool that works with LLMs, but we don't know if it will work for whatever produces AGI.

We may be using the word "alignment" slightly differently. By "alignment" I just meant getting the algorithmic system to have precisely the goal that its human programmers want it to have. I would call, for example, RLHF a "tool" for trying to achieve alignment.

How do you want to use the terms "alignment" and "alignment tool" going forward in the discussion?

> Meanwhile, ordinary humans can use currently existing tools to end the world just fine. Nukes are readily available. We're obviously not really interested in public health. Climate refugees will be a giant problem soon-ish. The economy is very much a house of cards, but a house of cards that keeps society functioning as-is.

I agree that there are other plausible sources of catastrophe for humans; to name a few: asteroids, supervolcanoes, and population collapse.

I understand you to be making a new point now, but I just want to state that I do not believe the existence of other plausible existential threats is a rebuttal of premise 3.

> LLMs are a fantastic disinfo tool right now. There's a reasonably good chance they will calcify biases. They will cause large economic damage because 1) they lift up the baseline of work, and 2) they're just good enough that there's economic incentive to replace workers with it, but 3) they're shitty enough that the resulting output will ultimately be worse because we removed humans from the loop.

I agree that LLMs may plausibly cause significant harm in the short term via disinformation and unemployment.

And again, I understand you to be making a new point, but I just want to state that I do not believe the plausibility of such LLM harms is a rebuttal of premise 3.

> Those are actual risks. That we sweep under the carpet, because "xrisk" makes for much more grabby headlines.

I'm not sure who you mean by "we" here, so I'm not sure if your claim about them is true or not.
