foo3a9c4 | 2 years ago

> in observable reality asking ChatGPT to maximize paperclip production does not in fact lead to ChatGPT attempting to turn all life on Earth into paperclips (nor does asking the open source LLMs result in that behavior out of the box either)

I agree with you that current publicly available LLMs do not pose an existential risk to humanity. On the other hand, I believe there is a better than 10% chance that the cutting-edge LLMs of 2044 will be very powerful.

Do you believe (A) that LLMs are unlikely to become powerful in the short term, and/or (B) that if LLMs become powerful, then they are likely to be safe even without a significant and concerted alignment effort?

IMO even if LLMs are extremely unlikely to become powerful in the short term, I might still be better off if LLM development is banned, i.e.:

  P1: Humans are close to developing powerful non-LLM AI systems.
  P2: Humans are not close to developing techniques for safely using powerful AI systems.
  P3: If governments ban AI development, then the speed of AI capabilities development will be significantly reduced.
  P4: It is a waste of scarce expertise and political capital to focus on making an LLM carve out in AI regulation legislation.
  C: If it is extremely unlikely that LLMs will become powerful in the near future, then I am made much better off if governments ban all AI capabilities research (including LLMs).

reissbaker | 2 years ago

I believe that the proposals from current AI safety organizations referenced in the article, which would make current-gen open-source LLMs illegal due to supposed x-risk, are not supported by reality.

Arguing about theoretical AI models 30 years from now that might or might not be dangerous doesn't seem very convincing to me, since we don't know what they'll be based on or how they'll work: researchers today aren't even sure LLMs can scale to super-human intelligence. Similarly, before LLMs, many safetyist orgs took the "paperclip problem" very seriously, but it's now quite clear that even the not-very-intelligent LLMs of today understand the implicit context of a goal like that and won't seriously propose extinguishing humanity as a way to improve paperclip production. Anthropic was formed in part because people thought gpt-3.5-turbo was existentially risky! And I don't think anyone today entertains that thought seriously, to put it lightly.

Trying to ban AI now due to the supposed existential risks of future systems that don't currently exist, that we don't know how to build, and whose proposed failure modes may never actually materialize seems like putting the cart well before the horse.