(no title)
bkfunk | 3 years ago
1. An AI model, which works simply by predicting the best next word given a particular context (context that obviously includes whatever prompts it was given), becomes sophisticated enough that, given some particular context, it can produce the text (so, normal language, but also, say, shell commands and Python) needed to hack into, let’s say, a natural gas pipeline company’s systems and fiddle with the pressurization such that it causes a catastrophic explosion that at the very least stops gas from going to millions of homes in the dead of winter.
2. The AI system’s representation of “context” is sophisticated enough that it can, when given some simple linguistic prompts, represent (even if it does not “understand”) the intent or latent meanings of the human providing the prompts (or rather, of the multiple humans involved, including those who built it and primed it with some initial set of rules and goals, as well as the perhaps unwitting or perhaps nefarious end user).
3. The AI system’s produced text can evoke additional context, either in a conversation where the AI can lead a human down a path of providing additional prompts, or where the AI system can interact with other systems that might give responses to the original text (including erroneous responses).
4. The AI is trained on the corpus of human culture (and not just the parts we like!), including both fictional and nonfictional accounts of, and tutorials on, hacking, war, terrorism, revolutionary mobilization, heroic tales of overthrowing oppressive regimes, etc. (That is, for every conflict, every actor’s glorification and vilification is included in the corpus, and there is no way to ensure that the “correct” versions, meaning the versions that we, as people steeped in Western ethics and systems of morality, would endorse, are labeled as such or weighted more heavily.)
5. The AI system’s text can be ported to a shell, to servers on the internet, or even just to humans who have access to such methods of turning text into real-world interactions. (A minimal sketch of this text-to-action bridge follows below.)
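To make point 5 concrete, here is a minimal sketch of that text-to-action bridge. Everything in it is hypothetical: query_model is a stand-in for any text-generation API, and the stubbed command is deliberately harmless; the point is only the shape of the loop, not a real agent implementation.

    import subprocess

    def query_model(prompt: str) -> str:
        """Hypothetical stand-in for a call to a text-generating model."""
        # Stubbed out: imagine the model composes a shell command in response.
        return "echo 'a model-generated command would run here'"

    def act_on_text(prompt: str) -> None:
        command = query_model(prompt)
        # The crux of point 5: the moment anything executes the generated
        # text, it stops being "just text". No sentience required.
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        print(result.stdout, end="")

    act_on_text("Summarize the pipeline's pressurization status.")

Swap the stub for a real model and the echo for anything with side effects, and every failure mode described above becomes live.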
What might happen? Is it conceivable that a bad actor could use the AI to do malicious work in the real world that the actor did not have the skills to execute themselves? Is it conceivable that a jokester could play around, but that the AI could misunderstand the intent and/or psychologically influence them into providing prompts that lead to disastrous outcomes? Could someone asking it to fix climate change cause it, correctly IMO, to develop a contextual representation that we have to move away from fossil fuels with great urgency even if that causes some short-term economic harm, and then, incorrectly IMO, develop a contextual representation that economic ecoterrorism is the best way to do that?
In other words: If it can, through benign intent, malicious intent, or simply through misunderstanding or error, cause great harm, who cares if it is just an algorithm, with no sentience or “beingness”? If it’s below freezing and you don’t have heat for weeks, who cares whether the AI can properly be said to be “thinking” or is just responding to stimuli?