origin_path | 3 years ago

The word "safety" doesn't normally encompass lying or, more appropriately in this case, saying something untrue without realizing it. That's considered a very different kind of problem. Safety normally means there's a direct chance of physical harm to someone.

This kind of elasticity in language use is the sort of thing that gives AI safety a bad name. You can't take AI research at face value if it's using strange redefinitions of common words.

naasking | 3 years ago

How exactly does honesty not fall under safety? AI is an information system, and truthfulness from an information system that impacts human lives is clearly a safety concern.

This is not a redefinition; the harm results from the standard usage of the tool. If the AI is being used to predict the possible future behaviour of adversarial countries, then you need the AI to be honest or lots of people could die. If the AI concludes that your adversary would be friendlier towards its programmed objectives, then it could conclude that lying to the president produces the optimal outcome.

This can show up in numerous other contexts. For instance, should a medical diagnostic AI be able to lie to you if lying to you will statistically improve your outcomes, say via the placebo effect? If so, should it also lie to the doctor managing your care to preserve that outcome, in case the doctor might slip and reveal the truth?

origin_path | 3 years ago

How much software is safety-critical in general, let alone software that uses deep learning? Very, very little. I'd actually be amazed if you could name a single case where someone has deployed a language model in a safety-critical system. That's why your examples are all what-ifs.

There are no actual safety issues with LLMs, nor will there be any in the foreseeable future, because nobody is using them in any context where such issues could arise. Hence you're forced to rely on absurd hypotheticals, like doctors blindly relying on LLMs for diagnostics without checking anything or thinking about the outputs.

There are honesty/accuracy issues. There are not safety issues. The conflation of "safety" with unrelated language topics, like whether people feel offended or whether something is misinformation, is a very specific quirk of a very specific subculture in the USA; it's not a widely recognized or accepted redefinition.