Also, if you have the weights, there are a multitude of approaches to removing safeguards. It's even quite easy to accidentally flip their 'good/evil' switch (e.g. the paper where they trained it to produce code with security problems and it then started going 'Hitler was a pretty good guy, actually').
They can be coerced to do certain things, but I'd like to see you or anyone prove that you can "trick" any of these models into building software that can be used to autonomously kill humans. I'm pretty certain you couldn't even get it to build a design document for such software.
When there is proof of your claim, I'll eat my words. Until then, this is just lazy nonsense.
Have you tried it? It worked first time for me when I asked a few of them to build an autonomous super soaker system that uses facial recognition to spray targets when engaged.
Another example is autonomous vehicles. Those can obviously kill people autonomously (despite every intention not to), and LLMs will happily draw up design docs for them all day long.
Couldn't you Ender's Game a model? Models will play video games like Pokemon, why not Call of Duty? Sorry if this is a naive question, but a model can only know what you feed it as input... how would it know if it were killing someone?
EDIT: didn't see sibling comment. Also, I guess directly operating weaponry is different to producing code for weaponry.
I guess we'll find out the exciting answers to these questions and more, very soon!
rcxdude|14 hours ago
K0balt|19 hours ago
stressback|23 hours ago
AlotOfReading|22 hours ago
crabmusket|18 hours ago
wazHFsRy|19 hours ago