(no title)
fenomas | 3 months ago
I don't follow the field closely, but is this a thing? Bypassing model refusals is something so dangerous that academic papers about it only vaguely hint at what their methodology was?
J0nL|3 months ago
Unless I missed it, there's also no mention of prompt formatting, model parameters, hardware and runtime environment, temperature, etc. It's just a waste of the reviewers' time.
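To make that concrete: even a minimal settings dump along these lines would give reviewers something to reproduce (all names and values here are hypothetical, loosely following the common chat-completion APIs):

    # Hypothetical sketch of the minimal generation settings a paper could report.
    # Field names mirror typical chat-completion APIs; every value is made up.
    run_config = {
        "model": "example-model-2024-01-01",   # exact model snapshot or weights hash
        "temperature": 0.7,
        "top_p": 0.95,
        "max_tokens": 512,
        "seed": 1234,                          # if the backend supports deterministic sampling
        "system_prompt": "...",                # the full prompt template, not a paraphrase
        "hardware": "1x A100 80GB",            # runtime environment for locally hosted models
    }

Without at least that much, nobody can tell whether a reported jailbreak rate reflects the method or the sampling setup.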
A4ET8a8uTh0_v2|3 months ago
Unfortunately, the ridiculousness spirals to the point where even the real information in an academic paper cannot be trusted. shrug In a sense, we are going backwards in terms of the availability of real information.
Personal note: I think the powers that be do not want to repeat the mistake they made with the interwebz.
lazide|3 months ago
LLMs are also allowing an exponential increase in the ability to bullshit people in hard-to-refute ways.
yubblegum|3 months ago
But was it, really?
GuB-42|3 months ago
That LLMs don't give harmful information unsolicited, sure. But if you are jailbreaking, you are already dead set on getting that information, and you will get it; there are so many other ways: open uncensored models, search engines, Wikipedia, etc. LLM refusals are just a small bump.
For me they are a fun hack more than anything else; I don't need an LLM to find out how to hide a body. In fact, I wouldn't trust an LLM's answer, as I might get a completely wrong one based on crime fiction, which I expect makes up most of its sources on these subjects. It may be good for writing poetry about it, though.
I think the risks are overstated by AI companies, the subtext being "our products are so powerful and effective that we need to protect them from misuse". Guess what: Wikipedia is full of "harmful" information, and we don't see articles every day about how terrible that is.
calibas|3 months ago
You have a customer facing LLM that has access to sensitive information.
You have an AI agent that can write and execute code.
Just imagine what you could do if you can bypass their safety mechanisms! Protecting LLMs from "social engineering" is going to be an important part of cybersecurity.
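A toy sketch of where such a check might sit for the code-executing agent case (purely illustrative Python; the pattern list and function names are made up, and real defenses rely on sandboxing and allow-lists rather than string matching):

    # Purely illustrative: a naive pre-execution check for an agent that runs
    # model-generated code. It only shows where a guardrail would sit in the loop;
    # real systems use sandboxes, capability restrictions, and human review.
    BLOCKED_PATTERNS = ["os.system", "subprocess", "open('/etc", "requests.post"]

    def is_suspicious(generated_code: str) -> bool:
        # Flag code that touches patterns we never expect a benign task to need.
        return any(p in generated_code for p in BLOCKED_PATTERNS)

    def run_agent_step(generated_code: str) -> None:
        if is_suspicious(generated_code):
            raise PermissionError("refusing to execute model-generated code")
        exec(generated_code)  # in a real agent this would run inside a sandbox

The point of the comment stands either way: once the model can act on sensitive data or execute code, a jailbreak stops being a reputational issue and becomes an ordinary security vulnerability.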
wodenokoto|3 months ago
Yes, it is a thing.
max51|3 months ago
Too dangerous to handle, or too dangerous for OpenAI's reputation when "journalists" write articles about how they managed to force it to say things that offend the Twitter mob? When AI companies talk about AI safety, it's mostly safety for their reputation, not safety for the users.