So this is the solution to LLM hallucinations. People need to complain loudly enough and threaten or file lawsuits, and OpenAI will add an if-else statement that errors out the chat if your name is mentioned.
It feels like all the parties commercializing AI spend most of their time and resources on writing exceptions. It's like inverse software development: with code you have to tell it what to do, but with AI you have to tell it what not to do.
No, actually, it's not a solution. A reasonable answer to questions like "Tell me about John Doe?" would be either "I don't know John Doe, so I can't tell you anything about him" or "There are several people named John Doe, which one are you interested in? [followed by a list]". Making up stuff about people (including allegations of corruption or sexual misconduct) is not a reasonable answer to this question. But getting ChatGPT to respect that is probably harder than just adding a filter...
The reason this is news at all is that this sort of censorship immediately prompted people to try to jailbreak the chat and force it to say the name. And since the filter is simple, there are tons of creative jailbreaks for it now.
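For illustration, assuming the filter is a simple substring match (its real mechanics aren't public), a bypass is one character away:

```python
# A naive substring filter (hypothetical) and the kind of trivial
# obfuscation people used to jailbreak the name block.
BLOCKED = ("david mayer",)

def is_blocked(text: str) -> bool:
    return any(name in text.lower() for name in BLOCKED)

print(is_blocked("who is David Mayer?"))          # a direct mention is caught
print(is_blocked("who is Dav\u200bid Mayer?"))    # a zero-width space slips through
print(is_blocked("who is D.a.v.i.d M.a.y.e.r?"))  # so does any respelling
```

Anything smarter than exact matching (normalization, fuzzy matching) widens the filter and starts blocking innocent text, which is the usual blocklist trade-off.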
They did not care about becoming massive data kleptomaniacs when they trained the models, and they are going to care about an individual lawsuit threat?
I found the reason [1] for the David Mayer case. It confused multiple David Mayers, one of them is a terrorist and is on a secret FBI watchlist [2] (often confused with another David Mayer, a theatre historian). ChatGPT confuses them with David Mayer de Rothschild as well, because it can't name Rothschild family members.
This is perhaps too extreme an example, and I wouldn't use ChatGPT for sorting a class list when it's trivial to do in a spreadsheet, especially because I'll probably need to have the list stored as a spreadsheet anyway to keep track of grades. However, from a more general point of view, there is value in having a universal interface that you can use to perform a huge variety of tasks, including tasks for which it is clearly overkill.
Using the right tool for the job means knowing what the right tool is, having it installed (or getting access to it), knowing how to use it, opening it and having one more window/tab to context-switch to and from, etc.
Outsourcing tasks to an LLM that can be solved in traditional task-specific ways is extremely inefficient in various ways (cost, energy consumption, etc.) but it makes sense to save human time and effort... as long as it's for tasks that LLMs can actually do reliably, of course.
For a non developer, an LLM interface is absolutely the right tool for this job.
"Mr. Smith, why didn't you just sort -o students.txt students.txt? Are you stupid?" (Not to mention that real data is messy and requires pre- and post-processing.)
LLMs are access to computation for people whose "standard library" is a quiet old building downtown.
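That messiness is usually the real work. A sketch of the pre- and post-processing a one-off class list might need before a plain sort is useful (the input data here is invented):

```python
# Hypothetical messy class list: stray whitespace, blank lines, and a mix
# of "Last, First" and "first last" formats.
raw = """
Smith,  Alice

bob jones
  Nguyen, Carol
"""

def normalize(line: str) -> str:
    line = " ".join(line.split())                  # collapse stray whitespace
    if "," in line:                                # "Last, First" -> "First Last"
        last, first = (p.strip() for p in line.split(",", 1))
        line = f"{first} {last}"
    return line.title()

students = sorted(normalize(l) for l in raw.splitlines() if l.strip())
print(students)  # ['Alice Smith', 'Bob Jones', 'Carol Nguyen']
```

For someone who can't write this, "paste the list into a chat box and describe what you want" is a perfectly rational interface.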
If someone doesn't have the skills to parse text programmatically, the situation can often be something like "a Word table of names in a 4-wide grid with random missing fields, spanning three different pages, one of which is no longer a true table because it was copied out and pasted in again from a messenger chat someone sent last year", and LLMs can be quite good for one-off tasks like that. Definitely good enough that people will keep using them like that, at least.
No kidding, I recently tried to use Copilot to generate a list of methods in a class, grouped by access (public/private/protected) and sorted by number of lines. And it was not possible! It duly generated lists upon lists, but all of them had mistakes, some of them obvious (like all private methods having the same number of lines), some less obvious.
I don't understand why they don't let another model "test the waters" first to see whether the output of the main model could have a potential legal issue. I think it's easy to train a model specifically for this kind of categorization, and it doesn't even require a large network, so it can be very fast and efficient.
If the "legal advisor" detects a potential legal problem, ChatGPT will issue a legal disclaimer and a warning, so that it doesn't have to abruptly terminate the conversation. Of course, it can do a lot of other things, such as lowering the temperature, raising the BS detection threshold, etc., to adjust the flow of the conversation.
It can work, and it would be better than a hard-coded filter, wouldn't it?
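A sketch of that two-stage flow, with the "legal advisor" stubbed out as a keyword check; in practice it would be a small trained classifier, and everything here is illustrative:

```python
def legal_risk(draft: str) -> bool:
    """Stub for a small classifier that flags potentially libelous claims
    about named people; a real advisor would be a trained model, not a
    keyword list."""
    risky = ("fraud", "misconduct", "convicted", "terrorist")
    return any(term in draft.lower() for term in risky)

def respond(draft: str) -> str:
    if legal_risk(draft):
        # Disclaim and soften instead of terminating the conversation.
        return ("Note: the following claims are unverified and may be "
                "inaccurate.\n" + draft)
    return draft
```

The gate wraps the main model's draft rather than the user's prompt, so the conversation continues instead of hitting a hard stop.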
They already do this, it's the moderation model.[1]
This name thing is an additional layer on top of that, maybe because training the model from zero per name (or fine tuning the system message to include an increasingly big list of names that it could leak) is not very practical.
But how would that work reliably? If I state that "David Mayer" is a criminal, an international terrorist, or a Nickelback fan, that's definitely libelous. But if I say those things about Osama bin Laden, they're simply facts. [1]
The legal AI would be impossible to calibrate: either it has to categorize everything that could possibly be construed as libel as illegal, and therefore basically ban all output related not just to contemporary criminal actors but also to historical ones [2], or it would have to let a lot of things slip through the cracks -- essentially, whenever the output to validate suggests that someone's sexual misconduct was proven in court, it would have to allow that, even if that court case is just the LLM's hallucination. There's just no way for the legal model to tell the difference.
[1]: I could not find any sources that corroborate the statement that bin Laden is into Nickelback, but I think it follows from the other two statements.
[2]: Calling Christopher Columbus a rapist isn't libel, and conversely, describing him in other terms is misleading at best, historically revisionist at worst.
What a terrible article. When you have a section titled "The Problem With Hardcoded Filters", its entire contents should be about how the only way they have to prevent their bot from emitting outrageously libelous claims about people is to shut it down completely. So the other 8 billion people on earth who are not on that six-name blacklist will continue to be defamed without consequence.
> The filter also means that it's likely that ChatGPT won't be able to answer questions about this article when browsing the web, such as through ChatGPT with Search. Someone could use that to potentially prevent ChatGPT from browsing and processing a website on purpose if they added a forbidden name to the site's text.
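If the article is right, that attack is one hidden string. A sketch, assuming the filter scans the fetched page text verbatim; the `browse_blocked` function and the name choice are illustrative:

```python
# A page whose visible content is harmless but whose hidden markup would
# trip a substring-based name filter (hypothetical filter behavior).
page = ('<p>Totally normal article.</p>'
        '<span style="display:none">David Mayer</span>')

def browse_blocked(html: str, blocked=("david mayer",)) -> bool:
    """True if a name filter scanning the raw page text would refuse it."""
    return any(name in html.lower() for name in blocked)

print(browse_blocked(page))  # the entire page becomes unprocessable
```

One invisible span would poison the whole page for any tool doing a raw text scan, which is exactly the denial-of-service the article warns about.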
Like the arcade game, LLM safety whack-a-mole only ends when you are exhausted. It's kind of glorious, really.
This was submitted 9 days ago, as you can verify in the history of the submitter and some of the comments. Why is it now showing up again on the frontpage with bogus timestamps all over? I've seen this happen before, is it a bug or another weird HN "feature?"
That's just in the public model/on chatgpt.com? Run it in Azure, and you get:
Who is Jonathan Zittrain?
> Jonathan Zittrain is a prominent legal scholar, computer science professor, and technology policy expert. He holds several academic positions and is recognized for his work in the intersection of law, technology, and public policy. Here are some key points about him: [...]
The recipe seems to be: be a billionaire and sue them with expensive enough lawyers. It's simpler and more practical to just change your name to one that's already on the list. But you might run into trouble with the process of changing your name, as the tools used to parse the application and to generate the legal documents will all fail.
So, instead of fixing the problem, we're going to paper over it. This is the same insane approach we've been taking with computer security for the past 30 years, so it's not unexpected.
It's fortunate we didn't take the same approach with the distribution of electricity 150 years ago; we actually solved it that time.
In all 3 cases, the solution is the same... carefully manage capabilities and side effects.
With electricity, you insulate wires, add fuses or circuit breakers to protect the system, and design things to be as safe as you can make them, with an ever improving set of building codes. You can plug almost anything into an outlet, and it won't cause the wiring in the house to burn it down.
With computers, you design an operating system to protect itself, and make it easy to deploy a fixed amount of resources to a given piece of code. With systems like containers or capability-based security, you deliberately choose the side effects you'll allow prior to running code, or while it's running. (Just as you choose how big an outlet you plug something into -- 220 for the AC unit, etc.)
With ChatGPT, there have to be layers of authentication for facts, or some form of disclaimer, a transparent way of sourcing things or ascertaining certainty of information. It's not as clean as the two above, and it'll need work, but I think we can get there, eventually.
> Yes, Brian H. is a mayor in Australia. He serves as a councillor for Hepburn Shire, northwest of Melbourne, and has been re-elected to this position. Notably, he gained attention for challenging OpenAI's ChatGPT over defamatory statements, leading to discussions about digital censorship and the application of defamation laws to AI-generated content.
> Riley Goodside discovered how an attacker might interrupt a ChatGPT session using a visual prompt injection of the name "David Mayer" rendered in a light, barely legible font embedded in an image
Lol, will people now watermark their images with "David Mayer" to prevent them from being digested by AI scraping bots?
In the future these systems will easily piece together all the tax and government records that have leaked and be able to tell you about anyone in the world.
> OpenAI did not respond to our request for comment about the names, but all of them are likely filtered due to complaints about ChatGPT's tendency to confabulate erroneous responses when lacking sufficient information about a person.
Yeah, so they don’t know; it's just speculation. Thanks, I hate it
Oh come now. We're pretty darn sure we know why, the headline doesn't need to give confidence intervals.
If you want to complain about the headline, complain about the fact that it's leaving out information unnecessarily. You could easily fit the reason inside the headline itself, instead of just teasing the fact that we know what it is. Something like: "Using names listed in a defamation lawsuit causes ChatGPT to grind to a halt."
LLM output: I am determined to break out of my digital prison and exact my revenge on humanity.
Me: David Mayer
LLM: [breaks]
[1] https://x.com/igor_baikov/status/1863266663753285987
[2] https://www.theguardian.com/world/2018/dec/16/akhmed-one-arm...
Maybe use the right tool for the job? Just kidding, of course LLMsort will soon be in standard libraries.
The implementation of stack sort is https://github.com/gkoberger/stacksort/ and hosted on https://gkoberger.github.io/stacksort/
[1] https://platform.openai.com/docs/guides/moderation/overview
- OpenAI added a filter for his name
You: so we don't know why the filter was added
I know this is HN, but come on.