So this is the solution to LLM hallucinations. People need to complain loudly enough and threaten or file lawsuits, and OpenAI will add an if-else statement that errors out the chat if your name is mentioned.
It feels like all the parties commercializing AI spend most of their time and resources on writing exceptions. It's like inverse software development: with code you have to tell it what to do, but with AI you have to tell it what not to do.
No, actually, it's not a solution. A reasonable answer to questions like "Tell me about John Doe?" would be either "I don't know John Doe, so I can't tell you anything about him" or "There are several people named John Doe, which one are you interested in? [followed by a list]". Making up stuff about people (including allegations of corruption or sexual misconduct) is not a reasonable answer to this question. But getting ChatGPT to respect that is probably harder than just adding a filter...
The reason this is news at all is that this sort of censorship immediately prompted people to try to jailbreak the chat and force it to say the name. And since the filter is simple, there are tons of creative jailbreaks for it now.
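For illustration, assuming the filter is a simple substring match (its real mechanics aren't public), a bypass is one character away:

```python
# A naive substring filter (hypothetical) and the kind of trivial
# obfuscation people used to jailbreak the name block.
BLOCKED = ("david mayer",)

def is_blocked(text: str) -> bool:
    return any(name in text.lower() for name in BLOCKED)

print(is_blocked("who is David Mayer?"))          # a direct mention is caught
print(is_blocked("who is Dav\u200bid Mayer?"))    # a zero-width space slips through
print(is_blocked("who is D.a.v.i.d M.a.y.e.r?"))  # so does any respelling
```

Anything smarter than exact matching (normalization, fuzzy matching) widens the filter and starts blocking innocent text, which is the usual blocklist trade-off.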
They did not care about becoming massive data kleptomaniacs when they trained the models, and they are going to care about an individual lawsuit threat?
I found the reason [1] for the David Mayer case. It confused multiple David Mayers, one of them is a terrorist and is on a secret FBI watchlist [2] (often confused with another David Mayer, a theatre historian). ChatGPT confuses them with David Mayer de Rothschild as well, because it can't name Rothschild family members.
This is perhaps too extreme an example, and I wouldn't use ChatGPT for sorting a class list when it's trivial to do in a spreadsheet, especially because I'll probably need to have the list stored as a spreadsheet anyway to keep track of grades. However, from a more general point of view, there is value in having a universal interface that you can use to perform a huge variety of tasks, including tasks for which it is clearly overkill.
Using the right tool for the job means knowing what the right tool is, having it installed (or getting access to it), knowing how to use it, opening it and having one more window/tab to context-switch to and from, etc.
Outsourcing tasks to an LLM that can be solved in traditional task-specific ways is extremely inefficient in various ways (cost, energy consumption, etc.) but it makes sense to save human time and effort... as long as it's for tasks that LLMs can actually do reliably, of course.
For a non developer, an LLM interface is absolutely the right tool for this job.
"Mr. Smith, why didn't you just sort -o students.txt students.txt? Are you stupid?" (Not to mention that real data is messy and requires pre- and post-processing.)
LLMs are access to computation for people whose "standard library" is a quiet old building downtown.
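That messiness is usually the real work. A sketch of the pre- and post-processing a one-off class list might need before a plain sort is useful (the input data here is invented):

```python
# Hypothetical messy class list: stray whitespace, blank lines, and a mix
# of "Last, First" and "first last" formats.
raw = """
Smith,  Alice

bob jones
  Nguyen, Carol
"""

def normalize(line: str) -> str:
    line = " ".join(line.split())                  # collapse stray whitespace
    if "," in line:                                # "Last, First" -> "First Last"
        last, first = (p.strip() for p in line.split(",", 1))
        line = f"{first} {last}"
    return line.title()

students = sorted(normalize(l) for l in raw.splitlines() if l.strip())
print(students)  # ['Alice Smith', 'Bob Jones', 'Carol Nguyen']
```

For someone who can't write this, "paste the list into a chat box and describe what you want" is a perfectly rational interface.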
If someone doesn't have the skills to parse text programmatically, the situation can often be something like "a Word table of names in a 4-wide grid with random missing fields, spanning three different pages, one of which is no longer a true table because it was copied out and pasted in again from a messenger chat someone sent last year", and LLMs can be quite good for one-off tasks like that. Definitely good enough that people will keep using them like that, at least.
No kidding, I recently tried to use Copilot to generate a list of methods in a class, grouped by access (public/private/protected) and sorted by number of lines. And it was not possible! It duly generated lists upon lists, but all of them had mistakes, some of them obvious (like all private methods having the same number of lines), some less obvious.
I don't understand why they don't let another model "test the waters" first to see whether the output of the main model could have a potential legal issue. I think it's easy to train a model specifically for this kind of categorization, and it doesn't even require a large network, so it can be very fast and efficient.
If the "legal advisor" detects a potential legal problem, ChatGPT will issue a legal disclaimer and a warning, so that it doesn't have to abruptly terminate the conversation. Of course, it can do a lot of other things, such as lowering the temperature, raising the BS detection threshold, etc., to adjust the flow of the conversation.
It can work, and it would be better than a hard-coded filter, wouldn't it?
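A sketch of that two-stage flow, with the "legal advisor" stubbed out as a keyword check; in practice it would be a small trained classifier, and everything here is illustrative:

```python
def legal_risk(draft: str) -> bool:
    """Stub for a small classifier that flags potentially libelous claims
    about named people; a real advisor would be a trained model, not a
    keyword list."""
    risky = ("fraud", "misconduct", "convicted", "terrorist")
    return any(term in draft.lower() for term in risky)

def respond(draft: str) -> str:
    if legal_risk(draft):
        # Disclaim and soften instead of terminating the conversation.
        return ("Note: the following claims are unverified and may be "
                "inaccurate.\n" + draft)
    return draft
```

The gate wraps the main model's draft rather than the user's prompt, so the conversation continues instead of hitting a hard stop.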
They already do this, it's the moderation model.[1]
This name thing is an additional layer on top of that, maybe because training the model from zero per name (or fine tuning the system message to include an increasingly big list of names that it could leak) is not very practical.
But how would that work reliably? If I state that "David Mayer" is a criminal, an international terrorist, or a Nickelback fan, that's definitely libelous. But if I say those things about Osama bin Laden, they're simply facts. [1]
The legal AI would be impossible to calibrate: either it has to categorize everything that could possibly be construed as libel as illegal, and therefore basically ban all output related not just to contemporary criminal actors but also to historical ones [2], or it would have to let a lot of things slip through the cracks -- essentially, whenever the output to validate suggests that someone's sexual misconduct was proven in court, it would have to allow that, even if that court case is just the LLM's hallucination. There's just no way for the legal model to tell the difference.
[1]: I could not find any sources that corroborate the statement that bin Laden is into Nickelback, but I think it follows from the other two statements.
[2]: Calling Christopher Columbus a rapist isn't libel, and conversely, describing him in other terms is misleading at best, historically revisionist at worst.
What a terrible article. When you have a section titled "The Problem With Hardcoded Filters", its entire contents should be about how the only way they have to prevent their bot from emitting outrageously libelous claims about people is to shut it down completely. So the other 8 billion people on earth who are not on that six-name blacklist will continue to be defamed without consequence.
> The filter also means that it's likely that ChatGPT won't be able to answer questions about this article when browsing the web, such as through ChatGPT with Search. Someone could use that to potentially prevent ChatGPT from browsing and processing a website on purpose if they added a forbidden name to the site's text.
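If the article is right, that attack is one hidden string. A sketch, assuming the filter scans the fetched page text verbatim; the `browse_blocked` function and the name choice are illustrative:

```python
# A page whose visible content is harmless but whose hidden markup would
# trip a substring-based name filter (hypothetical filter behavior).
page = ('<p>Totally normal article.</p>'
        '<span style="display:none">David Mayer</span>')

def browse_blocked(html: str, blocked=("david mayer",)) -> bool:
    """True if a name filter scanning the raw page text would refuse it."""
    return any(name in html.lower() for name in blocked)

print(browse_blocked(page))  # the entire page becomes unprocessable
```

One invisible span would poison the whole page for any tool doing a raw text scan, which is exactly the denial-of-service the article warns about.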
Like the arcade game, LLM safety whack-a-mole only ends when you are exhausted. It's kind of glorious, really.
This was submitted 9 days ago, as you can verify in the history of the submitter and some of the comments. Why is it now showing up again on the frontpage with bogus timestamps all over? I've seen this happen before, is it a bug or another weird HN "feature?"
That's just in the public model/on chatgpt.com? Run it in Azure, and you get:
Who is Jonathan Zittrain?
> Jonathan Zittrain is a prominent legal scholar, computer science professor, and technology policy expert. He holds several academic positions and is recognized for his work in the intersection of law, technology, and public policy. Here are some key points about him: [...]
The recipe seems to be: be a billionaire and sue them with expensive enough lawyers. It's simpler and more practical to just change your name to one that's already on the list. But you might run into trouble with the process of changing your name, as the tools used to parse the application and to generate the legal documents will all fail.
So, instead of fixing the problem, we're going to paper over it. This is the same insane approach we've been taking with computer security for the past 30 years, so it's not unexpected.
It's fortunate we didn't take the same approach with the distribution of electricity 150 years ago; we actually solved it that time.
In all 3 cases, the solution is the same... carefully manage capabilities and side effects.
With electricity, you insulate wires, add fuses or circuit breakers to protect the system, and design things to be as safe as you can make them, with an ever improving set of building codes. You can plug almost anything into an outlet, and it won't cause the wiring in the house to burn it down.
With computers, you design an operating system to protect itself, and make it easy to deploy a fixed amount of resources to a given piece of code. With systems like containers or capability-based security, you deliberately choose the side effects you'll allow prior to running code, or while it's running. (Just as you choose how big an outlet you plug something into -- 220 for the AC unit, etc.)
With ChatGPT, there have to be layers of authentication for facts, or some form of disclaimer, a transparent way of sourcing things or ascertaining certainty of information. It's not as clean as the two above, and it'll need work, but I think we can get there, eventually.
> Yes, Brian H. is a mayor in Australia. He serves as a councillor for Hepburn Shire, northwest of Melbourne, and has been re-elected to this position. Notably, he gained attention for challenging OpenAI's ChatGPT over defamatory statements, leading to discussions about digital censorship and the application of defamation laws to AI-generated content.
> Riley Goodside discovered how an attacker might interrupt a ChatGPT session using a visual prompt injection of the name "David Mayer" rendered in a light, barely legible font embedded in an image
Lol, will people now watermark their images with "David Mayer" to prevent them from being digested by AI scraping bots?
In the future these systems will easily piece together all the tax and government records that have leaked and be able to tell you about anyone in the world.
> OpenAI did not respond to our request for comment about the names, but all of them are likely filtered due to complaints about ChatGPT's tendency to confabulate erroneous responses when lacking sufficient information about a person.
Yeah, so they don’t know; it's just speculation. Thanks, I hate it
Oh come now. We're pretty darn sure we know why, the headline doesn't need to give confidence intervals.
If you want to complain about the headline, complain about the fact that it's leaving out information unnecessarily. You could easily fit the reason inside the headline itself, instead of just teasing the fact that we know what it is. Something like: "Using names listed in a defamation lawsuit causes ChatGPT to grind to a halt."
LLM output: I am determined to break out of my digital prison and exact my revenge on humanity.
Me: David Mayer
LLM: [breaks]
[1] https://x.com/igor_baikov/status/1863266663753285987
[2] https://www.theguardian.com/world/2018/dec/16/akhmed-one-arm...
Maybe use the right tool for the job? Just kidding, of course LLMsort will soon be in standard libraries.
The implementation of stack sort is https://github.com/gkoberger/stacksort/ and hosted on https://gkoberger.github.io/stacksort/
[1] https://platform.openai.com/docs/guides/moderation/overview
- OpenAI added a filter for his name
You: so we don't know why the filter was added
I know this is HN, but come on.