Hi all, I work on security at OpenAI. We have looked into this report and the model response does not contain outputs from any other users nor does it reflect a security vulnerability, compromise, or exploit.
The original report was that submitting an audio message close to (but not quite) 1500 seconds long to the audio transcription API would result in weird, unrelated, off-topic responses that look like they might be replies to someone else's query. This is not what's happening. Our API has a bug where, if the tokenization of the audio (which is not strictly correlated with the audio length) exceeds a limit, the entire input is truncated and the model effectively receives a blank query. We're working with our API team to get this fixed and to produce more useful error messages.
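To make that failure mode concrete, here is a hypothetical sketch of the difference between silently dropping an over-limit input and rejecting it with a useful error. This is not OpenAI's actual code; the limit value and function names are made up for illustration.

```python
# Hypothetical illustration of the truncation bug described above (not OpenAI's code).
# MAX_AUDIO_TOKENS and the function names are assumptions, for illustration only.
MAX_AUDIO_TOKENS = 4096  # assumed limit


def build_model_input(audio_tokens: list[int]) -> list[int]:
    """Buggy behavior: an over-limit input is dropped entirely,
    so the model ends up seeing a blank query."""
    if len(audio_tokens) > MAX_AUDIO_TOKENS:
        return []
    return audio_tokens


def build_model_input_fixed(audio_tokens: list[int]) -> list[int]:
    """More useful behavior: reject the request with a clear error."""
    if len(audio_tokens) > MAX_AUDIO_TOKENS:
        raise ValueError(
            f"Audio tokenizes to {len(audio_tokens)} tokens, "
            f"which exceeds the {MAX_AUDIO_TOKENS}-token limit."
        )
    return audio_tokens
```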
When the model receives an empty query, it generates a response by selecting one random token, then another (which is influenced by the first token), and another, and so on until it has completed a reply. It might seem odd that the responses are coherent, but this is a feature of how all LLMs work: each preceding token influences the probability distribution for the next token, so the model generates a response containing words, phrases, code, etc. in a way that appears humanlike but is in fact solely a creation of the model. It's just that in this case, the output started in a random (but likely) place and the responses were generated without any input. Our text models display the same behavior if you send an empty query, or you can try it yourself by directly sampling an open-source model without any inputs.
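You can reproduce this yourself with any open-source model. A minimal sketch using Hugging Face `transformers` and GPT-2 (my choice of model and sampling settings, purely for illustration) would look like this:

```python
# Sample a model starting from an effectively empty prompt: no user input at all.
# GPT-2 and the sampling parameters are illustrative choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The "prompt" is just the beginning-of-sequence token.
input_ids = torch.tensor([[tokenizer.bos_token_id]])

output = model.generate(
    input_ids,
    do_sample=True,      # pick each token stochastically
    top_p=0.95,          # from the high-probability part of the distribution
    temperature=1.0,
    max_new_tokens=60,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The first token is drawn from the model's unconditional distribution, and every later token is conditioned on what came before, which is why the result reads like a coherent reply even though nothing was asked.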
We took a while to respond to this. Our goal is to provide a reasonable response to reports. If you have found a security vulnerability, we encourage you to report it via our bug bounty program: https://bugcrowd.com/engagements/openai.
> If you have found a security vulnerability, we encourage you to report it via our bug bounty program
It seems like reporting bugs/issues via that program forces you to sign a permanent NDA preventing disclosure even after the reported issue has been fixed. I'm guessing the author of this disclosure isn't the only one who avoided it because of the NDA. Is that something you can reconsider? Otherwise you'll probably continue to see people disclosing these things publicly, and as an OpenAI user that sounds like a troublesome approach.
Thank you. I made an update to the original post with your explanation, and since you stated the outputs were pure hallucinations, I also attached one of them.
Reported a flaw to OpenAI that lets users peek at others' chat responses. Got an auto-reply on May 29th, radio silence since. Issue remains unpatched :(
Avoided their bug bounty due to permanent NDAs preventing disclosure even after fixes. Following the standard 45-day disclosure window; users should avoid sharing sensitive data until this is resolved.
> The leaked responses show clear signs of being real conversations: they start with contextually appropriate replies, sometimes reference the original user question, appear in various languages, and maintain coherent conversational flow. This pattern is inconsistent with random model hallucinations but matches exactly what you'd expect from misdirected user sessions.
A model like GPT-4o can hallucinate responses that are indistinguishable from real user interactions. This is easy to confirm for yourself: just ask it to make one up.
I’m certainly willing to believe OpenAI leaks real user messages, but this is not proof of that claim.
In one of the responses, it provided a financial analysis of a little-known company with a non-Latin name, located in a small country. I found this company; it is real, and the numbers in the response are real. When I asked my ChatGPT to provide a financial report for this company without using web tools, it responded: `Unfortunately, I don't have specific financial statements for "xxx" for 2021 and 2022 in my training data, and since you've asked not to use web search, I can't pull them live.`
GPT-4o's writing style is so specific that I find it hard to believe it could fake a user query.
You can spot anyone using AI writing a mile away. It stopped saying "delve" but started saying stuff like "It's not X–it's Y" and "check out the vibes (string of wacky emoji)" constantly.
> I am issuing this limited, non‑technical disclosure:
> No exploit code, proof‑of‑concept, or reproduction steps are included here.
Then why bother? I feel a bit cynical here, but if the goal is to get this fixed, they're not going to care unless it becomes a zero-day and is given to the masses; otherwise it's going to quietly stay exploitable by the few unsavory groups who know of it and will never be patched. Isn't the whole point of responsible disclosure to give them a time clock to get this situated before actual publication? Forgive me if I'm wrong, I haven't been in that field in a long time.
It adds some pressure: we now know what the bug is about, so we can guess which endpoints to poke at, and then it's only a matter of time before it leaks. It would be unethical for the researcher to just publish it.
Reminds me of a time I found a serious issue with Mailgun. Messaged them, no reply. Had to spam their Twitter to get a response. Basically, you could have stolen tons of API keys from users without their knowledge, and Mailgun never disclosed it.
I could actually have gone to their office in person if I wanted to be pedantic, but it seemed like a pretty weird office space lol.
I don't think disclosure of reported security issues is really a norm, unless the firm finds evidence the bug was exploited (by someone other than the reporter). It's a good thing to do, but I think the majority of stuff that gets reported everywhere is never disclosed, with the major and obvious exception of consumer or commercial software that needs to be updated "on prem".
> A single misconfiguration can leak thousands of sensitive conversations in seconds. Treating privacy as an afterthought is untenable when the blast radius is this large.
Massive security bug, well spotted. It's like Bank of America showing other people my transactions, or Meta leaking my WhatsApp messages.
This raises some serious questions about security.
I believe it is extremely important to disclose that the 'response leaks' you obtained did not originate from the LLM models themselves, but rather from other insecure systems, i.e. in a more conventional manner.
Just to avoid yet another case of hallucinated outputs getting misinterpreted.
https://jarbon.medium.com/gpt-prompt-bug-94322a96c574
A lot of AI products straight up have plain-text logs available for everyone at the company to view.
Software quality is... minimal nowadays.
For real? At least it doesn't match the one on https://keybase.io/requilence
I certainly wouldn't sign an indefinite NDA for a chance to win:
Average payout: $836.36
OpenAI should be grateful; after all, they want all information to be free.