Hi all, I work on security at OpenAI. We have looked into this report and the model response does not contain outputs from any other users nor does it reflect a security vulnerability, compromise, or exploit.

The original report was that submitting audio close to (but not quite) 1500 seconds long to the audio transcription API would result in weird, unrelated, off-topic responses that look like they might be replies to someone else’s query. This is not what’s happening. Our API has a bug where if the tokenization of the audio (which is not strictly correlated with the audio length) exceeds a limit, the entire input is truncated, and the model effectively receives a blank query. We’re working with our API team to get this fixed and to produce more useful error messages.
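To make the failure mode above concrete, here is a minimal Python sketch. It is not OpenAI's code: the limit value, the function names, and the `InputTooLongError` class are all hypothetical, chosen only to contrast "silently drop the whole input" with "reject the request with a clear error".

```python
MAX_TOKENS = 8  # hypothetical limit, for illustration only


class InputTooLongError(ValueError):
    """Hypothetical error type for over-limit inputs."""


def prepare_buggy(tokens, limit=MAX_TOKENS):
    # The behavior described in the comment: an over-limit input is
    # dropped entirely, so the model ends up seeing an empty query.
    return tokens if len(tokens) <= limit else []


def prepare_strict(tokens, limit=MAX_TOKENS):
    # One possible fix: refuse the request and say why.
    if len(tokens) > limit:
        raise InputTooLongError(
            f"input has {len(tokens)} tokens; limit is {limit}"
        )
    return tokens
```

With the first function, a caller cannot tell an over-limit request apart from an empty one; with the second, the caller gets an error instead of an unrelated-looking reply.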

When the model receives an empty query, it generates a response by selecting one random token, then another (which is influenced by the first token), and another, and so on until it has completed a reply. It might seem odd that the responses are coherent, but this is a feature of how all LLMs work: each token that comes before influences the probability for the next token, and so the model generates a response containing words, phrases, code, etc. in a way that appears humanlike but in fact is solely a creation of the model. It’s just that in this case, the output started in a random (but likely) place and the responses were generated without any input. Our text models display the same behavior if you send an empty query, or you can try it yourself by directly sampling an open source model without any inputs.
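The token-by-token process described above can be sketched with a toy model. This is only an illustration under stated assumptions: the bigram table, its probabilities, and the `sample_reply` function are made up, and a real LLM conditions on all previous tokens rather than just the last one.

```python
import random

# Hypothetical toy bigram table: token -> list of (next_token, probability).
# "<s>" stands for the empty prompt; "</s>" ends the reply.
BIGRAMS = {
    "<s>": [("The", 0.5), ("Here", 0.3), ("def", 0.2)],
    "The": [("answer", 0.6), ("code", 0.4)],
    "Here": [("is", 1.0)],
    "def": [("main", 1.0)],
    "answer": [("is", 1.0)],
    "code": [("is", 1.0)],
    "is": [("simple", 0.5), ("below", 0.5)],
    "main": [("</s>", 1.0)],
    "simple": [("</s>", 1.0)],
    "below": [("</s>", 1.0)],
}


def sample_reply(rng, max_tokens=20):
    """Sample a reply starting from no input at all."""
    token, reply = "<s>", []
    for _ in range(max_tokens):
        candidates, weights = zip(*BIGRAMS[token])
        # Each choice depends on the token chosen just before it.
        token = rng.choices(candidates, weights=weights, k=1)[0]
        if token == "</s>":
            break
        reply.append(token)
    return reply


print(" ".join(sample_reply(random.Random())))
```

Every run starts from a likely first token and continues along likely transitions, so the output reads as a coherent fragment even though no query was supplied.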

We took a while to respond to this. Our goal is to provide a reasonable response to reports. If you have found a security vulnerability, we encourage you to report it via our bug bounty program: https://bugcrowd.com/engagements/openai.

diggan|7 months ago

> If you have found a security vulnerability, we encourage you to report it via our bug bounty program

It seems like reporting bugs/issues via that program forces you to sign a permanent NDA preventing disclosure even after the reported issue has been fixed. I'm guessing the author of this disclosure isn't the only one who avoided it because of the NDA. Is that potentially something you can reconsider? Otherwise you'll probably continue to see people disclosing these things publicly, and as an OpenAI user that sounds like a troublesome approach.

ragona|7 months ago

(Note: I also work for OpenAI Security, though I’ve not worked on our bounty program for some time. These are just my thoughts and experiences.)

I believe the author was referring to the standard BugCrowd terms, which as far as I know are themselves fairly common across the various platforms. In my experience we are happy for researchers to publish their work within the normal guidelines you’d expect from a bounty program — it’s something I’ve worked with researchers on without incident.

unknown|7 months ago

[deleted]

requilence|7 months ago

Thank you, I made an update to the original post with your explanation, and because you stated that the output was a pure hallucination, I also attached one of them.

unknown|7 months ago

[deleted]