top | item 46324863

samwho | 2 months ago

I was wondering about this when I was reading around the topic. I can’t personally think of a reason you would need to segregate, though it wouldn’t surprise me if they do for some sort of compliance reasons. I’m not sure though, would love to hear something first-party.

weird-eye-issue|2 months ago

They absolutely are segregated

With OpenAI at least you can specify the cache key and they even have this in the docs:

Use the prompt_cache_key parameter consistently across requests that share common prefixes. Select a granularity that keeps each unique prefix-prompt_cache_key combination below 15 requests per minute to avoid cache overflow.
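In practice that just means sending the same `prompt_cache_key` alongside requests that share a long common prefix. A minimal sketch of what such a request body might look like, following the docs quoted above (the model name, tenant scheme, and helper function are my own illustrative choices, not anything OpenAI prescribes):

```python
# Hypothetical sketch: scoping prompt-cache hits per tenant with
# OpenAI's prompt_cache_key parameter (quoted from their docs above).

# Long, identical system prompt = the shared prefix the cache keys on.
LONG_SYSTEM_PROMPT = "You are a support agent for Acme Corp. " * 50

def build_request(tenant_id: str, user_message: str) -> dict:
    """Build a Chat Completions-style request body (illustrative only)."""
    return {
        "model": "gpt-4o-mini",
        "messages": [
            # Identical prefix across requests -> eligible for caching.
            {"role": "system", "content": LONG_SYSTEM_PROMPT},
            # Varying suffix per request.
            {"role": "user", "content": user_message},
        ],
        # Same key for all requests sharing this prefix; per the docs,
        # keep each unique prefix+key combination under ~15 req/min.
        "prompt_cache_key": f"support-bot-{tenant_id}",
    }

req = build_request("acme", "Where is my order?")
```

The point of the key is routing: requests with the same prefix and the same key land on the same cache, so hits stay within one tenant's traffic.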

ambicapter|2 months ago

> Select a granularity that keeps each unique prefix-prompt_cache_key combination below 15 requests per minute to avoid cache overflow.

Why below a certain number? Usually in caches a high number of requests keeps the cached bit from expiring or being replaced, no?

psadri|2 months ago

Does anyone actually compute / use this key feature? Or do you rely on implicit caching? I wish HN had a comment with a poll feature.

samwho|2 months ago

The only thing that comes to mind is some kind of timing attack. Send loads of requests specific to a company you’re trying to spy on and if it comes back cached you know someone has sent that prompt recently. Expensive attack, though, with a large search space.

gwern|2 months ago

No, the search space is tiny: you can just attack 1 BPE at a time! Stuff like password guessing is almost trivial when you get to do a timing attack on each successive character. So that lets you quickly exfiltrate arbitrary numbers of prompts, especially if you have any idea what you are looking for. (Note that a lot of prompts are already public information, or you can already exfiltrate prompts quite easily from services and start attacking from there...)
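To see why attacking one token at a time collapses the search space, here is a toy simulation of the idea (characters stand in for BPE tokens, and a cache hit is observed directly where a real attacker would infer it from latency; the cache model is hypothetical, not any provider's actual implementation):

```python
import string

# Toy model of a shared prefix cache: every prefix of the victim's
# prompt counts as "cached" (fast), everything else as a miss (slow).
SECRET_PROMPT = "pw=s3cret"  # victim's prompt, already in the cache

class SharedPrefixCache:
    def __init__(self, seed_prompt: str):
        self.prefixes = {seed_prompt[:i] for i in range(1, len(seed_prompt) + 1)}

    def lookup(self, prompt: str) -> bool:
        """True = cache hit (would look like a fast response)."""
        return prompt in self.prefixes

def exfiltrate(cache: SharedPrefixCache, alphabet: str, max_len: int = 32) -> str:
    """Recover the secret one character per round by probing for hits."""
    recovered = ""
    for _ in range(max_len):
        for ch in alphabet:  # cost per round: |alphabet|, not |alphabet|**len
            if cache.lookup(recovered + ch):
                recovered += ch
                break
        else:
            break  # no extension hits: the full secret is recovered
    return recovered

cache = SharedPrefixCache(SECRET_PROMPT)
alphabet = string.ascii_lowercase + string.digits + "=_"
print(exfiltrate(cache, alphabet))  # prints "pw=s3cret"
```

Total probes are roughly length × alphabet size instead of alphabet size ^ length, which is why guessing character-by-character (or token-by-token) is nearly trivial compared to guessing the whole prompt at once.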

gunalx|2 months ago

I have come across claims that turning on caching gives the LLM a faint memory of what was in the cache, even on unrelated queries. If that is the case, it would be entirely unreasonable to share the cache, because of the possibility of information leakage.

dustfinger|2 months ago

I wonder if there is valuable information to be learned by studying a company's prompts? There may be reasons why some companies want their prompts kept private.

dustfinger|2 months ago

I realize cache segregation is mainly about security/compliance and tenant isolation, not protecting secret prompts. Still, if someone obtained access to a company’s prompt templates/system prompts, analyzing them could reveal:

- Product logic / decision rules, such as: when to refund, how to triage tickets

- Internal taxonomies, schemas, or tool interfaces

- Safety and policy guardrails (which adversaries could try to route around)

- Brand voice, strategy, or proprietary workflows

That is just off the top of my head.