top | item 43338388

(no title)

pheeney | 11 months ago

I'd be curious too. It sounds like standard RAG, just in the opposite direction than usual. Summary > Facts > Vector DB > Facts + Source Documents to LLM which gets scored to confirm the facts. The source documents would need to be natural language though to work well with vector search right? Not sure how they would handle that part to ensure something like "Patient X was diagnosed with X in 2001" existed for the vector search to confirm it without using LLMs which could hallucinate at that step.

discuss

order

social_quotient|11 months ago

I think you’re spot on!

We’re using a similar trick in our system to keep sensitive info from leaking… specifically, to stop our system prompt from leaking. We take the LLM’s output and run it through a RAG search, similarity search it against our actual system prompt/embedding of it. If the similarity score spikes too high, we toss the response out.

It’s a twist on the reverse RAG idea from the article and maybe directionally what they are doing.

jcuenod|11 months ago

If you're trying to prevent your prompt from leaking, why don't you just use string matching?

soulofmischief|11 months ago

Are you able to still support streaming with this technique? Have you compared this technique with a standard two-pass LLM strategy where the second pass is instructed to flag anything related to its context?

qudat|11 months ago

So if they are using a pretrained model and the second llm scores all responses below the ok threshold what happens?