top | item 45362878

ianbicking | 5 months ago

This looks like RAG...? That's fine, RAG is a very broad approach and there's lots to be done with it. But it's not distinct from RAG.

Searching by embedding is just a way to construct queries, like ILIKE or tsvector. It works pretty nicely, but it's not distinct from SQL given pg_vector/etc.
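To make the point concrete, here is a toy sketch (mine, not from the thread or from pg_vector itself) of embedding search as just another way to rank rows, conceptually like `ORDER BY embedding <-> query LIMIT k` over a vector column:

```python
import math

def cosine(a, b):
    # cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Pretend these rows have embeddings stored alongside their text,
# as a pgvector column would.
rows = [
    ("postgres full-text search", [0.9, 0.1, 0.0]),
    ("vector similarity search", [0.1, 0.9, 0.1]),
    ("backup and restore", [0.0, 0.1, 0.9]),
]

def search(query_embedding, k=2):
    # the vector-search equivalent of a WHERE/ORDER BY clause:
    # rank rows by similarity to the query embedding, keep the top k
    return sorted(rows, key=lambda r: -cosine(query_embedding, r[1]))[:k]

top = search([0.2, 0.95, 0.05], k=1)
print(top[0][0])  # → vector similarity search
```

The database does the same thing with an index instead of a full sort, but the query-construction role is identical to ILIKE or tsvector.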

The more distinctive feature here seems to be some kind of proxy (or monkeypatching?) – is it rewriting prompts on the way out to add memories to the prompt, and creating memories from the incoming responses? That's clever (but I'd never want to deploy that).
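The proxy idea as I read it, sketched with a stubbed model call (`call_llm` is a stand-in, not any real API, and none of this is the project's actual code):

```python
memories = []

def call_llm(prompt):
    # stub: a real proxy would forward this to the actual model endpoint
    return f"echo: {prompt[:40]}"

def proxied_completion(user_prompt):
    # rewrite the prompt on the way out: prepend stored memories
    if memories:
        context = "\n".join(f"- {m}" for m in memories)
        full_prompt = f"Known facts:\n{context}\n\nUser: {user_prompt}"
    else:
        full_prompt = user_prompt
    response = call_llm(full_prompt)
    # create a memory from the exchange on the way back in
    memories.append(f"user said: {user_prompt}")
    return response

proxied_completion("my name is Ada")
out = proxied_completion("what's my name?")
```

The second call sees the memory harvested from the first, which is the whole trick; it is also why deploying it is scary: every prompt and response silently passes through the rewriter.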

From another comment it seems like you are doing an LLM-driven query phase. That's a valid approach in RAG. Maybe these all work together well, but SQL seems like an aside. And it's already how lots of normal RAG or memory systems are built; it doesn't seem particularly unique...?
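An LLM-driven query phase, as I understand the comment, means asking the model to produce the retrieval query before searching. A minimal sketch with a stubbed model (`llm_write_query` stands in for a real model call that might emit SQL or keywords; the document store is a toy dict):

```python
def llm_write_query(question):
    # stub for the model call that turns a question into search terms
    return question.lower().replace("?", "").split()

documents = {
    1: "ada lovelace wrote the first program",
    2: "postgres supports vector columns via pgvector",
}

def retrieve(terms, k=1):
    # rank documents by how many query terms they contain
    scored = sorted(
        documents.items(),
        key=lambda kv: -sum(t in kv[1] for t in terms),
    )
    return [text for _, text in scored[:k]]

terms = llm_write_query("Who wrote the first program?")
hits = retrieve(terms)
```

The interesting design choice is that the model, not a fixed heuristic, decides what to look up; the retrieval backend (SQL, vectors, full text) is interchangeable behind `retrieve`.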

mobilemidget | 5 months ago

RAG, or Retrieval Augmented Generation, is an AI technique that improves large language models (LLMs) by connecting them to external knowledge bases to retrieve relevant, factual information before generating a response. This approach reduces LLM "hallucinations," provides more accurate and up-to-date answers, and allows for responses grounded in specialized or frequently updated data, increasing trust and relevance.
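The retrieve-then-generate flow described above can be sketched in a few lines (toy word-overlap retrieval and a stubbed `generate`; a real system would call an actual model with the retrieved context in the prompt):

```python
def retrieve(question, knowledge):
    # toy retrieval: pick the fact sharing the most words with the question
    q = set(question.lower().split())
    return max(knowledge, key=lambda fact: len(q & set(fact.lower().split())))

def generate(question, context):
    # stub for the LLM call, grounded in the retrieved fact
    return f"Based on '{context}': answer to '{question}'"

knowledge = [
    "the eiffel tower is 330 metres tall",
    "python was released in 1991",
]

question = "how tall is the eiffel tower"
answer = generate(question, retrieve(question, knowledge))
```

The grounding step is the whole point: the model answers from the retrieved fact rather than from whatever it memorized in training.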

I was unaware what RAG referred to; perhaps others are too.