mattyyeung | 1 year ago

Thanks for the thought-provoking comment.

It's all grey, isn't it? Vanilla RAG is a big step along the spectrum from LLM towards search; DQ is perhaps another small step. I'm no expert in search, but I've read that search systems are approaching from the other direction, so perhaps they'll meet in the middle.

There are three "lookups" in a system with DQ: (1) the original top-k chunk extraction (in the minimalist implementation, that's unchanged from vanilla RAG: just a vector-embedding match); (2) the LLM call, which takes its pick from (1); and (3) the call-back deterministic lookup after the LLM has written its answer.

(3) is much more bounded, because it only works over those top-k chunks, at least for today's context-constrained systems.
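A minimal sketch of those three lookups, with a toy word-overlap score standing in for the vector-embedding match and a stub in place of the real LLM call. All names here are illustrative assumptions, not the comment's actual implementation:

```python
# Sketch of the three "lookups" in a Deterministic Quoting (DQ) pipeline.
# The similarity function and LLM stub are placeholders for illustration.

def top_k_chunks(query, chunks, k=2):
    """(1) Retrieval: rank chunks by a toy word-overlap score
    (a stand-in for a real vector-embedding match)."""
    def score(chunk):
        return len(set(query.lower().split()) & set(chunk.lower().split()))
    return sorted(chunks, key=score, reverse=True)[:k]

def llm_answer(query, retrieved):
    """(2) LLM call: stubbed out. A real system would prompt the model
    to answer using verbatim spans from `retrieved`."""
    return {"answer": "Policy X covers flood damage.",
            "quote": retrieved[0][:30]}  # span the model claims is verbatim

def verify_quote(quote, retrieved):
    """(3) Deterministic lookup: after the LLM writes its answer, check
    the claimed quote verbatim against only the top-k chunks, which keeps
    the search space small and bounded."""
    return any(quote in chunk for chunk in retrieved)

chunks = ["Policy X covers flood damage up to $50,000 per claim.",
          "Policy Y excludes earthquakes and landslides.",
          "Claims must be filed within 30 days of the incident."]

retrieved = top_k_chunks("does policy x cover flood damage", chunks)
result = llm_answer("does policy x cover flood damage", retrieved)
print(verify_quote(result["quote"], retrieved))  # True
```

Because step (3) is an exact substring check rather than another model call, a quote either matches a retrieved chunk verbatim or it is flagged, which is what makes the final lookup deterministic.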

In any case, another way to think of DQ is as a "band-aid" that can sit on top of that, essentially a "UX feature", until the underlying systems improve enough.

I also agree about the importance of chunk size. It has "non-linear" effects on UX.
