top | item 40448374

schmidt_fifty | 1 year ago

> Vector search is almost never the winner relative to full text.

Full text search is certainly the winner on latency, but can it compete on quality? Which method is likely to return relevant results presumably depends heavily on the query. Invoking an LLM to pre-process the query and select a retrieval method is going to be quite expensive compared to either search method on its own.
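One cheap alternative to an LLM router is to skip routing entirely: run both retrievers and fuse their rankings. A minimal sketch of reciprocal rank fusion, which merges ranked lists without comparing raw scores across systems (the doc ids, result lists, and `k=60` default here are illustrative assumptions, not anything from the thread):

```python
def rrf(rankings, k=60):
    """Reciprocal rank fusion: each retriever contributes
    1 / (k + rank) per document; scores are summed across retrievers."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from a full-text and a vector retriever:
fulltext = ["doc_a", "doc_b", "doc_c"]
vector = ["doc_b", "doc_d", "doc_a"]
print(rrf([fulltext, vector]))  # doc_b ranks first: high in both lists
```

Because only ranks matter, there is no need to normalize BM25 scores against cosine similarities, which is why this kind of fusion is a common default for hybrid search.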


7thpower | 1 year ago

I mean from a retrieval quality perspective, not a latency perspective. Search latency is not a constraint for us, because the long pole in the tent is always the user-facing model.

We also have a lot of numbers in our customer requests, which do not typically play to the strengths of vector search.
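Number-heavy queries (part numbers, order ids, quantities) are also a case where a trivial heuristic router can substitute for an LLM-based one. A sketch under assumed conditions: the function name, the token-based test, and the 0.3 threshold are all hypothetical illustrations, not the commenter's setup:

```python
import re


def route(query, numeric_threshold=0.3):
    """Send number-heavy queries to exact full-text search,
    everything else to vector search.
    The 0.3 threshold is arbitrary; tune it on real traffic."""
    tokens = query.split()
    if not tokens:
        return "fulltext"
    numeric = sum(1 for t in tokens if re.search(r"\d", t))
    return "fulltext" if numeric / len(tokens) >= numeric_threshold else "vector"


print(route("invoice 10423 qty 250"))       # numbers dominate -> fulltext
print(route("how do I reset my password"))  # natural language -> vector
```

A rule like this costs microseconds per query, so it can sit in front of both retrievers without adding to the inference bill.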

COGS is not a large concern: our audience is internal, plus a few of our partners, so inference and infrastructure costs are nothing compared to engineering time, and we don't have a way to amortize those costs across a large customer base anyway.

It is also a very high value use case for us.

The other factor is that we're using fast, cheap models like Haiku and Mixtral to do the preprocessing before we hand things to the retrieval steps, so it's not much of a cost driver.