hackernoteng | 1 year ago

Semantic search doesn't really work. It looks nice in little POCs, but with real-world data it's a mixed bag. Embeddings are just too general, and documents within a narrow domain (finance, medical, etc.) all come out "similar".

smarvin2|1 year ago

I think there are a couple things worth noting here:

Semantic search performs well at capturing documents that keyword search misses. As noted in the article, when searching for exact keywords, keyword search will outperform semantic search. It is when users do not know the exact phrase they are looking for that semantic search shines.

Semantic search should only be one part of your search system, not the entire thing. We find that combining keyword search and semantic search, then running a reranker over the merged candidates, gives the best results. It is best if the reranker is fine-tuned on your own search history, but general-purpose cross-encoders perform surprisingly well.
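
To make the keyword + semantic + reranker idea concrete, here is a rough sketch, not our actual pipeline. It assumes you already have two ranked lists of document ids (one from keyword search, one from embedding search). It merges them with reciprocal rank fusion, which is just one common way to combine rankings and is my choice for the example, then reorders the top candidates with a scoring function. The names `rrf_fuse`, `hybrid_search`, and `rerank_score` are hypothetical; in practice `rerank_score` would wrap a cross-encoder.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked lists of doc ids with reciprocal rank fusion (RRF).

    Each doc gets 1 / (k + rank) from every list it appears in.
    k = 60 is a commonly used default, assumed here, not tuned.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_search(
    query: str,
    keyword_hits: Sequence[str],
    semantic_hits: Sequence[str],
    rerank_score: Callable[[str, str], float],
    candidates: int = 50,
    top_n: int = 10,
) -> List[str]:
    """Fuse both result lists, then rerank only the top candidates.

    rerank_score(query, doc_id) is a hypothetical hook; a real system
    would look up the document text and score it with a cross-encoder.
    """
    fused = rrf_fuse([keyword_hits, semantic_hits])[:candidates]
    # sorted() is stable, so equal rerank scores keep their fused order.
    return sorted(fused, key=lambda d: -rerank_score(query, d))[:top_n]


if __name__ == "__main__":
    keyword_hits = ["d1", "d2", "d3"]
    semantic_hits = ["d2", "d4", "d1"]
    # Toy reranker that happens to prefer d4; purely illustrative.
    toy_scores = {"d4": 1.0}
    results = hybrid_search(
        "example query",
        keyword_hits,
        semantic_hits,
        rerank_score=lambda q, d: toy_scores.get(d, 0.0),
        top_n=2,
    )
    print(results)
```

The reranker only sees the fused candidate list, so the expensive model runs on a few dozen documents per query rather than the whole index.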