cfors | 6 months ago

Sure, they can handle the basic case of ANN. But ANN still doesn’t have a good story for lots of real-world problems:

* filterable ANN, decomposes into prefiltering or postfiltering.

* dynamic updates and versioning are still very difficult

* slow building of graph indexes

* adding other signals into the search, such as query time boosting for recent docs.
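To illustrate the last bullet, here is a minimal sketch of query-time recency boosting done as a rerank over candidates that an ANN index has already returned. The function name, the `half_life_days` and `weight` parameters, and the linear blend are all assumptions made for illustration, not any particular engine's scoring formula.

```python
def rerank_with_recency(candidates, now, half_life_days=30.0, weight=0.3):
    """Blend ANN similarity with an exponential recency decay.

    candidates: list of (doc_id, similarity, timestamp_seconds) tuples,
    assumed to come from an ANN lookup. All names here are hypothetical.
    """
    scored = []
    for doc_id, similarity, timestamp in candidates:
        age_days = max(0.0, (now - timestamp) / 86400.0)
        # 1.0 for a brand-new doc, 0.5 after one half-life, and so on.
        recency = 0.5 ** (age_days / half_life_days)
        score = (1.0 - weight) * similarity + weight * recency
        scored.append((doc_id, score))
    # Highest blended score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


day = 86400
candidates = [("old", 0.90, 0), ("new", 0.85, 100 * day)]
boosted = rerank_with_recency(candidates, now=100 * day)
unboosted = rerank_with_recency(candidates, now=100 * day, weight=0.0)
```

The catch with this shape is that the boost only reorders what the ANN stage already retrieved, so a recent document that falls outside the candidate set never gets a chance to surface.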

I don’t disagree these systems can work but innovation is still necessary. We are not in a “data stores are solved” world.

whakim|6 months ago

* Filterable ANN certainly decomposes into pre- and post-filtering, and there is definitely a lot of interesting innovation occurring around filterable ANN. But large-scale search systems currently do a pretty good job with pre-filtering, falling back to brute force search in the case of restrictive filters.

* You'd have to be a bit more exact re: dynamic updates/versioning for me to understand the challenges you're facing.

* Building graph indices can be slow, but in my experience (billions of embeddings) it is possible to build HNSW indices in tens of minutes.

* How is this any different to combining traditional keyword search with, say, recency boosting?
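As a rough sketch of the pre-filtering with brute-force fallback described in the first bullet: when the filter leaves only a small set of allowed ids, scan them exactly; otherwise hand the allowed set to the ANN index. The `ann_search` callable, the `max_brute_force` cutoff, and the dot-product scoring are assumptions for illustration. Real systems pick the cutoff from selectivity estimates and their own index internals.

```python
def dot(a, b):
    """Plain dot product; assumes equal-length, pre-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))


def filtered_search(query, vectors, allowed_ids, k, ann_search, max_brute_force=1000):
    """Pre-filtered search with an exact-scan fallback.

    vectors: dict mapping doc id -> embedding (list of floats).
    allowed_ids: ids that already passed the metadata filter.
    ann_search: hypothetical callable (query, allowed_ids, k) -> hits.
    Returns (hits, mode) so the caller can see which path ran.
    """
    if len(allowed_ids) <= max_brute_force:
        # Restrictive filter: an exact scan over the survivors is cheap.
        scored = [(doc_id, dot(query, vectors[doc_id])) for doc_id in sorted(allowed_ids)]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k], "brute_force"
    # Permissive filter: let the ANN index search within the allowed set.
    return ann_search(query, allowed_ids, k), "ann"


def stub_ann(query, allowed_ids, k):
    """Stand-in for a real ANN index; returns no hits."""
    return []


vectors = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.9, 0.1]}
hits, mode = filtered_search([1.0, 0.0], vectors, {1, 2}, k=1, ann_search=stub_ann)
_, ann_mode = filtered_search([1.0, 0.0], vectors, {1, 2}, k=1,
                              ann_search=stub_ann, max_brute_force=1)
```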

cfors|6 months ago

You might be missing my argument here. I said that there are workable solutions to these problems, as you have pointed out.

But ANN search is still a sledgehammer, and there is still room for innovation in building hybrid solutions that bridge the gap between it and traditional data stores.