cfors | 6 months ago
Especially in the context of embedding search, which this article is also trying to do. We need databases that can efficiently store and query high-dimensional embeddings, and that also handle the nuances of real-world applications, such as filtered ANN. There is a ton of innovation in this space, and it's crucial to powering the next-generation architectures of just about every company out there. At this point, data stores are becoming a bottleneck for serving embedding search, and I cannot overstate how important advancements in this area are for enabling these solutions. This is why there is an explosion of vector databases right now.
This article is a great example of where the actual data-providers are not providing the solutions companies need right now, and there is so much room for improvement in this space.
whakim|6 months ago
cfors|6 months ago
* filterable ANN, which decomposes into prefiltering or postfiltering
* dynamic updates and versioning, which are still very difficult
* slow building of graph indexes
* adding other signals into the search, such as query-time boosting for recent docs
I don’t disagree that these systems can work, but innovation is still necessary. We are not in a “data stores are solved” world.
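To make the prefiltering vs. postfiltering point concrete, here is a minimal sketch. It uses exact brute-force cosine search as a stand-in for a real ANN index, and the names (`prefilter_search`, `postfilter_search`, `overfetch`, the toy `DOCS`) are all made up for illustration, not taken from any particular vector database.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query, docs, k):
    # Exact nearest neighbours; a real system would use an ANN index here.
    return sorted(docs, key=lambda d: cosine(query, d["vec"]), reverse=True)[:k]

def prefilter_search(query, docs, predicate, k):
    # Apply the metadata filter first, then search only the survivors.
    candidates = [d for d in docs if predicate(d)]
    return top_k(query, candidates, k)

def postfilter_search(query, docs, predicate, k, overfetch=1):
    # Search everything first, then drop hits that fail the filter.
    # Without overfetching, this can return fewer than k results.
    hits = top_k(query, docs, k * overfetch)
    return [d for d in hits if predicate(d)][:k]

# Hypothetical toy corpus: two-dimensional vectors with a "lang" attribute.
DOCS = [
    {"id": "a", "vec": [1.0, 0.0], "lang": "en"},
    {"id": "b", "vec": [0.9, 0.1], "lang": "de"},
    {"id": "c", "vec": [0.8, 0.2], "lang": "de"},
    {"id": "d", "vec": [0.0, 1.0], "lang": "en"},
]

if __name__ == "__main__":
    query = [1.0, 0.0]
    is_en = lambda d: d["lang"] == "en"
    print([d["id"] for d in prefilter_search(query, DOCS, is_en, 2)])
    print([d["id"] for d in postfilter_search(query, DOCS, is_en, 2)])
    print([d["id"] for d in postfilter_search(query, DOCS, is_en, 2, overfetch=2)])
```

With this toy data, postfiltering without overfetch comes back one result short, because the nearest neighbours are mostly documents the filter rejects. Prefiltering avoids that here, but only because the search is brute force; with an index built over the whole corpus, restricting the search to a filtered subset is the hard part.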
mdaniel|6 months ago
Oh, then you must have the secret sauce that allows scaling ES vector search beyond 10,000 results without requiring infinite RAM. I know their forums would welcome it, because that question comes up a lot.
Or I guess that's why you included the qualifier about money to invest.