xfalcox|3 months ago
We do at Discourse, in thousands of databases, and it's leveraged in most of the billions of page views we serve.
> Pre- vs. Post-Filtering (or: why you need to become a query planner expert)
This was fixed in version 0.8.0 via Iterative Scans (https://github.com/pgvector/pgvector?tab=readme-ov-file#iter...)
> Just use a real vector database
If you are running a single service that may be an easier sell, but it's not a silver bullet.
xfalcox|3 months ago
- halfvec (16-bit float) for storage
- bit (binary vectors) for indexes
That makes the storage cost and ongoing performance good enough that we could enable this across all our hosting.
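To put rough numbers on the storage difference, here is a back-of-the-envelope sketch. It assumes the per-value sizes listed in the pgvector README (4 bytes per dimension for vector, 2 for halfvec, 1 bit for bit, each plus 8 bytes of overhead); the 1024-dimension figure is an arbitrary example, not Discourse's actual configuration.

```python
def bytes_per_embedding(dims: int) -> dict[str, int]:
    """Approximate on-disk size of one embedding for each pgvector type.

    Assumed sizes (from the pgvector README): vector = 4*d + 8 bytes,
    halfvec = 2*d + 8 bytes, bit = d/8 + 8 bytes (rounded up here).
    """
    return {
        "vector": 4 * dims + 8,
        "halfvec": 2 * dims + 8,
        "bit": (dims + 7) // 8 + 8,
    }


sizes = bytes_per_embedding(1024)  # 1024 dimensions is an illustrative choice
for name, size in sizes.items():
    share = size / sizes["vector"]
    print(f"{name:8s} {size:5d} bytes ({share:.1%} of vector)")
```

Under those assumptions, halfvec storage is about half the size of full-precision vectors, and the binary form used for the index is about 3% of it.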
simonw|3 months ago
For anyone who hasn't seen it yet: it turns out many embedding vectors of e.g. 1024 floating point numbers can be reduced to a single bit per value that records if it's higher or lower than 0... and in this reduced form much of the embedding math still works!
This means you can e.g. filter to the top 100 using extremely memory efficient and fast bit vectors, then run a more expensive distance calculation against those top 100 with the full floating point vectors to pick the top 10.
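A minimal pure-Python sketch of that two-stage search, for anyone who wants to see it work. The data is random, and the function names (`binary_quantize`, `two_stage_search`) and the sizes are made up for illustration; a real system would do this inside the database or with a vectorized library.

```python
import math
import random


def binary_quantize(vec):
    """Pack one bit per dimension into an int: 1 if the value is above 0."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming(a, b):
    """Number of bit positions where the two codes differ."""
    return bin(a ^ b).count("1")


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def two_stage_search(query, vectors, codes, shortlist=100, k=10):
    """Cheap Hamming shortlist first, then exact rerank on the float vectors."""
    q_code = binary_quantize(query)
    candidates = sorted(
        range(len(vectors)), key=lambda i: hamming(q_code, codes[i])
    )[:shortlist]
    return sorted(candidates, key=lambda i: cosine_distance(query, vectors[i]))[:k]


rng = random.Random(0)
dims = 256
vectors = [[rng.gauss(0, 1) for _ in range(dims)] for _ in range(2000)]
codes = [binary_quantize(v) for v in vectors]  # one bit per dimension

# Query: a slightly noisy copy of vector 42.
query = [x + rng.gauss(0, 0.1) for x in vectors[42]]
top = two_stage_search(query, vectors, codes)
print(top[0], len(top))
```

The noisy copy of vector 42 should come back first: only a handful of its sign bits flip, so it survives the Hamming shortlist easily, and the full-precision rerank then puts it on top.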
summarity|3 months ago
mfrye0|3 months ago
whakim|3 months ago
On the iterative scan side, how do you prevent this from becoming too computationally intensive with a restrictive pre-filter, or simply not working at all? We use Vespa, which means effectively doing a map-reduce across all of our nodes; the effective number of graph traversals to do is smaller, and the computational burden mostly involves scanning posting lists on a per-node basis. I imagine that to do something similar in Postgres, you'd need sharded tables and complicated application logic to control what you're actually searching.
How do you deal with re-indexing and/or denormalizing metadata for filtering? Do you simply accept that it'll take hours or days?
I agree with you, however, that vector databases are not a panacea (although they do remove a huge amount of devops work, which is worth a lot!). Vespa supports filtering across parent-child relationships (like a relational database) which means we don't have to reindex a trillion things every time we want to add a new type of filter, which with a previous vector database vendor we used took us almost a week.
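For readers unfamiliar with the map-reduce pattern mentioned above, here is a toy sketch of scatter-gather top-k: each shard filters and ranks locally, and a coordinator merges the partial results. It is not how Vespa is implemented. The one-dimensional "embeddings", the tag filter, and every name in it are hypothetical, chosen to keep the example short.

```python
import heapq


def search_shard(shard, query, k, allowed_tags):
    """Per-shard step: apply the filter locally, return the k nearest hits.

    Each row is (doc_id, value, tag); distance is |value - query|.
    """
    hits = (
        (abs(value - query), doc_id)
        for doc_id, value, tag in shard
        if tag in allowed_tags
    )
    return heapq.nsmallest(k, hits)  # sorted list of (distance, doc_id)


def scatter_gather(shards, query, k, allowed_tags):
    """Coordinator: fan out to every shard, then merge the sorted partials."""
    partials = [search_shard(s, query, k, allowed_tags) for s in shards]
    return heapq.nsmallest(k, heapq.merge(*partials))


shards = [
    [(1, 0.10, "a"), (2, 0.90, "b"), (3, 0.51, "a")],
    [(4, 0.47, "a"), (5, 0.55, "b"), (6, 0.20, "a")],
]
result = scatter_gather(shards, query=0.5, k=2, allowed_tags={"a"})
print([doc_id for _, doc_id in result])
```

Each shard only ever returns k rows, so the merge step stays cheap no matter how large the shards are.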
xfalcox|3 months ago
I can totally see that at trillion scale for a single shard you want a specialized, dedicated service, but that is also true for most things in tech when you get to extreme scale.
tacoooooooo|3 months ago
iterative scans are more of a bandaid for filtering than a solution. you will still run into issues with highly restrictive filters. you still need to understand ef_search and max_scan_tuples, strict vs relaxed ordering, etc. it's an improvement for sure, but the planner still doesn't deeply understand the cost model of filtered vector search
there isn't a general solution to the pre- vs post-filter problem—it comes down to having a smart planner that understands your data distribution. question is whether you have the resources to build and tune that yourself or want to offload it to a service that's able to focus on it directly
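A toy model of the trade-off being argued here, assuming the index hands back row ids in distance order. The parameter names echo pgvector's `hnsw.ef_search` and `hnsw.max_scan_tuples` settings, but this is only a sketch of the idea: the real implementation walks an HNSW graph and treats the tuple limit as approximate.

```python
def post_filter(ordered_ids, passes_filter, k, ef_search):
    """Plain post-filtering: take the ef_search nearest rows, then filter."""
    return [i for i in ordered_ids[:ef_search] if passes_filter(i)][:k]


def iterative_scan(ordered_ids, passes_filter, k, ef_search, max_scan_tuples):
    """Keep pulling batches of ef_search rows until k rows pass the filter
    or the scan budget is spent. Returns (matching rows, rows scanned)."""
    limit = min(max_scan_tuples, len(ordered_ids))
    results, scanned = [], 0
    while len(results) < k and scanned < limit:
        stop = min(scanned + ef_search, limit)
        for i in ordered_ids[scanned:stop]:
            if passes_filter(i):
                results.append(i)
                if len(results) == k:
                    break
        scanned = stop  # counted per batch, even if we stopped early
    return results, scanned


# Pretend rows 0..99999 are already sorted by distance to the query.
ordered_ids = list(range(100_000))
k, ef_search, budget = 10, 40, 20_000


def selective(i):       # 0.1% of rows pass
    return i % 1_000 == 999


def very_selective(i):  # 0.01% of rows pass
    return i % 10_000 == 9_999


print(len(post_filter(ordered_ids, selective, k, ef_search)))
rows, scanned = iterative_scan(ordered_ids, selective, k, ef_search, budget)
print(len(rows), scanned)
rows, scanned = iterative_scan(ordered_ids, very_selective, k, ef_search, budget)
print(len(rows), scanned)
```

In this model plain post-filtering returns 0 rows, the iterative scan finds all 10 after scanning 10,000 rows, and with the stricter filter it spends the whole 20,000-row budget and still returns only 2. That last case is the "highly restrictive filter" problem: the scan does a lot of work and still comes up short.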
cortesoft|3 months ago
jascha_eng|3 months ago
In theory these can be more efficient than plain pre/post filtering.
tacoooooooo|3 months ago
dpflan|3 months ago
xfalcox|3 months ago
- Related Topics, a list of topics to read next, which uses embeddings of the current topic as the key to search for similar ones
- Suggesting tags and categories when composing a new topic
- Augmented search
- RAG for uploaded files