top | item 45799401

(no title)

VoVAllen | 3 months ago

We at https://github.com/tensorchord/VectorChord solved most of the pgvector issues mentioned in this blog:

- We're IVF + quantization, can support 15x more updates per second comparing to pgvector's HNSW. Insert or delete an element in a posting list is a super light operation comparing to modify a graph (HNSW)

- Our main branch can now index 100M 768-dim vector in 20min with 16vcpu and 32G memory. This enables user to index/reindex in a very efficient way. We'll have a detailed blog about this soon. The core idea is KMeans is just a description of the distribution, so we can do lots of approximation here to accelerate the process.

- For reindex, actually postgres support `CREATE INDEX CONCURRENTLY` or `REINDEX CONCURRENTLY`. User won't experience any data loss or inconsistency during the whole process.

- We support both pre-filtering and post-filtering. Check https://blog.vectorchord.ai/vectorchord-04-faster-postgresql...

- We support hybrid search with BM25 through https://github.com/tensorchord/VectorChord-bm25

The author simplifies the complexity of synchronizing between an existing database and a specialized vector database, as well as how to perform joint queries on them. This is also why we see most users choosing vector solution on PostgreSQL.

discuss

nostrebored|3 months ago

So you’re quantizing and using IVF — what are your recall numbers with actual use cases?

VoVAllen|3 months ago

We do have some benchmark number at https://blog.vectorchord.ai/vector-search-over-postgresql-a-.... It varies on different dataset, but most cases it's 2x or more QPS comparing to pgvector's hnsw at same recall.

VoVAllen|3 months ago

And we do have user hosting 3 Billion vectors with Postgres + VectorChord with sharding. And they're using vectors to save the earth! Check https://blog.vectorchord.ai/3-billion-vectors-in-postgresql-...

tacoooooooo|3 months ago

We actually looked into vectorchord--it looks really cool, but it's not supported by RDS so it is an additional service for us to add anyways.

inadequatespace|3 months ago

Another extremely solid win for Cunningham’s Law.