
etk934 | 7 months ago

Can you report the relative storage requirements for multivector COLPALI vs multivector COPALI with binary vectors vs MUVERA vs a single vector per page? Can your system scale to millions of vectors?

ArnavAgrawal03 | 7 months ago

Yes! We have a production use case with over a million pages. MUVERA is a good fit for this scale, since it essentially reduces multivector search to regular vector search plus re-ranking.
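To make the "regular vector search" part concrete: MUVERA compresses a multivector into one fixed-dimensional encoding (FDE) whose dot product approximates the Chamfer/MaxSim score. This is a toy sketch of the core idea only (SimHash-bucket the token vectors, sum per bucket, concatenate); the actual MUVERA algorithm uses different aggregation for queries vs. documents, multiple hash repetitions, and final projections, none of which are shown here.

```python
import numpy as np

def fde(vectors, n_planes=4, seed=0):
    """Toy MUVERA-style fixed-dimensional encoding (FDE).

    Each token vector is assigned to one of 2**n_planes buckets via
    SimHash (the sign pattern of random projections). Vectors in the
    same bucket are summed, and the bucket sums are concatenated into
    a single fixed-size vector. Queries and documents encoded with the
    same planes can then be compared with a plain dot product, which
    roughly approximates the Chamfer/MaxSim score.
    """
    d = vectors.shape[1]
    rng = np.random.default_rng(seed)   # same seed -> same buckets for Q and D
    planes = rng.standard_normal((n_planes, d))
    bits = (vectors @ planes.T > 0).astype(int)
    buckets = bits @ (1 << np.arange(n_planes))  # bucket id per token
    out = np.zeros((2 ** n_planes, d))
    for b, v in zip(buckets, vectors):
        out[b] += v
    return out.ravel()
```

A page with 1,000 patch vectors of dimension 128 collapses to one vector of dimension `2**n_planes * 128`, which any off-the-shelf ANN index can handle.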

In our current setup, the multivectors are stored as .npy files in S3 Express storage. We use Turbopuffer for the vector search and filtering part. Pre-warming the namespace and pre-fetching the most common vectors from S3 mean that search latency is almost indistinguishable from regular vector search.
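The pre-fetch layer they describe can be sketched as a simple in-memory cache over the object store. Everything here is hypothetical (the class, the `fetch_bytes` callable standing in for an S3 GET, and the `hot_ids` pre-warm list are my names, not theirs); the point is just that re-ranking only touches S3 on a cache miss.

```python
import numpy as np
from io import BytesIO

class MultivectorCache:
    """Hypothetical sketch of a pre-fetch layer for .npy multivectors.

    fetch_bytes: callable taking a doc id and returning the raw bytes
    of its .npy file (e.g. an S3 GET). hot_ids are pre-warmed so the
    most common documents never hit object storage at query time.
    """
    def __init__(self, fetch_bytes, hot_ids=()):
        self._fetch = fetch_bytes
        self._cache = {}
        for doc_id in hot_ids:      # pre-warm the hot set up front
            self.get(doc_id)

    def get(self, doc_id):
        if doc_id not in self._cache:
            raw = self._fetch(doc_id)                 # cache miss: one GET
            self._cache[doc_id] = np.load(BytesIO(raw))
        return self._cache[doc_id]
```

At re-rank time you only need the multivectors for the handful of candidates the single-vector search returned, so a modest cache covers most traffic.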

ColPali with binary vectors worked fine, but to be honest there have been so many targeted improvements to single-vector search that switching to MUVERA gave us a huge boost.
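For reference, the binary-vector variant mentioned here keeps only the sign bit of each dimension (a 32x reduction from float32) and scores with Hamming distance. A minimal sketch:

```python
import numpy as np

def binarize(vectors):
    """Binary quantization: keep only the sign of each dimension,
    packed to one bit per dimension (32x smaller than float32)."""
    return np.packbits(vectors > 0, axis=-1)

def hamming_sim(a_packed, b_packed):
    """Similarity = number of dimensions with matching sign
    (higher is closer). XOR finds the mismatching bits."""
    xor = np.bitwise_xor(a_packed, b_packed)
    n_bits = a_packed.shape[-1] * 8
    return n_bits - np.unpackbits(xor, axis=-1).sum(-1)
```

The XOR-and-popcount scoring is extremely cheap, which is why binary ColPali scales better than the float version even though both still pay the per-token comparison cost.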

Regular multivector ColPali suffers from a similar issue: Chamfer distance is just hard to compute at scale. PLAID is a good solution if your corpus is constant. If it isn't, using regular multivector ColPali as a re-ranking step is a good bet.
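To show why Chamfer/MaxSim is expensive and why it works as a re-ranking step: each query token is matched against every document token, so the cost is O(|Q| * |D| * dim) per document. A minimal sketch (function names are mine) of the score and a re-rank pass over a candidate shortlist:

```python
import numpy as np

def maxsim(query_vecs, doc_vecs):
    """Chamfer/MaxSim score used by ColBERT-style models like ColPali:
    for each query token, take the similarity of its best-matching
    document token, then sum over query tokens. The full |Q| x |D|
    similarity matrix makes this too costly for exhaustive search,
    but fine over a small shortlist."""
    sims = query_vecs @ doc_vecs.T        # (|Q|, |D|) token similarities
    return sims.max(axis=1).sum()

def rerank(query_vecs, candidates):
    """Re-rank candidate docs (id -> multivector array) by MaxSim,
    best first. The candidates come from a cheap first-stage search
    (e.g. single-vector or MUVERA retrieval)."""
    return sorted(candidates,
                  key=lambda d: maxsim(query_vecs, candidates[d]),
                  reverse=True)
```

The first stage (MUVERA or a single vector per page) narrows millions of pages to a few dozen candidates, and only those pay the full MaxSim cost.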