item 35827697

sir_eliah | 2 years ago

Does anyone have real, production experience with Milvus? I'm interested in how this database performs at larger scale. Let's say you have millions of vectors and traffic reaching thousands of requests/s.

bluecoconut | 2 years ago

Not production, but yes to scale: I pushed Milvus to ~140 million vectors (768 dimensions), though only at a handful of requests per second (~10), and it fared alright once everything was up and running and relatively static on the document side. Rebuilding indexes and stability were a bit of a hassle at times (I was live-adding more documents at ~1 million per 30 minutes), and it would occasionally fall over and need to rebuild, subsequently causing a lot more load, rejecting new documents, etc. There's probably lots of tuning I could have done to eke out more performance and stability, though. It ended up being hours of effort on the rebuilds and lots of careful management of RAM (on a 300 GB RAM machine).

For the scale you call "larger scale": at the few-million-documents scale I would suggest just using any library, e.g. `hnsw` in `nmslib` or `faiss`.

I just did some benchmarks with 1M docs, `cosinesimil_sparse` on `78628`-dimensional binary vectors (nmslib `hnsw`): 30 seconds to build the index, and it can process a batch of 100 document queries in 3 ms (each with 100 KNN). Prompted by this question, I put a loop over it and it handled 1000 random queries (non-batched) in 1.11 seconds (~1 GB peak RAM usage, using 24 threads).

All in all, my personal opinion is: even up to the few-millions scale, I'm finding the underlying libraries (`faiss` and `nmslib`) significantly easier to use than the wrapper tools / databases (Milvus and Pinecone). I don't really get the point of a separate piece of infra for something that is essentially ~15 lines of Python at most scales that matter (a few million vectors). (Note: at the ~10k-100k scale or less, simple numpy and a sort seem to be fast enough (and exact), or just exact NN with sklearn.neighbors.) And when you push to scales where it does start breaking (100 million+), the database versions seem to break as well (and require fiddling with lots of bespoke config).
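For the ~10k-100k regime mentioned above, the "simple numpy and sort" version really is only a few lines. A sketch, with made-up sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.standard_normal((50_000, 128)).astype("float32")
docs /= np.linalg.norm(docs, axis=1, keepdims=True)  # unit-norm for cosine

def exact_knn(query, k=10):
    q = query / np.linalg.norm(query)
    sims = docs @ q                        # cosine similarity vs every doc
    top = np.argpartition(-sims, k)[:k]    # k best candidates, unordered
    return top[np.argsort(-sims[top])]     # k doc ids, best first

query = rng.standard_normal(128).astype("float32")
ids = exact_knn(query)
```

Exact, no index build, and fast enough at this scale that an approximate structure buys you little.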

sir_eliah | 2 years ago

Thanks for the input! I asked about the scale of items and traffic because my use case actually requires a separate piece of infrastructure. It's around 100 million items and live production traffic from millions of users with strict latency demands. So it's not a batch job that can be performed in memory, as I understand your case was.

Currently I use Elasticsearch with the Open Distro approximate kNN plugin, by the way.

peterstjohn | 2 years ago

Yes! We've been running Milvus in production for about three years now, powering some customers that do have queries at that scale. It has its foibles, like all of these systems do (the lack of non-int ID fields in the 1.x line is maddening and has required a bunch of additional engineering by us to work with our other systems), but it has held up pretty well in our experience.

(I can't speak to Milvus 2.x as we are probably not going to upgrade to that for a number of non-performance reasons)