Regarding metric and dimension - it is really problem dependent, as is throughput. Recall and latency numbers reported in benchmarks are typically measured on very well curated, well structured datasets and averaged across all queries. Recall is not just a function of the HNSW algorithm. I can tell you, though, that you can serve 70M-vector indexes at 768 dimensions in under 100ms, including inference, on very real-world datasets. We will publish some benchmarks shortly as we are doing more evaluations on real-world data. I also compiled throughput for open CLIP models here: https://docs.google.com/spreadsheets/d/1ftHKf4MovnAyKhGyi05e....
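To make the "recall is averaged across all queries" point concrete, here is a minimal sketch of recall@k as ANN benchmarks typically report it (this is an illustration of the standard metric, not code from the parent's system; the function name and toy data are made up):

```python
import numpy as np

def recall_at_k(approx_ids, true_ids, k):
    # approx_ids, true_ids: (num_queries, k) arrays of neighbor indices.
    # recall@k = fraction of the true k nearest neighbors that the
    # approximate index actually returned, averaged over all queries.
    hits = sum(len(set(a[:k]) & set(t[:k]))
               for a, t in zip(approx_ids, true_ids))
    return hits / (len(true_ids) * k)

# Toy example: query 1 misses one true neighbor, query 2 gets all three.
true_ids = np.array([[0, 1, 2], [3, 4, 5]])
approx_ids = np.array([[0, 2, 9], [3, 4, 5]])
print(recall_at_k(approx_ids, true_ids, 3))  # 5 hits out of 6
```

Because it averages over queries, a curated benchmark set full of "easy" queries can report high recall that does not transfer to messier production data - which is the parent's point about recall not being a property of HNSW alone.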
If there are particular things you want to see let us know and we can add them!
loxias|2 years ago
This is correct. :) Don't worry, I know enough to not trust any published benchmarks on this topic... (I'm also not your target market. I wrote my first "vector DB" in 2001 for music recognition.)
I still think it's crucial to include just a few more facts, though, because otherwise the statement is meaningless.
Consider:
A. "we can find an approximate NN match, euclidean, D=768, N=70000000, under 100ms on a modern laptop"
vs
B. "we can find an approximate NN match, euclidean, D=2, N=70000000, under 100ms on a modern laptop"
vs
C. "we can find an approximate NN match, euclidean, D=768, N=70000000, under 100ms on 1000x modern laptops"
Notice how B and C aren't impressive; they're trivially beatable. :)
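To see why B is the trivial case: at D=2 you don't even need an approximate index - an exact brute-force scan over millions of points fits in cache-friendly memory and finishes in tens of milliseconds on one laptop. A hedged sketch (N scaled down from 70M so it runs quickly; the seed, sizes, and perturbation are arbitrary choices for the demo):

```python
import numpy as np
import time

rng = np.random.default_rng(0)
N, D = 2_000_000, 2                    # case B geometry, scaled down
points = rng.standard_normal((N, D)).astype(np.float32)
query = points[12345] + 1e-5           # a query sitting next to a known point

t0 = time.perf_counter()
diff = points - query
d2 = np.einsum('ij,ij->i', diff, diff)  # exact squared euclidean distances
best = int(np.argmin(d2))               # exact nearest neighbor, no index at all
elapsed_ms = (time.perf_counter() - t0) * 1000

print(best, f"{elapsed_ms:.1f} ms")     # should recover index 12345
```

At D=768 the same scan moves ~400x more data per query and the curse of dimensionality kicks in, which is exactly why statement A carries information that B and C don't.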