

If binarization saves 32x memory, why not project to a larger dimension first, say 4096? Could that actually improve performance while still reducing memory?
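To make the memory arithmetic behind the question concrete, here is a rough sketch. The 1024-dimensional float32 baseline is an assumption for illustration only; the thread does not state the original embedding size. The helper name `bytes_per_vector` is also hypothetical.

```python
def bytes_per_vector(dim, bits_per_component):
    """Storage for one embedding, ignoring index overhead."""
    return dim * bits_per_component // 8

base_dim = 1024  # assumed baseline dimension, not from the thread

float_bytes = bytes_per_vector(base_dim, 32)        # float32 baseline
binary_bytes = bytes_per_vector(base_dim, 1)        # 1 bit per component
projected_binary_bytes = bytes_per_vector(4096, 1)  # binarized after projecting to 4096

print(float_bytes // binary_bytes)            # saving from binarization alone
print(float_bytes // projected_binary_bytes)  # saving left after the 4x wider projection
```

Under that assumed baseline, a 4096-bit code would still be several times smaller than the float32 vector, which is the headroom the question relies on.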


jn2clark|1 year ago

That's a great question. I think regimes like that could offer better trade-offs between memory, latency, and retrieval performance, although I don't know what they are right now. It also assumes that going to larger dimensions preserves more of the full-precision performance, which is still TBD. The other thing is how binary embeddings play with ANN algorithms like HNSW (i.e., recall). With Hamming distance the space of similarity scores is quite limited: d-bit codes can take only d + 1 distinct distance values, so many candidates can tie.
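A minimal sketch of the limited score space, in pure Python. Everything here is an assumption for illustration: the random Gaussian projection stands in for whatever projection one would actually use (a learned one may behave differently), the sizes are toy values, and the helper names are hypothetical.

```python
import random

def project(vec, matrix):
    """Multiply vec by a projection matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def binarize(vec):
    """Pack the sign bits of a float vector into one Python int."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x > 0 else 0)
    return code

def hamming(a, b):
    """Count the bit positions where two packed codes differ."""
    return bin(a ^ b).count("1")

rng = random.Random(0)
in_dim, out_dim = 64, 256  # toy sizes, assumed for this sketch

# Random Gaussian projection to a larger dimension (an assumption, not a learned map).
matrix = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]

query = [rng.gauss(0, 1) for _ in range(in_dim)]
docs = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(100)]

q_code = binarize(project(query, matrix))
distances = [hamming(q_code, binarize(project(d, matrix))) for d in docs]

# Every distance is an integer in [0, out_dim], so at most out_dim + 1 scores exist.
distinct_scores = len(set(distances))
print(distinct_scores, "distinct scores across", len(docs), "documents")
```

Since every score is an integer between 0 and `out_dim`, ranking has to break ties somehow, which is one reason recall with graph indexes like HNSW is worth measuring rather than assuming.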