top | item 40243906

barakm | 1 year ago

> They are binary vectors with 768 dimensions, which takes up 96 bytes (768 / 8 = 96).

I guess I’m confused. This is honestly the problem that most vector storage faces (the “curse of dimensionality”), let alone the indexing.

I assume that you meant 768 dimensions * 8 bytes (for an f64), which is 6144 bytes. Usually, these get shrunk with some (hopefully minor) loss, so like an f32 or f16 (or smaller!).

If you can post how you fit 768 dimensions in 96 bytes, even with compression or trie-equivalent amortization, or whatever… I’d love to hear more about that for another post.

Ninja edit: Unless you’re treating each dimension as one-bit? But then I still have questions around retrieval quality
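The one-bit reading does work out to exactly 96 bytes. A minimal sketch of the arithmetic, assuming a simple sign-based quantizer (illustrative names, not any particular library's API):

```python
import random

def binarize(vec):
    """Quantize each float dimension to one bit: 1 if >= 0, else 0."""
    return [1 if x >= 0 else 0 for x in vec]

def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, 8 dimensions per byte."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

vec = [random.uniform(-1, 1) for _ in range(768)]
packed = pack_bits(binarize(vec))
print(len(packed))  # 96 bytes for 768 dimensions
```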

alexgarcia-xyz | 1 year ago

Author here - ya, "binary vectors" means quantizing to one bit per dimension. Normally it would be 4 * dimensions bytes of space per vector (where 4 = sizeof(float)). Some embedding models, like nomic v1.5 [0] and mixedbread's new model [1], are specifically trained to retain quality after binary quantization. Not all models do tho, so results may vary. I think in general for really large vectors, like OpenAI's large embeddings model with 3072 dimensions, it kind of works, even if they didn't specifically train for it.

[0] https://twitter.com/nomic_ai/status/1769837800793243687

[1] https://www.mixedbread.ai/blog/binary-mrl
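Once vectors are quantized to one bit per dimension, retrieval is just nearest-neighbor search under Hamming distance. A self-contained sketch (the sign-based quantizer and brute-force scan here are illustrative, not the article's actual implementation):

```python
import random

def binarize_pack(vec):
    """One bit per dimension (sign bit), packed into a single Python int."""
    n = 0
    for x in vec:
        n = (n << 1) | (1 if x >= 0 else 0)
    return n

def hamming(a, b):
    """Number of differing bits between two packed vectors."""
    return bin(a ^ b).count("1")

random.seed(0)
dim = 768
db = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(100)]
packed = [binarize_pack(v) for v in db]

query = db[3][:]  # query identical to item 3, so its Hamming distance is 0
q = binarize_pack(query)
best = min(range(len(packed)), key=lambda i: hamming(q, packed[i]))
print(best)  # 3
```

In practice the Hamming scan is done with XOR + popcount over the packed bytes, which is why binary vectors are fast as well as small.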

barakm | 1 year ago

Thank you! As you keep posting your progress, and I hope you do, adding these references would probably help ward off crusty fuddy-duddies like me (or at least give them more to research either way) ;)

lsb | 1 year ago

Binary, quantize each dimension to +1 or -1

You can try out binary vectors, in comparison to quantizing every pair of dimensions to one of four values, and a lot more, by building a FAISS index on your data and using Product Quantization (like PQ768x1 for binary features in this case) https://github.com/facebookresearch/faiss/wiki/The-index-fac...
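The storage math behind those PQ codes is worth spelling out: in FAISS's `PQ{M}x{nbits}` notation, the vector is split into M sub-vectors, each coded with nbits bits. A hypothetical helper (illustrative name, not a FAISS API) showing that 1 bit per dimension and 2 bits per pair of dimensions cost the same space:

```python
def pq_code_bytes(dim, sub_dim, bits_per_code):
    """Bytes per vector when `dim` dims are split into sub-vectors of
    `sub_dim` dims, each coded with `bits_per_code` bits."""
    n_sub = dim // sub_dim
    return n_sub * bits_per_code // 8

print(pq_code_bytes(768, 1, 1))  # PQ768x1: 1 bit per dimension -> 96
print(pq_code_bytes(768, 2, 2))  # PQ384x2: 2 bits per dim pair -> 96
```

Same 96 bytes either way; the pair-wise codes can capture correlations between adjacent dimensions that per-dimension sign bits cannot.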

barakm | 1 year ago

Appreciate the link, but I'd still like to know how well it works in practice.

If you have a link for that, I’d be much obliged

queuebert | 1 year ago

BTW, the "curse of dimensionality" technically refers to the relative sparsity of high-dimensional space and the need for geometrically increasing data to fill it. It has nothing to do with storage. And typically in vector databases the data are compressed/projected into a lower-dimensional space before storage, which actually improves the situation.
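The "project into a lower-dimensional space" step can be as simple as a random Gaussian projection (Johnson-Lindenstrauss style). A pure-Python sketch, assuming nothing about any particular vector database's method:

```python
import random

def random_projection(vec, out_dim, seed=0):
    """Project `vec` down to `out_dim` dimensions with a fixed random
    Gaussian matrix; the 1/sqrt(out_dim) scale roughly preserves norms."""
    rng = random.Random(seed)
    scale = 1 / out_dim ** 0.5
    return [scale * sum(rng.gauss(0, 1) * x for x in vec)
            for _ in range(out_dim)]

v = [random.gauss(0, 1) for _ in range(768)]
p = random_projection(v, 64)
print(len(p))  # 64 dimensions instead of 768
```

Using the same seed for every vector is what makes the projection consistent across the whole database, so distances in the reduced space remain comparable.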