With binary representations you still get 2^D possible configurations, so it's entirely possible from a representation perspective. The main issue (I think, at least) is in determining similarity. Hamming distance gives an output space of only D + 1 possible scores (0 through D). As mentioned in the article, using 0/1 vectors with cosine similarity gives better granularity, because it now penalizes embeddings that have differing numbers of positive elements (i.e. that live on different hyper-spheres). It is probably well suited to retrieval where there is a 1:1 correspondence between query and document, but if the degeneracy of queries is large then there could be issues discriminating between similar documents. Regimes that combine binary and (small) dense embeddings could be quite good. I expect a lot more innovation in this space.
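To make the granularity point concrete, here is a rough sketch in plain Python. The vectors, the dimension, and the function names are made up for illustration and are not from the article. It shows two documents that tie under Hamming distance but get different cosine scores on 0/1 vectors, and then brute-forces a tiny D to count how many distinct scores each measure can produce.

```python
from itertools import product
from math import sqrt


def hamming_distance(a, b):
    """Number of positions where two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity for 0/1 vectors; returns 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    # For 0/1 vectors the squared norm equals the number of ones.
    norm = sqrt(sum(a)) * sqrt(sum(b))
    return dot / norm if norm else 0.0


# Hypothetical 8-bit embeddings, chosen by hand for the illustration.
query = (1, 1, 1, 1, 0, 0, 0, 0)
doc_a = (1, 1, 1, 0, 0, 0, 0, 1)  # 4 ones, differs from query in 2 bits
doc_b = (1, 1, 1, 1, 1, 1, 0, 0)  # 6 ones, differs from query in 2 bits

# Hamming cannot tell the two documents apart: both are at distance 2.
print(hamming_distance(query, doc_a), hamming_distance(query, doc_b))

# Cosine separates them because the number of ones differs (0.75 vs ~0.816).
print(cosine_similarity(query, doc_a), cosine_similarity(query, doc_b))

# Brute force over every pair of 4-bit vectors to count distinct scores.
D_SMALL = 4
vectors = list(product((0, 1), repeat=D_SMALL))
hamming_scores = {hamming_distance(a, b) for a in vectors for b in vectors}
cosine_scores = {
    round(cosine_similarity(a, b), 12) for a in vectors for b in vectors
}
print(len(hamming_scores), len(cosine_scores))
```

Hamming produces exactly D + 1 distinct values here, while cosine produces more, since the score also depends on how many ones each vector has. Whether the extra resolution actually helps ranking on real embeddings is a separate question that this toy example does not answer.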