Yes, one more precise way to phrase this: for two random unit vectors in d dimensions, the dot product has expected value 0 in any dimension, and its typical magnitude scales like 1/sqrt(d), so it shrinks toward 0 as the dimension grows. But the probability of drawing two truly orthogonal vectors at random (over the reals) is zero - the dot product will be very small but nonzero.
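A quick numerical sketch of that claim, assuming random unit vectors drawn from a spherically symmetric (Gaussian) distribution; the trial count and dimensions are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_abs_dot(dim, trials=2000):
    # Draw pairs of random unit vectors and average |u . v| over many trials.
    u = rng.standard_normal((trials, dim))
    v = rng.standard_normal((trials, dim))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    return np.mean(np.abs(np.einsum('ij,ij->i', u, v)))

for d in (10, 100, 1000):
    # The average magnitude shrinks roughly like 1/sqrt(d),
    # but no single dot product is exactly zero.
    print(d, mean_abs_dot(d))
```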
That said, for sparse high-dimensional datasets - which are finite sets of points, not vectors drawn from a continuous distribution over the whole space - the probability of being truly orthogonal can be quite high: e.g. if half your vectors have totally disjoint support from the other half, then a randomly chosen pair is exactly orthogonal at least 50% of the time.
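The disjoint-support case can be seen directly: two sparse vectors with no overlapping nonzero coordinates have a dot product of exactly zero. A minimal sketch (the dimension and split are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 1000

# Group A: nonzero only on the first half of the coordinates.
a = np.zeros(dim)
a[:dim // 2] = rng.standard_normal(dim // 2)

# Group B: nonzero only on the second half.
b = np.zeros(dim)
b[dim // 2:] = rng.standard_normal(dim // 2)

# Disjoint support forces true orthogonality, not just approximate.
print(a @ b)
```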
Note that ML/LLM practitioners rely on "approximate orthogonality" anyway.
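Approximate orthogonality in this sense can be checked numerically: among many random unit vectors in a high-dimensional space, every pairwise cosine similarity stays small, even though none is exactly zero. A sketch with arbitrary sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
n, dim = 200, 1024

# n random unit vectors in dim dimensions.
vecs = rng.standard_normal((n, dim))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)

# Gram matrix of pairwise cosine similarities; ignore the diagonal (self-similarity).
gram = vecs @ vecs.T
np.fill_diagonal(gram, 0.0)

# Largest pairwise |cosine|: small but nonzero - "approximately orthogonal".
print(np.abs(gram).max())
```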
The visualization is useless. If the 2D embeddings were any good they might be useful to R1's developers, but still not to end users. What am I supposed to do with it?
frizkie|1 year ago
I have mostly a layperson's understanding of this idea, but I would assume that it would be false to say that they are typically _entirely_ orthogonal?
aithrowawaycomm|1 year ago
viraptor|1 year ago
esafak|1 year ago
higuidebot|1 year ago
TaurenHunter|1 year ago
dehrmann|1 year ago