top | item 30312977

(no title)

kmonad | 4 years ago

> the lower average distance has indeed decreased

Yes, that was indeed the point I was trying to convey! In an even sillier example, assume word vectors X, then calculate "proposal by proposal" similarities (i.e. inverse distances). Then duplicated X and concatenate [X,X], recalculate "proposal by proposal" distances (now for twice as many proposals)---those distances must now be less on average because each proposal has at least one "zero distance" neighbor. HOWEVER, why would you assert that the overall "idea space" has been reduced?

discuss

dash2|4 years ago

Here's one metric by which, in your first model, the overall idea space has been reduced: the distribution of models has become more concentrated. That's because 100 proposals are tiny variations around 1 basic one. The same holds in your second model: with word vectors X, if I pick (say) two ideas to fund at random, they will never be the same idea, while with (X, X), that will sometimes happen.