top | item 15907139

JasonPunyon | 8 years ago

PCA/SVD aim for maximizing explained variance, not preserving distance. They tend to "preserve" large distances at the expense of smaller ones, but that's not an explicit goal, nor can you bound the distortion. The [answer here](https://stats.stackexchange.com/a/176801/60) gives a pretty good intuition about why.

You can also compare them on computational complexity, where random projection (O(numPoints * numOriginalDimensions * numProjectedDimensions)) smokes PCA and SVD, which are cubic in the number of original dimensions.

And then there's simplicity. The random projection method comes down to sampling from a normal distribution and then doing a matrix multiplication. There's a whole lot more to understand about PCA (standardizing your data, calculating the covariance matrix, eigenvector decomposition). I doubt I could implement it correctly myself, and I surely couldn't do it in high dimension.
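To make the simplicity point concrete, here's a minimal NumPy sketch of that two-step recipe (the dimensions, the Gaussian test data, and the 1/sqrt(k) scaling convention are illustrative assumptions, not anything from the comment above):

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, d_orig, d_proj = 500, 1000, 50

# Placeholder data in the original high-dimensional space.
X = rng.standard_normal((n_points, d_orig))

# Step 1: sample the projection matrix from a normal distribution,
# scaled by 1/sqrt(d_proj) so squared distances are preserved in expectation.
R = rng.standard_normal((d_orig, d_proj)) / np.sqrt(d_proj)

# Step 2: one matrix multiplication -- that's the whole method.
X_proj = X @ R

# Sanity check: pairwise squared distances among the first 20 points
# survive with only modest distortion.
orig = np.sum((X[:20, None, :] - X[None, :20, :]) ** 2, axis=-1)
proj = np.sum((X_proj[:20, None, :] - X_proj[None, :20, :]) ** 2, axis=-1)
mask = ~np.eye(20, dtype=bool)
ratio = proj[mask] / orig[mask]
print(ratio.mean())  # close to 1: distances roughly preserved
```

(If you'd rather not roll your own, scikit-learn ships this as `sklearn.random_projection.GaussianRandomProjection`.)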
