item 15080855

How are PCA and SVD related?

139 points | celerity | 8 years ago | intoli.com

15 comments

[+] vcdimension|8 years ago|reply
I don't understand why people create these webpages just re-explaining stuff that can be read in a book, lecture notes (usually available freely online), or Wikipedia. It just adds more noise to the internet. Is it a kind of marketing thing to show their customers that they know what they are doing?
[+] howscrewedami|8 years ago|reply
There's value in explaining things in a different/more understandable way. Wikipedia articles and book chapters on statistics can be hard to understand.
[+] marcusshannon|8 years ago|reply
Yes, it's a form of marketing called inbound marketing. Create content that attracts people (blog posts) and then turn them into leads by getting them to put their email in for more info, etc.
[+] gabrielgoh|8 years ago|reply
6 word answer

PCA is the SVD of A'A

[+] lottin|8 years ago|reply
Actually it's the eigendecomposition of A'A and the SVD of A, is it not?
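This correction is easy to check numerically. A small numpy sketch (the random matrix and variable names here are purely illustrative): the eigenvalues of A'A are exactly the squared singular values of A.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))  # arbitrary rectangular matrix

# Eigendecomposition of A'A (symmetric, so eigvalsh applies; ascending order).
eigvals = np.linalg.eigvalsh(A.T @ A)

# Singular values of A itself (descending order).
sing = np.linalg.svd(A, compute_uv=False)

# The eigenvalues of A'A are the squared singular values of A.
assert np.allclose(eigvals[::-1], sing**2)
```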
[+] thanatropism|8 years ago|reply
PCA is a statistical model -- the simplest factor model there is. It deals with variances and covariances in datasets. It returns a transformed dataset that's linearly related to the original one, but whose first variable has the highest variance, and so on.

SVD is a matrix decomposition. It generalizes the idea of representing a linear transformation (with the same dimensions in domain and codomain) in the basis of its eigenvectors, which gives a diagonal matrix representation and a formula like A = VDV'.

SVD is like this, but for rectangular matrices, so you have two orthogonal matrices instead of one: A = UDV'.

That the SVD performs PCA, as noted in the algorithms, is a theorem, albeit a simple one, usually given as an exercise. But hey, even OLS regression can be computed with the SVD if you want to.
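The OLS remark can be sketched in numpy (illustrative; the data and names are made up): the pseudoinverse built from the SVD gives the same least-squares coefficients as a standard solver.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.01 * rng.standard_normal(50)

# OLS via the SVD: beta = V @ diag(1/s) @ U' @ y, i.e. the pseudoinverse of X.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
beta_svd = Vt.T @ ((U.T @ y) / s)

# Same answer as numpy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta_svd, beta_lstsq)
```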

[+] kiernanmcgowan|8 years ago|reply
I've always understood PCA as SVD on a whitened matrix. Is this too simplistic of a view to take wrt implementation?

https://en.m.wikipedia.org/wiki/Whitening_transformation
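One way to sketch SVD-based (PCA) whitening in numpy (illustrative only; the correlated test data and names are mine): project the centered data onto the right singular vectors and rescale each direction by its singular value, which is the same thing as rescaling U.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 3))  # correlated data
Xc = X - X.mean(axis=0)                                          # center first

# PCA whitening via the SVD of the centered data.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
X_white = Xc @ Vt.T / s * np.sqrt(len(X) - 1)
# Equivalently: X_white = np.sqrt(len(X) - 1) * U.

# Whitened data has identity sample covariance.
assert np.allclose(np.cov(X_white, rowvar=False), np.eye(3))
```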

[+] celerity|8 years ago|reply
I actually touch on the relation to whitening toward the bottom of the article. You can whiten your dataset using the left singular matrix U, which is directly related to the PCs. Thanks for reading!
[+] popcorncolonel|8 years ago|reply
The connection between these two has always been hazy to me. I often mixed up the two when talking about each of them independently.

This article was well-written, precise where it needed to be, and cleared up the confusion. Thanks for sharing!

[+] eggie5|8 years ago|reply
SVD is the decomposition of a matrix into its components.

PCA is the analysis of a set of eigenvectors. Eigenvectors can come from SVD components or a covariance matrix.

source: http://www.eggie5.com/107-svd-and-pca
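Both routes mentioned here can be checked against each other (an illustrative numpy sketch with random data): the eigenvectors of the sample covariance matrix match the right singular vectors of the centered data, up to sign.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((100, 4))
Xc = X - X.mean(axis=0)

# Route 1: eigenvectors of the sample covariance matrix (ascending eigenvalues).
cov = Xc.T @ Xc / (len(X) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)

# Route 2: right singular vectors of the centered data (descending singular values).
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Same variances along the principal axes...
assert np.allclose(eigvals[::-1], s**2 / (len(X) - 1))

# ...and the same axes, up to a sign flip per eigenvector.
for v_eig, v_svd in zip(eigvecs[:, ::-1].T, Vt):
    assert np.allclose(abs(v_eig @ v_svd), 1.0)
```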

[+] foxh0und|8 years ago|reply
Great article; the lecture comparing the two from Johns Hopkins, part of the Data Science Specialization on Coursera, also offers a great explanation.
[+] finknotal|8 years ago|reply
"Because vectors are typically written horizontally, we transpose the vectors to write them vertically". Is there a typo in this sentence or is to just too early in the morning for me to read this?
[+] zeapo|8 years ago|reply
No typo there. When we talk about vectors we mean "column vector". As it's easier to read horizontally (and takes less space in a paper), most of the time we write x^T = {a, b, c} rather than writing them in a column shape.
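In numpy terms, the convention reads like this (a small illustration; the values are arbitrary):

```python
import numpy as np

# A "vector" conventionally means a column vector: shape (3, 1).
x_col = np.array([[1.0], [2.0], [3.0]])

# Writing x^T = {1, 2, 3} inline corresponds to the row form.
x_row = x_col.T
assert x_row.shape == (1, 3)

# Transposing the row form recovers the column vector.
assert np.array_equal(x_row.T, x_col)
```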