One of the topics that could really be improved with a 3Blue1Brown treatment, but a diagram would do in a pinch - such as the one Wikipedia has [0]. Looks like a good writeup nonetheless.
The Wikipedia picture doesn't quite convey the message though - PCA is a variable reduction technique. The Wiki picture could reduce the data from 2 dimensions (as shown) to 1 dimension (distance along the axis that you'd intuitively pick). PCA will let you pick the axis with minimum loss of information.
PCA is most useful when the data is really generated in a low number of changing dimensions (say, plant species & soil nutrients in a controlled experiment) but the data collected is high dimension (length, weight, colour, smell, no. pests, flavour rating by a team of chefs). PCA will tell you that there are really 2 important variables and how to construct them from the observed data - but it won't tell you what they are.
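The scenario above can be sketched in a few lines of numpy: generate data from 2 latent variables, observe it through 6 measurements, and watch PCA report that only 2 components matter. The plant/measurement setup and all the numbers here are hypothetical illustration, not anything from a real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 200 plants described by 6 observed measurements,
# but really driven by only 2 latent variables (say, species and soil nutrients).
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))  # how the 2 factors show up in the 6 measurements
observed = latent @ mixing + 0.01 * rng.normal(size=(200, 6))  # tiny noise

# PCA: centre the data, then take the SVD (equivalent to eigendecomposing
# the sample covariance matrix).
centered = observed - observed.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

# Fraction of variance explained by each principal component.
explained = s**2 / np.sum(s**2)
print(np.round(explained, 4))
```

The first two entries of `explained` carry essentially all the variance, and the rows of `Vt` say how to build those two components from the observed columns - but, as the comment says, nothing tells you they "are" species and nutrients.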
> Positive Semi-Definite Matrix
Aka, it has a real square root. Except not related to the concept of square roots or reals in any obviously meaningful way. The concept annoys me, somehow. It is a pity we have to learn algebra before matrices are meaningful.
Caveat emptor, I'm bad at stats, corrections welcome.
Matrix square roots are related to square roots of real numbers: say that the matrix B is a square root of the matrix A if B*B = A. If A is positive semidefinite, then it admits a unique positive semidefinite square root in the same way that a positive real number admits a positive real square root.
So I would say that the notion of matrix square roots is as obviously related to the notion of real square roots as possible.
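The analogy is easy to check numerically: for a PSD matrix A = Q diag(w) Q^T with w >= 0, taking square roots of the eigenvalues gives the unique PSD matrix B with B @ B = A. A minimal sketch (the random 4x4 matrix is just an example):

```python
import numpy as np

rng = np.random.default_rng(1)

# Build a random positive semidefinite matrix A = C^T C.
C = rng.normal(size=(4, 4))
A = C.T @ C

# Eigendecompose: A = Q diag(w) Q^T with w >= 0, then take square roots of
# the eigenvalues to get the PSD square root B.
w, Q = np.linalg.eigh(A)
B = Q @ np.diag(np.sqrt(np.clip(w, 0, None))) @ Q.T

print(np.allclose(B @ B, A))                    # True: B really squares to A
print(np.all(np.linalg.eigvalsh(B) >= -1e-8))   # True: B is itself PSD
```

The `np.clip` guards against tiny negative eigenvalues from floating-point roundoff, which would otherwise produce NaNs under the square root.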
A perverse thought pertaining to matrix square roots: in [0] I see the notation M^(1/2). Could the idea be meaningfully extended to define, say, M^(0.7), similar to the way this is meaningful for real numbers?
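For PSD matrices it does extend: apply the power to the eigenvalues, i.e. M^t = Q diag(w^t) Q^T. A sketch, with a random 3x3 example and a hypothetical helper `psd_power`, checking that the powers compose like real powers:

```python
import numpy as np

rng = np.random.default_rng(2)
C = rng.normal(size=(3, 3))
M = C.T @ C  # positive semidefinite

def psd_power(M, t):
    """M^t for PSD M: apply the power t to the eigenvalues."""
    w, Q = np.linalg.eigh(M)
    return Q @ np.diag(np.clip(w, 0, None) ** t) @ Q.T

# Powers compose like real powers: M^0.7 @ M^0.3 == M.
print(np.allclose(psd_power(M, 0.7) @ psd_power(M, 0.3), M))  # True
```

This agrees with the M^(1/2) notation in [0] when t = 0.5; scipy also ships this idea as `scipy.linalg.fractional_matrix_power`.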
[0] https://en.wikipedia.org/wiki/File:GaussianScatterPCA.svg

[0] https://en.wikipedia.org/wiki/Definite_symmetric_matrix#Squa...

roenxi | 5 years ago
joppy | 5 years ago
MaxBarraclough | 5 years ago
throwaway4747l | 5 years ago