Can anyone give a brief mathematical derivation of how to calculate the principal components in PCA for a given covariance matrix, say $\begin{pmatrix} 5 & 2\\ 2 & 5 \end{pmatrix}$?
See the end of my post at https://stats.stackexchange.com/a/252043/919 for its application to PCA; then consider reading the initial part of it to learn several ways to compute the PCA of any $2\times 2$ covariance matrix. – whuber Mar 14 '22 at 14:30
1 Answer
What are principal components?
Principal components are the vectors such that the variance of the data along each vector's direction is maximized; in linear algebra terms, they form an orthonormal basis. These vectors are exactly the eigenvectors of the covariance matrix. So, when you want to calculate the principal components, you actually need to calculate the eigenvectors of the covariance matrix.
To calculate the eigenvectors you have two options:
Either you calculate the eigenvalues from the characteristic equation $\det(M - \lambda I) = 0$ and the corresponding eigenvectors by solving $Mv = \lambda v$, where $M$ is the matrix, $\lambda$ an eigenvalue, and $v$ an eigenvector (a worked example for the matrix in the question appears after this list),
Or you perform a singular value decomposition of $M$, writing $M = U D V^{T}$. Because a covariance matrix is symmetric and positive semi-definite, the columns of $U$ are its eigenvectors and the diagonal entries of the diagonal matrix $D$ are the corresponding eigenvalues.
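For the covariance matrix in the question, the first option works out in a few lines. The characteristic equation is
$$\det(M - \lambda I) = \det\begin{pmatrix} 5-\lambda & 2\\ 2 & 5-\lambda \end{pmatrix} = (5-\lambda)^2 - 4 = 0,$$
so $\lambda_1 = 7$ and $\lambda_2 = 3$. Solving $(M - 7I)v = 0$ gives $v_1 = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix}$, and $(M - 3I)v = 0$ gives $v_2 = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\-1\end{pmatrix}$: the first principal component carries variance $7$ and the second variance $3$.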
Each eigenvector is a principal component, and the eigenvector corresponding to the largest eigenvalue is the principal component with the most variance along its direction.
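If you want to check such a computation numerically, here is a minimal sketch using NumPy (the variable names are mine, not from the answer); `np.linalg.eigh` is NumPy's eigensolver for symmetric matrices:

```python
import numpy as np

M = np.array([[5.0, 2.0],
              [2.0, 5.0]])  # the covariance matrix from the question

# eigh is specialized for symmetric matrices: it returns real
# eigenvalues in ascending order and orthonormal eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(M)

# Reorder so the largest-variance component comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

print(eigenvalues)         # [7. 3.]
print(eigenvectors[:, 0])  # first PC, proportional to (1, 1)/sqrt(2)
print(eigenvectors[:, 1])  # second PC, proportional to (1, -1)/sqrt(2)
```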
Many numerical routines actually implement a third option: directly maximize the variance over unit vectors. For a $2\times 2$ covariance matrix this is elementary, because the unit vectors are parameterized by the circular functions $(\cos\theta, \sin\theta)$, leading to the problem of optimizing $$\begin{pmatrix}\cos\theta & \sin\theta\end{pmatrix}\begin{pmatrix}5 & 2\\ 2 & 5\end{pmatrix}\begin{pmatrix}\cos\theta\\ \sin\theta\end{pmatrix} = 5 + 2\sin 2\theta,$$ whose maxima and minima are well known to occur where $2\theta = \pi/2 + k\pi$ for integers $k$, yielding one maximum and one minimum. – whuber Mar 15 '22 at 21:41
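To make that concrete, here is a small numerical sketch of this third option (the grid scan and variable names are illustrative choices of mine, not part of the comment):

```python
import numpy as np

M = np.array([[5.0, 2.0],
              [2.0, 5.0]])

# Parameterize unit vectors by an angle and evaluate the variance
# v(theta)^T M v(theta) on a fine grid of directions in [0, pi).
thetas = np.linspace(0.0, np.pi, 10_000, endpoint=False)
units = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
variances = np.einsum('ij,jk,ik->i', units, M, units)

best = np.argmax(variances)
print(thetas[best])     # ~ pi/4, i.e. the direction (1, 1)/sqrt(2)
print(variances[best])  # ~ 7, matching 5 + 2*sin(2*theta) at its maximum
```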
Yeah, that's another option. But in case we have a matrix of dimensions larger than $2\times 2$, with this approach we can find only two principal components (the minimum- and maximum-variance ones). To find more than two principal components, we are going to need another approach. – Piyush Gupta Mar 15 '22 at 21:55
It always helps to answer the question that was asked. For the generalization, yes, one uses numerical optimization: but that optimization method is exactly what I am illustrating here. It produces all the PCs sequentially by maximizing the variance at each step, subject to the constraint that the vectors are orthogonal to the PCs that were previously found. In particular, one never finds PCs by starting with the characteristic polynomial except in small textbook problems; and the SVD method is best applied to the data matrix rather than the covariance matrix. – whuber Mar 16 '22 at 12:44
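One way to realize that sequential scheme is a projected power iteration; the sketch below is my own illustration of the idea (the function name, seed, and tolerances are arbitrary), not code from the comment:

```python
import numpy as np

def sequential_pcs(M, max_iter=1000, tol=1e-10):
    """Find all PCs of a symmetric PSD matrix by power iteration,
    projecting out previously found components so that each new
    direction maximizes variance subject to orthogonality."""
    n = M.shape[0]
    rng = np.random.default_rng(0)  # fixed seed, illustrative only
    components = []
    for _ in range(n):
        v = rng.standard_normal(n)
        for _ in range(max_iter):
            w = M @ v
            for c in components:    # enforce orthogonality to earlier PCs
                w -= (w @ c) * c
            w /= np.linalg.norm(w)
            if np.linalg.norm(w - v) < tol:
                break
            v = w
        components.append(v)
    return np.array(components)

M = np.array([[5.0, 2.0], [2.0, 5.0]])
print(sequential_pcs(M))  # rows approx (1,1)/sqrt(2) and (1,-1)/sqrt(2), up to sign
```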
Got your point, and thanks for introducing this approach. It would be great if you could provide some references so that I can learn this approach in more depth. – Piyush Gupta Mar 16 '22 at 13:00
This is described in numerical analysis textbooks. For an on-line introduction, you will find that the Wikipedia article on PCA ultimately directs you to the article on computing eigenvalues and eigenvectors (no surprise), so that would be a good start. – whuber Mar 16 '22 at 13:04