I have a data matrix $X$ that is $n \times m$, where $n$ is the number of features, $m$ is the number of samples, and $n < m$. Let the Singular Value Decomposition (SVD) of $X$ be $$ X = U \Sigma V^T $$
In conventional preprocessing for Independent Component Analysis (ICA), Principal Component Analysis (PCA) is applied to the covariance matrix $$ XX^T = U \Sigma V^T V \Sigma^T U^T = U \Sigma^2 U^T $$ The whitening matrix is then formed from the principal components and eigenvalues, $$ W = (\Sigma^2)^{-\frac{1}{2}} U^T $$ and the data is projected onto the whitening matrix, $$ WX $$ This projection is iteratively rotated by an orthogonal matrix until independent components are obtained, as described in *A Tutorial on Independent Component Analysis*.
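For concreteness, here is a minimal NumPy sketch of the preprocessing I mean. The random data, the dimensions, and the use of $XX^T$ without a $1/m$ factor are just illustrative assumptions on my part:

```python
import numpy as np

# Illustrative dimensions only: n features (rows) < m samples (columns).
rng = np.random.default_rng(0)
n, m = 5, 200
X = rng.standard_normal((n, m))

# PCA on X X^T: eigenvectors are U, eigenvalues are the squared singular values.
eigvals, eigvecs = np.linalg.eigh(X @ X.T)
order = np.argsort(eigvals)[::-1]          # eigh returns ascending order; flip to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Whitening matrix W = (Sigma^2)^(-1/2) U^T, then project the data.
W = np.diag(eigvals ** -0.5) @ eigvecs.T
Z = W @ X

# The projected data is white: Z Z^T is the identity (up to floating-point error).
print(np.allclose(Z @ Z.T, np.eye(n)))
```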
I was wondering why the principal components and eigenvalues are used to whiten the data rather than just using the right singular vectors directly. My initial thought was that rotating the whitened data $WX$ should be equivalent to rotating the right singular vectors $V^T$, as I believed the whitened data should be the same as the right singular vectors: $$ WX = (\Sigma^2)^{-\frac{1}{2}} U^T U \Sigma V^T = \Sigma^{-1} \Sigma V^T = V^T $$
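As a sanity check on this algebra, here is a small sketch where the same $U$, $\Sigma$, $V^T$ from a single SVD call are used both to build $W$ and for the comparison (again with placeholder random data); I would expect it to print `True` up to floating-point error:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 200
X = rng.standard_normal((n, m))

# Use the factors from one SVD call for both the whitening matrix and the comparison.
U, s, Vt = np.linalg.svd(X, full_matrices=False)   # U: n x n, s: (n,), Vt: n x m
W = np.diag(1.0 / s) @ U.T                         # (Sigma^2)^(-1/2) U^T = Sigma^(-1) U^T

# The algebra above predicts W X = V^T.
print(np.allclose(W @ X, Vt))
```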
My guess as to why this is not true in practice is that it is due to the estimation of the covariance matrix, or to the stability of the right singular vectors given $n < m$, but I am not sure. Additionally, I think it might be related to this post, Relationship between SVD and PCA. How to use SVD to perform PCA?, but I am not positive. Section 5.3 (Dual Principal Component Analysis) of *Elements of Dimensionality Reduction and Manifold Learning* also seems related, in that it shows how the projection can be obtained from a combination of the singular values and the right singular vectors.