I am using SVD/PCA for text mining purposes.
Given a normalized $|terms| \times |documents|$ matrix $M$, applying the SVD should let me reduce the dimensionality and keep only the most meaningful dimensions.
If I truncate the SVD to 2 components, $U_2$ and $V^T_2$ should contain the 2-dimensional spatial representations of the terms and documents. This should tell me which terms are closer to which documents.
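For concreteness, this is roughly what I am doing (a minimal sketch; the random $M$ below is just a placeholder for my normalized term-document matrix, not real data):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((500, 50))        # |terms| x |documents|, assumed normalized

# Full (thin) SVD: M = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Truncate to 2 components
U2  = U[:, :2]                   # one 2-D row per term
Vt2 = Vt[:2, :]                  # one 2-D column per document
s2  = s[:2]                      # the 2 largest singular values
```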

I've seen several examples where only $U$ is visualized, so I'm not sure that my idea of plotting documents is correct. That said, I've also seen that most PCA implementations return $U\cdot\Sigma$, which makes me wonder:
- Is this idea correct?
- Should I multiply $U$ and/or $V^T$ by $\Sigma$ before plotting? (see the sketch after this list)
- Why are some documents so distant from the words, since they surely contain at least one of them?
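To make the scaling question concrete, here is a sketch of the convention I have seen most often for biplots, continuing from the snippet above. The symmetric $\sqrt{\Sigma}$ split is an assumption on my part; scaling only one side (e.g. $U\cdot\Sigma$, as PCA implementations seem to do) is the alternative I am asking about:

```python
# Continues from the previous snippet (uses U2, Vt2, s2).
import numpy as np
import matplotlib.pyplot as plt

# One common biplot convention: split Sigma symmetrically between
# terms and documents so neither side absorbs all the variance.
term_xy = U2 * np.sqrt(s2)       # terms:     rows of U_2, scaled
doc_xy  = Vt2.T * np.sqrt(s2)    # documents: columns of V^T_2, scaled

fig, ax = plt.subplots()
ax.scatter(term_xy[:, 0], term_xy[:, 1], s=8, label="terms")
ax.scatter(doc_xy[:, 0], doc_xy[:, 1], marker="x", label="documents")
ax.legend()
plt.show()
```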