Why not scale before PCA as a default step?

Question

In ISLR, 2nd edition, the authors say that you may not want to scale before PCA if the features are all in the same units (below). However, I don't see the nuance. Why not make scaling everything to mean = 0, SD = 1 the "default" step, even if everything is in the same units and the scaling will ultimately have no effect?

Is it because of reduced interpretability? Or because compute power and time are wasted on scaling?

I'm just thinking in terms of having one process that all data goes through, instead of two different processes for "if data needs scaling" vs. "if data doesn't need scaling".

Because it's standard practice...? Why does the doctor need to wear gloves before a surgery if they have already washed their hands? — Katsu, Jan 23 '23 at 21:41
One good reason to scale may be to avoid numerical issues. Let's say you work in banking, so values of 10e6 are common, and let's say you have 100 features. A single covariance entry can be as large as 1e13, and you have 1e4 of them, so you will start running into numerical precision issues (float64 carries roughly 16 significant figures), e.g. your covariance matrix may be ill-conditioned. Scaling sets a strict bound on the trace of the covariance matrix, for example: after standardizing, the trace equals the number of features. — Cryo, Jan 25 '23 at 16:05
So it sounds like we are in agreement on always scaling, even if they're all in the same units? Even though the screenshot above from ISLR said not to scale the genes that have the same units, the authors themselves still scaled the data in the Lab section when they were working with the genetic dataset lol — Katsu, Jan 25 '23 at 18:25
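
To make the trace point from Cryo's comment concrete, here is a minimal sketch in plain Python. It is not taken from ISLR or from any commenter. The two features `big` and `small` are made-up data in the same units but with very different spreads, and the helper names (`covariance_matrix`, `standardize`, `top_eigen_share`) are hypothetical. It compares the trace of the covariance matrix, and the share of total variance carried by the first principal component, before and after standardizing.

```python
import math
import random


def covariance_matrix(columns):
    """Sample covariance matrix (n - 1 denominator) of equal-length columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[x - m for x in c] for c, m in zip(columns, means)]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
        for ci in centered
    ]


def standardize(column):
    """Rescale one column to mean 0 and sample standard deviation 1."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]


def top_eigen_share(m):
    """Largest eigenvalue of a symmetric 2x2 matrix, divided by its trace."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    largest = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return largest / (a + d)


random.seed(0)
# Assumption: two independent features, same units, very different spreads.
big = [random.gauss(1e7, 2e6) for _ in range(500)]
small = [random.gauss(1e7, 2e4) for _ in range(500)]

raw_cov = covariance_matrix([big, small])
scaled_cov = covariance_matrix([standardize(big), standardize(small)])

# Trace = total variance; after standardizing it equals the feature count.
raw_trace = raw_cov[0][0] + raw_cov[1][1]
scaled_trace = scaled_cov[0][0] + scaled_cov[1][1]

# Share of total variance explained by the first principal component.
raw_share = top_eigen_share(raw_cov)
scaled_share = top_eigen_share(scaled_cov)

print(f"trace, raw:        {raw_trace:.3e}")
print(f"trace, scaled:     {scaled_trace:.3f}")
print(f"PC1 share, raw:    {raw_share:.4f}")
print(f"PC1 share, scaled: {scaled_share:.4f}")
```

In this toy setup the unscaled first component is driven almost entirely by the high-variance feature, while after standardizing both features contribute about equally. So scaling is not a no-op even when the units match. Whether the raw variances carry meaning you want PCA to respect is the judgment call that the ISLR passage seems to be pointing at.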
