I was fitting a linear regression (OLS) model and tried scaling the predictor variables. I could see the range of the variables change; however, the prediction results remained the same. I would like to understand why scaling changes the coefficients but not the prediction output. In addition, the accuracy and model evaluation metrics remain the same before and after scaling.
-
What do you mean by accuracy and model prediction parameters? – Dave May 09 '23 at 08:17
-
Accuracy as in 'R2' and model evaluation metrics such as RMSE etc – Patrick Priyadharshan May 09 '23 at 09:07
-
If the predictions do not change when you scale, shouldn’t all of those remain the same? – Dave May 09 '23 at 10:01
1 Answer
This is the result you should expect, because linear regression is scale-invariant: scaling the features changes the estimated coefficients, but not the fitted values or predictions. This is easy to see; take a trivial linear regression model
$$ y = \beta_0 + \beta_1 x + \varepsilon $$
Now, if you scale $x$ by dividing it by some constant $c$, then to recover exactly the same (optimal) fit as before, $\beta_1$ just needs to become $c$ times larger, so that $(\beta_1 c)(x/c) = \beta_1 x$. The parameter estimates adapt to scaling by increasing or decreasing accordingly.
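You can check this numerically. A minimal sketch with NumPy (the data, the scaling constant `c`, and the coefficient values are made up for illustration): fit OLS on the raw and on the scaled predictor, and observe that the slope absorbs the scaling while the predictions are unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=n)

c = 10.0  # hypothetical scaling constant

# Design matrices with an intercept column, raw vs scaled predictor
X = np.column_stack([np.ones(n), x])
X_scaled = np.column_stack([np.ones(n), x / c])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_s, *_ = np.linalg.lstsq(X_scaled, y, rcond=None)

# The slope absorbs the scaling: beta_s[1] = c * beta[1]
print(np.isclose(beta_s[1], c * beta[1]))        # True
# The fitted values are identical
print(np.allclose(X @ beta, X_scaled @ beta_s))  # True
```

Since the predictions are identical, any metric computed from them ($R^2$, RMSE, etc.) is identical as well.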
In fact, this is the case for many machine learning algorithms, as you can learn from threads like Which machine learning algorithms get affected by feature scaling? or other questions tagged feature-scaling. Scaling would make a difference, though, if you used regularized regression, an algorithm other than OLS to obtain the estimates, random-effects regression, etc.
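To see the contrast with regularized regression, here is a sketch of closed-form ridge regression, $\hat\beta = (X^\top X + \lambda I)^{-1} X^\top y$, on the same kind of toy data (the data, penalty `lam`, and scaling constant `c` are made up; for simplicity the intercept is penalized too). Because the penalty depends on the magnitude of the coefficients, the scaled slope cannot simply grow by a factor of $c$ without changing the penalized objective, so the predictions change:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=n)

c = 10.0   # hypothetical scaling constant
lam = 1.0  # hypothetical ridge penalty

def ridge_fit(X, y, lam):
    # Closed-form ridge solution: (X'X + lam*I)^{-1} X'y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

X = np.column_stack([np.ones(n), x])
X_scaled = np.column_stack([np.ones(n), x / c])

b = ridge_fit(X, y, lam)
b_s = ridge_fit(X_scaled, y, lam)

# Unlike OLS, ridge predictions differ after scaling the predictor
print(np.allclose(X @ b, X_scaled @ b_s))  # False
```

This is why standardizing features first is the usual advice for ridge, lasso, and similar penalized methods, while for plain OLS it only changes the units of the coefficients.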