
Suppose I have two variables in a model, plus their interaction, like this:

lmer(response~x1+x2+x1*x2+(1|time), data=db) 

If x1 has a very large scale (like a city population, for example), I'm probably going to need to scale/center the variable. I know that if x1 and x2 are both continuous, I can scale (or center) all predictors and use scale(x1*x2) in the interaction term. But what if x2 is a categorical variable? Is it correct to use scale(x1)*x2? And how can I unscale it in both cases (one categorical and one continuous, and two continuous)?
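To make the setting concrete, here is a small sketch of the categorical case with simulated data; lm() stands in for lmer() just to keep it self-contained, and all variable names and values are invented:

```r
set.seed(1)
db <- data.frame(
  x1 = rnorm(100, mean = 5e5, sd = 1e5),              # large-scale predictor
  x2 = factor(sample(c("a", "b"), 100, replace = TRUE))
)
db$response <- 2e-6 * db$x1 + (db$x2 == "b") + rnorm(100)

# Scale only the continuous predictor; the factor enters unchanged:
fit <- lm(response ~ scale(x1) * x2, data = db)
names(coef(fit))  # "(Intercept)" "scale(x1)" "x2b" "scale(x1):x2b"
```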

Urb
MKie45
  • Why not simply use response ~ x1 * x2 + (1 | time) as the formula? Note that * does not denote ordinary multiplication but follows the Wilkinson-Rogers notation. For a factor, scaling is meaningless. – Yves Apr 19 '21 at 12:18
  • @Yves because it's important in the analysis to know the effect of the interaction, so I can't drop it without appropriate model diagnostics. The problem is that response and x1 have very different scales, and I got warnings like: Some predictor variables are on very different scales: consider rescaling, or for count models in glmmTMB, Model convergence problem: eigenvalue problems. So the first option is to scale, but not the factor variable, of course. – MKie45 Apr 19 '21 at 14:06
  • You get the same model with "my" formula because x1 * x2 is the same as x1 + x2 + x1:x2, where the third term is the interaction. Yes, it can be useful to scale the continuous variables, but you do not have to scale the interactions manually; they are handled automatically when the design matrices are built. – Yves Apr 19 '21 at 15:35
  • So, is it correct to use scale(x1)+x2+x1:x2 instead of scale(x1)+x2+scale(x1):x2? Or am I misunderstanding you? – MKie45 Apr 20 '21 at 12:43
  • 2
    Both for code clarity and efficiency I would scale the variables in the data frame df2 <- within(df1, x1 <- scale(x1)) then use the formula with no scale but the modified data frame. By chosing a round value for the scale as 10, 100, or 0.1, 0.01 the intepretation will remain simple. You can use x1 * x2 as the r.h.s. of the formula: this is the same thing as x1 + x2 + x1:x2. – Yves Apr 20 '21 at 13:17
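A sketch of the approach from the comment above (rescale in the data frame, then keep the formula clean); the df1/df2 names follow the comment, the data are invented, and lm() stands in for lmer():

```r
df1 <- data.frame(
  x1 = c(1e5, 2e5, 3e5, 4e5, 5e5, 6e5),            # e.g. raw populations
  x2 = factor(c("a", "a", "a", "b", "b", "b")),
  response = c(1.1, 2.0, 3.2, 4.1, 5.3, 5.9)
)
# A round scale factor keeps the coefficient interpretation simple:
df2 <- within(df1, x1 <- x1 / 1e5)

fit <- lm(response ~ x1 * x2, data = df2)  # x1 * x2 expands to x1 + x2 + x1:x2
names(coef(fit))
```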

2 Answers


Scaling predictors is necessary if (1) you need to have predictors on comparable scales, as with principal-component analysis or penalized methods like ridge regression or LASSO or (2) you will run into numerical problems as can occur with exponentiated predictor values in survival analysis. You don't seem to be in either situation.

You can bring regression coefficients to more convenient magnitudes by rescaling a continuous predictor. For example, if you express a city population as a predictor in terms of millions of inhabitants then the magnitude of its regression coefficient will be $10^{6}$ times what it would be if you used the raw population count. But so long as you are consistent the ultimate results will be the same either way.
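This relationship is easy to check with a toy lm() fit (simulated data; the factor of $10^{6}$ holds exactly up to floating point):

```r
set.seed(2)
pop <- runif(50, 1e5, 5e6)       # raw city populations
y   <- 2e-6 * pop + rnorm(50)

b_raw      <- coef(lm(y ~ pop))[2]
b_millions <- coef(lm(y ~ I(pop / 1e6)))[2]  # population in millions

# Same fit, rescaled slope: b_millions == b_raw * 1e6
all.equal(unname(b_millions), unname(b_raw) * 1e6)
```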

If you also center such a predictor that's involved in an interaction, be careful in interpreting your results as that will change the intercept and the apparent "main effects" of the predictors with which it interacts. Those are typically evaluated for situations when all predictors are at 0 or reference levels, so centering a predictor can change those other coefficients. Again, the ultimate results are the same provided that you are consistent.
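A small simulated check of this point: centering the continuous predictor changes the factor's coefficient (it becomes the effect of the factor at the mean of x1) but leaves the fitted values untouched. Names here are invented and lm() stands in for a mixed model:

```r
set.seed(3)
x1 <- rnorm(40, 100, 10)
x2 <- factor(rep(c("a", "b"), each = 20))
y  <- 0.5 * x1 + 2 * (x2 == "b") + 0.1 * x1 * (x2 == "b") + rnorm(40)

f_raw <- lm(y ~ x1 * x2)
f_ctr <- lm(y ~ scale(x1, scale = FALSE) * x2)  # centered, not standardized

# The x2 coefficient shifts by the interaction times the mean of x1:
# coef(f_ctr)["x2b"] == coef(f_raw)["x2b"] + coef(f_raw)["x1:x2b"] * mean(x1)
# Fitted values (the "ultimate results") are unchanged:
all.equal(fitted(f_raw), fitted(f_ctr))
```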

There is no one-size-fits-all way to normalize categorical predictors. As you aren't using penalization you don't need to consider that for this application.

EdM

I would suggest scaling all of the continuous variables in your model, or none of them. And yes, you can scale just x1 when x2 is a categorical variable. The model then becomes response ~ scale(x1) + x2 + scale(x1):x2, which fits a separate slope for scale(x1) at each level of x2, i.e. an interaction term for every non-reference category. If you want to unscale a variable, multiply by its standard deviation and then add back its mean, since scale() standardizes to mean 0 and sd 1; correspondingly, a coefficient estimated on the scaled variable is divided by the standard deviation to return it to the original units. I hope I could help :)
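As a concrete check of the back-transformation (simulated data, lm() standing in for a mixed model): a slope fitted on scale(x) returns to the original units of x when divided by sd(x):

```r
set.seed(4)
x <- rnorm(60, 50, 8)
y <- 3 * x + rnorm(60)

b_scaled   <- coef(lm(y ~ scale(x)))[2]   # slope in standardized units
b_unscaled <- b_scaled / sd(x)            # back to the original units of x

# Matches the slope from fitting on the unscaled predictor:
all.equal(unname(b_unscaled), unname(coef(lm(y ~ x))[2]))
```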

Elias