
I know similar questions have been asked before. However, none of the existing answers help with my problem:

I have a GLS model $y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \epsilon$. The VIF values for $X_1$ and the interaction term $X_1 X_2$ are 413.67 and 414.08, respectively; the VIF for $X_2$ is only 1.04. Furthermore, the estimated coefficients for $X_1$ and the interaction $X_1 X_2$ are very large but not statistically significant.

Many websites say that one does not have to worry about multicollinearity between a variable and its interaction term. But could it here be inflating the p-values? Furthermore, it seems strange that the VIF values for $X_1$ and the interaction term are so close to each other.

Help is much appreciated. Thanks in advance.

Dave

2 Answers


I think that you might be getting misled by the p-values of the individual coefficients for X1 and the interaction term. Multicollinearity often isn't a big problem, particularly for predictive models; it just makes it hard to get precise estimates of individual coefficients.

When two predictors are highly correlated, the standard errors of their individual coefficients can be very large--that's the problem with multicollinearity. There will, however, typically be a compensating negative correlation between their coefficient estimates. There's a simple example here. That's not typically reported in standard model reports, but the coefficient variance-covariance matrix is an important component of model results.
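As a minimal sketch of inspecting that in R (assuming a fitted model object fit, e.g. from nlme::gls; the name is hypothetical), you can convert the coefficient covariance matrix into a correlation matrix:

cov2cor(vcov(fit))  # correlations among the coefficient estimates

With highly correlated predictors you will typically see a strong negative off-diagonal entry between the X1 and X1:X2 coefficients.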

If you do a test that evaluates the overall association of X1 with outcome, like a Wald "chunk" test on all coefficients involving it or a likelihood-ratio test between your model and one that completely omits X1 and its interactions, then you could still get a highly significant result. Similarly, for predictions from a model, the covariances among coefficient estimates can help correct for the high variances of the individual coefficient estimates and lead to reliable predictions.
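As a rough sketch (assuming your data are in a data frame d with columns y, X1, and X2; the names are hypothetical), a likelihood-ratio test of the whole X1 "chunk" with nlme::gls could look like this:

library(nlme)

# both models must be fit by maximum likelihood (not the REML default)
# for the likelihood-ratio comparison to be valid
fit_full    <- gls(y ~ X1 * X2, data = d, method = "ML")
fit_reduced <- gls(y ~ X2, data = d, method = "ML")  # omits X1 and X1:X2

anova(fit_reduced, fit_full)  # likelihood-ratio test of X1 and X1:X2 jointly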

Furthermore, with an interaction term, the individual coefficient for X1 is arbitrary: it depends on the centering of the X2 variable with which it interacts. When a predictor is involved in an interaction, it thus can be misleading to try to interpret the p-value of its individual coefficient.

EdM

The advice you've heard that one does not have to consider multicollinearity in interactions probably comes from the fact that mean-centering your variables prior to computing the interaction term will sometimes (but not always) reduce multicollinearity between the interacting variables and the interaction term (assuming variables are symmetric). A common reason why centering fails to remove multicollinearity is when variables are skewed or have a high kurtosis. If you have not mean-centered your interaction terms, seeing some multicollinearity is not surprising. If you have, my guess would be that your $X_1$ is skewed.
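As a minimal sketch of that diagnosis (assuming a data frame d with raw, uncentered x1 and x2; the names are hypothetical), you can compare VIFs before and after mean-centering with car::vif:

library(car)

d$x1x2 <- d$x1 * d$x2                      # interaction from the raw variables
vif(lm(y ~ x1 + x2 + x1x2, data = d))      # VIFs can be huge if x1, x2 sit far from 0

d$x1c <- d$x1 - mean(d$x1)                 # mean-center first...
d$x2c <- d$x2 - mean(d$x2)
d$x1cx2c <- d$x1c * d$x2c                  # ...then form the interaction
vif(lm(y ~ x1c + x2c + x1cx2c, data = d))  # typically much smaller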

Edit: To answer the second part - centering will not influence the p-value of the interaction or its effect size, but it does of course change the effect sizes and p-values of the main effects. So it depends on what you want to learn from the regression.

Code:

# simulate a data set with an interaction
d1 <- InteractionPoweR::generate_interaction(N = 100,
                                             r.x1.y = .2,
                                             r.x2.y = .3,
                                             r.x1x2.y = .2,
                                             r.x1.x2 = .1)

# shift the (mean-centered) simulated variables away from zero,
# then build the interaction term from those un-centered versions
d1$x1b <- d1$x1 + 1
d1$x2b <- d1$x2 + 1
d1$x1x2b <- d1$x1b * d1$x2b

# centered main effects
summary(lm(y ~ x1 + x2 + x1x2, data = d1))

Call:
lm(formula = y ~ x1 + x2 + x1x2, data = d1)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.40212 -0.57251  0.03613  0.69612  1.81432 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept) -0.02239    0.09449  -0.237   0.8132  
x1           0.22037    0.09492   2.322   0.0224 *
x2           0.18405    0.09527   1.932   0.0563 .
x1x2         0.22620    0.10012   2.259   0.0261 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9397 on 96 degrees of freedom
Multiple R-squared:  0.1438,    Adjusted R-squared:  0.117 
F-statistic: 5.374 on 3 and 96 DF,  p-value: 0.001833


# un-centered main effects
summary(lm(y ~ x1b + x2b + x1x2b, data = d1))

Call:
lm(formula = y ~ x1b + x2b + x1x2b, data = d1)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.40212 -0.57251  0.03613  0.69612  1.81432 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept) -0.200609   0.186233  -1.077   0.2841  
x1b         -0.005835   0.137800  -0.042   0.9663  
x2b         -0.042150   0.144004  -0.293   0.7704  
x1x2b        0.226200   0.100116   2.259   0.0261 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9397 on 96 degrees of freedom
Multiple R-squared:  0.1438,    Adjusted R-squared:  0.117 
F-statistic: 5.374 on 3 and 96 DF,  p-value: 0.001833

David B
  • According to https://statisticalhorizons.com/multicollinearity/ centering does not affect the results when measuring interaction effects. Thus, I didn't center the variables. The data should not have high kurtosis since I winsorized the data first. – Dave Feb 27 '23 at 17:13
  • @Dave see the edits. Centering doesn't affect the p-value or effect size of the interaction, but it does affect the p-values and effect sizes of the main effects in the context of an interaction (because of multicollinearity). So, it depends what you want to learn. And it seems that your high VIF is because you didn't center. – David B Feb 27 '23 at 17:35
  • Thank you for your answer. I mainly want to find out whether there is a significant interaction effect and whether it is positive or negative. Should I then just center the interaction effect and X1? Or, if I only center X1, should I use the centered X1 or the original X1 when forming the interaction term? – Dave Feb 28 '23 at 10:12
  • Centering won't influence whether or not the interaction is significant or its effect size. That being said, it can be easier to interpret an interaction when X1 and X2 are centered first (typically you don't center the interaction itself), as you can see from the example I gave in the answer. Otherwise, you will have to run the regression without the interaction as well, and reference those effect sizes when interpreting the interaction. – David B Feb 28 '23 at 14:56
  • Dave, centering before computing the interaction has profound effects on multicollinearity. Try it: compute your multicollinearity diagnostic values before and after standardizing variables that originally are located far from zero. See https://stats.stackexchange.com/a/34523/919 for the analysis and https://stats.stackexchange.com/a/384636/919 or https://stats.stackexchange.com/a/401678/919 for examples. – whuber Feb 28 '23 at 15:06