
I'm using the ols_eigen_cindex function to assess multicollinearity, and I get the following variance proportions:

model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)

ols_eigen_cindex(model)

   Eigenvalue Condition Index   intercept        disp          hp           wt         qsec
1 4.721487187        1.000000 0.000123237 0.001132468 0.001413094 0.0005253393 0.0001277169
2 0.216562203        4.669260 0.002617424 0.036811051 0.027751289 0.0002096014 0.0046789491
3 0.050416837        9.677242 0.001656551 0.120881424 0.392366164 0.0377028008 0.0001952599
4 0.010104757       21.616057 0.025805998 0.777260487 0.059594623 0.7017528428 0.0024577686
5 0.001429017       57.480524 0.969796790 0.063914571 0.518874831 0.2598094157 0.9925403056

What does it mean to have a high variance proportion for both the intercept and qsec in dimension 5? Is it a problem? Or should I only look for high values among the predictors, excluding the intercept?

locus
  • Many people forget to center their variables before analyzing multicollinearity. Unless you perform that preliminary step, the eigenvalues are virtually meaningless. See Belsley, Kuh, & Welsch, Regression Diagnostics, for a full account of this. – whuber Nov 13 '23 at 13:16

1 Answer


This situation is often caused by a variable whose values are far from 0, and centering that variable before the analysis usually alleviates it. This works well in your case:

install.packages("olsrr")  # only needed once
library(olsrr)

# Original model; qsec is far from 0 (check its mean)
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
mean(mtcars$qsec)

ols_eigen_cindex(model)

# Center qsec, refit, and rerun the diagnostics
mtcars$qseccent <- mtcars$qsec - mean(mtcars$qsec)
model2 <- lm(mpg ~ disp + hp + wt + qseccent, data = mtcars)
ols_eigen_cindex(model2)

Now the highest condition index (CI) is 21.68, with no high variance proportions.
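To illustrate why a variable far from 0 interacts with the intercept, here is a minimal base-R sketch. It computes the cosine of the angle between a predictor column and the constant column of the design matrix. The helper name `cos_with_intercept` is hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: cosine of the angle between a predictor column
# and the constant (intercept) column of the design matrix.
# Values near 1 mean the two columns point in almost the same direction.
cos_with_intercept <- function(x) {
  ones <- rep(1, length(x))
  sum(x * ones) / (sqrt(sum(x^2)) * sqrt(sum(ones^2)))
}

qsec <- mtcars$qsec

# Uncentered: the mean of qsec is large relative to its spread,
# so the column is nearly parallel to the intercept column.
cos_with_intercept(qsec)

# Centered: the column is orthogonal to the intercept column
# (the cosine is zero up to floating-point error).
cos_with_intercept(qsec - mean(qsec))
```

A cosine close to 1 is what lets the intercept and qsec share a single near-degenerate dimension. Centering removes that particular dependency, but it does nothing for dependencies among the predictors themselves.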

Whether this is necessary is another matter.
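For readers who want to see where the numbers in such a table come from, the following is a sketch of the usual Belsley, Kuh, and Welsch computation in base R. It assumes the design-matrix columns are scaled to unit length without centering. That is an assumption about what the function does, not something taken from its source, although it is consistent with the eigenvalues in the question summing to 5, the number of columns. All variable names are illustrative.

```r
# Sketch of condition indices and variance-decomposition proportions by hand.
# Assumption: columns are scaled to unit length and NOT centered.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)

X  <- model.matrix(model)                     # design matrix, intercept included
Xs <- sweep(X, 2, sqrt(colSums(X^2)), "/")    # scale each column to unit length

sv <- svd(Xs)                                 # singular value decomposition of Xs
eigenvalue <- sv$d^2                          # eigenvalues of t(Xs) %*% Xs
cond_index <- max(sv$d) / sv$d                # condition index of each dimension

# phi[j, k] = v_jk^2 / d_k^2 for coefficient j and dimension k.
# Dividing each row by its sum gives the share of coefficient j's
# variance that is associated with dimension k.
phi <- sweep(sv$v^2, 2, sv$d^2, "/")
var_prop <- t(phi / rowSums(phi))             # rows = dimensions, columns = coefficients
colnames(var_prop) <- colnames(X)

round(cbind(eigenvalue, cond_index, var_prop), 4)
```

Each column of `var_prop` sums to 1, so a row with two or more large entries flags the coefficients involved in the same near-dependency. The intercept is treated like any other column here, which is why it can appear alongside an uncentered predictor.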

Peter Flom
  • Thanks @PeterFlom, that's helpful to know. However, let's say all variables are centered, and both the Intercept and another variable are still showing a high variance proportion. What would the interpretation be? Should I only consider high variance proportions for variables, excluding the intercept? Or will I need to perhaps remove the variable that has a high variance proportion (removing the intercept would probably be a bad idea right?)? – locus Dec 07 '23 at 00:27
  • I don't think that is possible (i.e., to have centered variables and still have both the intercept and a variable showing high variance proportions). Have you had this happen? – Peter Flom Dec 07 '23 at 10:53
  • No, not really. Just wondering what I could do if that were the case. Thanks for your answer! – locus Dec 07 '23 at 22:46