I am running vif(model) to identify multicollinearity among the predictor variables in my model. Is there a rule of thumb for the GVIF value that indicates the presence of multicollinearity?
1 Answer
John Fox discusses the variance inflation factor (VIF) and the generalized variance inflation factor (GVIF) in his book (2016). He does not give an explicit rule of thumb, but argues that
... the linear relationship among $X$s must be very strong before collinearity seriously impairs the precision of estimation: It is not until $R_j$ approaches .9 that the precision of estimation is halved (p. 342)
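Since $\text{VIF} = 1/(1-R_j^2)$, where $R_j$ is the multiple correlation from regressing $X_j$ on the other predictors, a cut-off on $R_j$ translates directly into a cut-off on VIF; for example, $R_j = .9$ gives $\text{VIF} \approx 5.26$. There are other things to consider when choosing a cut-off, though (see my answer to a related question).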
Now, Fox recommends using $\sqrt{\text{VIF}}$ instead of VIF, because "the precision of estimation of $B_j$ [slope coefficient] is most naturally expressed as the width of the confidence interval for this parameter, and because the width of the confidence interval is proportional to the standard deviation of $B_j$ (not its variance)" (p. 342).
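To make the relationship concrete, here is a minimal Python sketch that converts $R_j$ into VIF and $\sqrt{\text{VIF}}$. The function name `vif_from_r` and the sample values of $R_j$ are my own choices for illustration; they are not taken from Fox.

```python
import math


def vif_from_r(r_j: float) -> float:
    """VIF for predictor j, given R_j (the multiple correlation of X_j
    with the other predictors): VIF = 1 / (1 - R_j^2)."""
    if not 0.0 <= r_j < 1.0:
        raise ValueError("r_j must lie in [0, 1)")
    return 1.0 / (1.0 - r_j ** 2)


# Illustrative values of R_j (arbitrary, chosen only to show the trend).
for r_j in (0.5, 0.8, 0.9, 0.95):
    vif = vif_from_r(r_j)
    # sqrt(VIF) is the factor by which the confidence interval widens.
    print(f"R_j = {r_j:.2f}   VIF = {vif:5.2f}   sqrt(VIF) = {math.sqrt(vif):.2f}")
```

At $R_j = .9$ this gives $\sqrt{\text{VIF}} \approx 2.29$, so the confidence interval is a little more than twice as wide as it would be with uncorrelated predictors, which agrees with the passage quoted from Fox.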
A few pages later, Fox discusses GVIF and suggests reporting $\text{GVIF}^\frac{1}{2df}$, where df is the number of coefficients in the term (the set of related regressors, such as the dummy variables for one factor). This is analogous to reporting $\sqrt{\text{VIF}}$ (p. 358) and makes GVIF comparable across dimensions (p. 460). By the way, Fox has also visited Cross Validated to answer a question on GVIF.
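The adjustment itself is a one-line computation. The sketch below assumes you already have a GVIF and its df for each term; the helper name `adjusted_gvif` and the two GVIF values are hypothetical, made up only to illustrate the arithmetic.

```python
def adjusted_gvif(gvif: float, df: int) -> float:
    """Return GVIF^(1/(2*df)); with df = 1 this reduces to sqrt(VIF)."""
    if gvif <= 0:
        raise ValueError("gvif must be positive")
    if df < 1:
        raise ValueError("df must be at least 1")
    return gvif ** (1.0 / (2 * df))


# Hypothetical terms: (name, GVIF, df). The numbers are invented.
terms = [
    ("numeric predictor", 4.0, 1),    # one coefficient
    ("factor with 4 levels", 8.0, 3),  # three dummy coefficients
]

for name, gvif, df in terms:
    print(f"{name}: GVIF = {gvif}, df = {df}, "
          f"GVIF^(1/(2 df)) = {adjusted_gvif(gvif, df):.3f}")
```

With these invented numbers the factor has the larger raw GVIF (8 versus 4) but the smaller adjusted value (about 1.41 versus 2.00), which shows why raw GVIFs should not be compared across terms with different df.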
So I would argue that whatever cut-off value you choose for $\sqrt{\text{VIF}}$ also applies to $\text{GVIF}^\frac{1}{2df}$. Whether that cut-off value is appropriate is, again, open to discussion.
Fox, John. 2016. Applied Regression Analysis and Generalized Linear Models. 3rd ed. Los Angeles: Sage Publications.