Wikipedia says:

Multicollinearity does not reduce the predictive power or reliability of the model 
as a whole, at least within the sample data set; it only affects calculations 
regarding individual predictors.

Does the phrase "Multicollinearity does not reduce the predictive power" mean that multicollinearity does not change the predictive power when one knows the regression coefficients?

On the same page it is written:

So long as the underlying specification is correct, multicollinearity does not 
actually bias results; it just produces large standard errors in the related 
independent variables. More importantly, the usual use of regression is to take 
coefficients from the model and then apply them to other data.  Since 
multicollinearity causes imprecise estimates of coefficient values, the
resulting out-of-sample predictions will also be imprecise. 

First, these statements seem contradictory to me. How can it say that multicollinearity does not reduce predictive power and, at the same time, that it produces large errors in the coefficients, which lead to overfitting?

Second, what does "large standard errors in the related independent variables" mean? Does it refer to the standard errors of the coefficients?
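
A small simulation may make the two quoted statements concrete. This is only an illustrative sketch, not something taken from the Wikipedia article: the data-generating model, the correlation values, the sample size, and the helper names `fit_ols`, `simulate`, and `mean_pred_se` are all assumptions chosen for the example. It fits ordinary least squares with two predictors whose correlation is either 0 or 0.999, then compares the in-sample fit, the coefficient standard errors, and the standard error of the predicted mean at two new points.

```python
import numpy as np


def fit_ols(X, y):
    """Return OLS coefficients, their standard errors, in-sample R^2, and covariance."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)          # unbiased residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # estimated covariance of the coefficients
    se = np.sqrt(np.diag(cov))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, se, r2, cov


def simulate(rho, n=200, seed=0):
    """Simulate y = 1 + 2*x1 + 3*x2 + noise, with corr(x1, x2) close to rho (assumed model)."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(n)
    z = rng.standard_normal(n)
    x2 = rho * x1 + np.sqrt(1.0 - rho**2) * z
    y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.standard_normal(n)
    X = np.column_stack([np.ones(n), x1, x2])  # intercept, x1, x2
    return fit_ols(X, y)


def mean_pred_se(cov, x0):
    """Standard error of the predicted mean at the design point x0."""
    return float(np.sqrt(x0 @ cov @ x0))


beta_lo, se_lo, r2_lo, cov_lo = simulate(rho=0.0)
beta_hi, se_hi, r2_hi, cov_hi = simulate(rho=0.999)

# A new point that follows the collinear pattern (x1 = x2) and one that breaks it.
on_pattern = np.array([1.0, 1.0, 1.0])
off_pattern = np.array([1.0, 1.0, -1.0])
se_on = mean_pred_se(cov_hi, on_pattern)
se_off = mean_pred_se(cov_hi, off_pattern)

print("R^2 (rho=0, rho=0.999):", r2_lo, r2_hi)
print("SE of x1 coefficient (rho=0, rho=0.999):", se_lo[1], se_hi[1])
print("Sum of slopes at rho=0.999:", beta_hi[1] + beta_hi[2])
print("Prediction SE at rho=0.999 (on pattern, off pattern):", se_on, se_off)
```

Under these assumptions, the in-sample fit stays good in both cases while the standard errors of the individual coefficients become much larger when the predictors are nearly collinear. The prediction is still precise for a new point that resembles the sample (x1 close to x2) and imprecise for a point that does not, which is one way to read the two quotations together.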

ABK
  • It means large standard errors on the parameter estimates of correlated predictors. – Dave Jul 03 '20 at 14:36
  • Dear Dave, but the statement is not correct. Independent variables are features, not coefficients. – ABK Jul 03 '20 at 14:38
  • Please pay attention to Dave's use of parameter estimates in his comment. Could you explain what you think is a contradiction in the Wikipedia quotations? I cannot discern any. – whuber Jul 03 '20 at 14:43
  • The language leaves something to be desired, but they mean the standard errors on the parameters of the features. – Dave Jul 03 '20 at 14:44
  • @whuber, I have modified the question. Does the first statement, "Multicollinearity does not reduce the predictive power...", apply to the case when the coefficients are known? – ABK Jul 03 '20 at 14:48
  • In what sense can one "know" the regression coefficients? The quotations concern situations where those coefficients must be estimated from data; if somehow the coefficients are established by theory, then they are what they are (and multicollinearity of data is irrelevant because the data are irrelevant). – whuber Jul 03 '20 at 14:58
  • Dear @whuber, I have edited the question – ABK Jul 06 '20 at 09:09
