
EDITED TO POSE DIFFERENT QUESTIONS THAN THE DUPLICATE:

The duplicate question asks:

Can standardized coefficients become greater than |1|? If yes, what does that mean and should they be excluded from the model? If yes, why?

The accepted answer can be summarized as:

Standardized coefficients can be greater than 1.00... They are a sign that you have some pretty serious collinearity.

The second answer there nitpicks the meaning of standardization and states that:

(when you fit a) regression (with only a single predictor) then it is mathematically impossible to see a coefficient outside of the -1 to 1 range, since the slope will be the same as the correlation.

The two answers do not agree on what to do with such coefficients. The first says:

Whether they should be excluded depends on why they happened - but probably not.

While the second says:

some sources suggest dropping those variables.

To summarize both answers: yes, it can happen, but only with collinearity and never in simple regression.


Given the above, my questions are:

(1) Is multicollinearity the only situation in which a standardized coefficient can exceed ±1?

(2) Why can't a standardized coefficient in simple regression exceed ±1? I mean, what is the logical explanation here? From my own (very non-mathematically oriented) point of view, why can't 1 standard deviation in one variable contribute more than one standard deviation to a second variable? Suppose we are predicting wages (y) by school years (x), and wages has points at +6 s.d. (people who make a lot of money), yet school years has only 3 s.d. of data. Then why can't the standardized coefficient for school years be greater than 1?

  • The wikipedia article deals with multiple regression and is used to put two or more predictor variables in the same units (# of standard deviations). In univariate regression you don't need to do this. I think the wikipedia article is just saying that the univariate analogue would be the correlation coefficient, which is a scaled version of the slope parameter. – Michael R. Chernick Dec 16 '16 at 13:28
  • I think the answer given in the referenced CV post is correct and your professor is wrong. Collinearity could come in if X and X squared are used in the model, but I think that goes off on a tangent; the issue really goes away for univariate regression. If you are interested in multiple regression, your point that an increase or decrease of one standard deviation in one of the predictor variables can change the dependent variable by more than one standard deviation is well-taken. – Michael R. Chernick Dec 16 '16 at 13:38
  • It can be a problem when answers appear to contradict each other. In this case, the votes on the answers in the duplicate will help you determine which one is likely correct--especially because the upvoted answer explicitly addresses all the issues in your question. – whuber Dec 17 '16 at 16:25
  • But it does not explain if situation other than collinearity can cause a coefficient to be greater than 1 and also doesn't explain, simply, why it cannot happen in simple regression (see my second question at the bottom of the page) – Pukalu Dec 17 '16 at 16:28
  • The accepted answer clearly states that when a standardized coefficient exceeds $1$ in size, it is "a sign that you have some pretty serious collinearity." A fortiori this cannot happen in simple regression, because collinearity is impossible (you need a second regressor). – whuber Dec 17 '16 at 17:35
  • @whuber I am sorry, perhaps I am just missing something here, but my questions are (1) is collinearity the ONLY situation that can cause std.coef to be greater than 1? --this is not clear by the answers. And (2) WHY can't this LOGICALLY happen in simple regression... – Pukalu Dec 17 '16 at 17:59
  • The accepted answer, which I quoted, is unambiguous about (1). Answers to (2) appear in many places here: search for correlation regression. – whuber Dec 17 '16 at 18:55

0 Answers