
Given that: $$ \begin{aligned} \text{Corr}(Y, X_1) &> 0 \\ \text{Corr}(Y, X_2) &= 0 \\ \text{Corr}(X_1, X_2) &> 0 \end{aligned} $$

Consider two regressions: $$ \begin{aligned} Y &= a X_1 + \epsilon \\ Y &= b_1 X_1 + b_2 X_2 + \epsilon \end{aligned} $$

Which one is bigger, $a$ or $b_1$?

The answer should be $a < b_1$. Intuitively, I would argue this using "Regression by Successive Orthogonalization" in ESL Chapter 3. Basically, if we orthogonalize $X_1$ against $X_2$ to get $b_1$, the residual $z_p$ will have a small norm because $X_1$ is correlated with $X_2$, while its inner product with $Y$ should be unaffected because $Y$ is uncorrelated with $X_2$. So $b_1$ will be larger than $a$.
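To sanity-check the intuition numerically, here is a minimal sketch. It assumes centered variables, so that zero correlation is the same as orthogonality and the no-intercept regressions are consistent with the correlation conditions. The construction of `x1`, `x2`, `y` from orthonormal columns and the weights 0.6, 0.5, 0.6 are my own illustrative choices, not something from ESL. It also compares $b_1$ with $a / (1 - r_{12}^2)$, where $r_{12}$ is the sample correlation of $X_1$ and $X_2$, which is what the two-variable least-squares formula appears to give when $\text{Corr}(Y, X_2) = 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Build three centered, mutually orthonormal columns q1, q2, q3.
A = rng.standard_normal((n, 3))
A -= A.mean(axis=0)
Q, _ = np.linalg.qr(A)  # reduced QR: Q has shape (n, 3)
q1, q2, q3 = Q.T

# Illustrative data satisfying the three conditions exactly in-sample:
x2 = q1
y = q2                                  # Corr(y, x2) = 0
x1 = 0.6 * q1 + 0.5 * q2 + 0.6 * q3     # Corr(y, x1) > 0, Corr(x1, x2) > 0


def corr(u, v):
    """Sample Pearson correlation of two 1-D arrays."""
    return np.corrcoef(u, v)[0, 1]


# Regression 1: y on x1 only (no intercept, data are centered).
a = np.linalg.lstsq(x1[:, None], y, rcond=None)[0][0]

# Regression 2: y on x1 and x2 (no intercept).
b1, b2 = np.linalg.lstsq(np.column_stack([x1, x2]), y, rcond=None)[0]

r12 = corr(x1, x2)
print("Corr(y, x1), Corr(y, x2), Corr(x1, x2):", corr(y, x1), corr(y, x2), r12)
print("a  =", a)
print("b1 =", b1)
print("a / (1 - r12^2) =", a / (1 - r12**2))
```

With these weights, `a` comes out at roughly 0.515 and `b1` at roughly 0.820, and `b1` matches `a / (1 - r12**2)` up to floating-point error. This is only a check on one constructed example, not a proof.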

Here is a snapshot of the algorithm (it's on page 54): [image of the algorithm]

Can someone please help prove this more rigorously? I tried to express the regression coefficients in terms of the correlations to prove this, but did not succeed.

  • I have fixed the formatting in your question to what I believe you were asking and linked the book, as it wasn't clear from your question what ESL referred to. If you feel there are inaccuracies in the notation, feel free to edit them. For example, I'm not sure what $zp$ is here, so it may help to clarify some of the notation if it's not immediately clear to people. – Shawn Hemelstrand Nov 23 '23 at 02:12
  • Thanks Shawn!! I just added the snapshot of the algo and mentioned the corresponding page number. – xxxtttsss666 Nov 23 '23 at 03:21
  • There is a related post of mine based on this algorithm. You can check it, if that helps. – User1865345 Nov 23 '23 at 03:23
  • Unfortunately it's a bit different. – xxxtttsss666 Nov 25 '23 at 22:01
