
I am reading the book "Introduction to Linear Regression Analysis" (5th edition) by Douglas Montgomery.

In Chapter 3, for the multiple linear regression model $y = X\beta+\epsilon$, the book computes $SSR = \hat{\beta}'X'y - \frac{(\sum_{i=1}^n y_i)^2}{n}$, which I believe is right. Here SSR is the regression sum of squares (i.e., the variation explained by the model).

But just several pages later, when discussing regression on a subset, it claims $SSR = \hat{\beta}'X'y$ for the same model, without any proof.

I am confused by the second equation and believe it is false, but all of the later discussion is based on it. Can anyone explain what is happening here?

Clarification:

If the first comment below is true, then the first column of $X$ is the constant vector $(1,1,\cdots,1)$, which would indicate $\sum{y_i} = 0$; under that condition the two equations above are the same and there is no trouble.

The question is still unsolved. Any help will be appreciated.

lun
  • It looks like if $X$ is orthogonal to $y$--which can happen--then by definition $\hat\beta^\prime X^\prime y = \hat\beta^\prime(0) = 0$, making $SSR$ negative. That doesn't seem possible, so we ought to suspect a problem with your first formula. Are you sure you have transcribed it correctly? – whuber Oct 18 '16 at 22:55
  • Please register &/or merge your accounts (you can find information on how to do this in the My Account section of our [help]). Then you will be able to edit & comment on your own question. – Firebug Oct 19 '16 at 00:21

1 Answer


The first one is the regression sum of squares measured against the intercept-only model, whose fitted value is the mean $\bar{y}$. That is, it is $SSR(\beta \mid \beta_0)$.

The second one is the regression sum of squares due to all the $\beta$'s (intercept included) measured against no model at all, i.e. $\hat{y} = 0$.
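A quick numerical check may make the difference concrete. This is only a minimal sketch: the data values are made up, the helper `solve` is a hypothetical illustration (not anything from the book), and plain Python is used so it runs without extra packages. It assumes $X$ has a column of ones, and compares $\hat{\beta}'X'y$ with $\sum \hat{y}_i^2$ and the mean-corrected version with $\sum (\hat{y}_i - \bar{y})^2$.

```python
# Made-up data: an intercept column plus two regressors (illustrative only).
X = [
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 5.0],
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 8.0],
    [1.0, 8.0, 3.0],
    [1.0, 9.0, 9.0],
]
y = [4.1, 9.8, 8.2, 16.9, 12.1, 20.3]
n, p = len(X), len(X[0])


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            f = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * m
    for r in reversed(range(m)):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (M[r][m] - s) / M[r][r]
    return x


# Normal equations: (X'X) beta_hat = X'y
XtX = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)] for j in range(p)]
Xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
beta_hat = solve(XtX, Xty)

y_hat = [sum(X[i][j] * beta_hat[j] for j in range(p)) for i in range(n)]
y_bar = sum(y) / n

# The two formulas from the question.
uncorrected = sum(b * v for b, v in zip(beta_hat, Xty))  # beta_hat' X' y
corrected = uncorrected - sum(y) ** 2 / n                # minus (sum y_i)^2 / n

# Sums of squares of the fitted values about zero and about the mean.
ss_about_zero = sum(v ** 2 for v in y_hat)
ss_about_mean = sum((v - y_bar) ** 2 for v in y_hat)

print("beta' X' y             :", uncorrected)
print("sum of y_hat^2         :", ss_about_zero)
print("corrected formula      :", corrected)
print("sum of (y_hat - ybar)^2:", ss_about_mean)
print("difference, n * ybar^2 :", uncorrected - corrected, n * y_bar ** 2)
```

Under these assumptions the first two printed values should agree, as should the next two, and the gap between the formulas is $n\bar{y}^2$, the sum of squares attributable to the intercept alone. So both formulas are valid; they just measure the fit against different baselines.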

I would recommend checking my answer here for more information: https://stats.stackexchange.com/a/361086/212274

KDG