
Let $$y_i=B_0+B_1X_i+\varepsilon_i$$ where $\varepsilon_i\sim N(0,\sigma^2)$. Find the least squares estimator of $B_0$ and show that it is unbiased and has minimum variance.

I will not write out all the steps in detail, but $$\hat{B_1}=\frac{\sum(X_i-\overline{X})(Y_i-\overline{Y})}{\sum(X_i-\overline{X})^2}$$ and $$\hat{B_0}=\overline{Y}-\hat{B_1}\overline{X} \,.$$ Taking the expectation: $$\mathbb{E}[\hat{B_0}]=\mathbb{E}[\overline{Y}-\hat{B_1}\overline{X}]=\mathbb{E}[\overline{Y}]-\overline{X}\mathbb{E}[\hat{B_1}]=B_0+B_1\overline{X}-B_1\overline{X}=B_0,$$ so the estimator is unbiased.
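
As a quick numerical sanity check of these closed-form expressions (not part of the proof), the sketch below compares them with `np.polyfit` on one simulated sample; the values of $n$, $B_0$, $B_1$ and $\sigma$ are made up for illustration.

```python
# Sanity check of the closed-form estimators against a generic least-squares fit.
# All simulation settings here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, B0, B1, sigma = 50, 2.0, 3.0, 1.0

X = rng.uniform(0, 10, size=n)                  # fixed design for this one sample
y = B0 + B1 * X + rng.normal(0, sigma, size=n)

Xbar, ybar = X.mean(), y.mean()
B1_hat = np.sum((X - Xbar) * (y - ybar)) / np.sum((X - Xbar) ** 2)
B0_hat = ybar - B1_hat * Xbar

slope, intercept = np.polyfit(X, y, 1)          # degree-1 least squares: [slope, intercept]
print(B1_hat, slope)                            # agree to machine precision
print(B0_hat, intercept)
```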

But how can I show that the estimator has minimum variance in this case?

EDIT: Since I already proved that $\hat{B_0}$ is unbiased, and since the joint distribution of the $Y_i$ belongs to an exponential family, $\hat{B_0}$ is a function of a complete sufficient statistic. Thus this estimator has minimum variance by the Lehmann–Scheffé theorem.
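
To complement this argument (and the CRLB discussion in the comments below), here is a hedged Monte Carlo sketch, again with made-up settings: the sample mean of $\hat{B_0}$ should be close to $B_0$, and its sample variance should be close to the textbook expression $\sigma^2\left(\tfrac1n+\tfrac{\overline{X}^2}{\sum(X_i-\overline{X})^2}\right)$. This is only a numerical check, not a proof of minimum variance.

```python
# Monte Carlo check (not a proof) that E[B0_hat] = B0 and that Var(B0_hat)
# matches sigma^2 * (1/n + Xbar^2 / Sxx). All settings are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n, B0, B1, sigma, reps = 40, 2.0, 3.0, 1.0, 20000

X = rng.uniform(0, 10, size=n)                  # design held fixed across replications
Xbar = X.mean()
Sxx = np.sum((X - Xbar) ** 2)

B0_hats = np.empty(reps)
for r in range(reps):
    y = B0 + B1 * X + rng.normal(0, sigma, size=n)
    B1_hat = np.sum((X - Xbar) * (y - y.mean())) / Sxx
    B0_hats[r] = y.mean() - B1_hat * Xbar

print(B0_hats.mean(), B0)                                   # unbiasedness
print(B0_hats.var(), sigma**2 * (1/n + Xbar**2 / Sxx))      # empirical vs. theoretical variance
```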

Chill2Macht
  • Have you calculated the Cramer-Rao bound for this? – Saket Choudhary Oct 03 '15 at 21:45
  • @rightskewed No, it does not seem like an appropriate way, but maybe it is. –  Oct 03 '15 at 21:49
  • The Cramer-Rao rule gives you a lower bound on the variance of the estimator. Think about the case when equality holds. – Saket Choudhary Oct 03 '15 at 21:53
  • @rightskewed Your point is that there does not exist an unbiased estimator that attains the CRLB? –  Oct 03 '15 at 22:01
  • I think you can also use the Lehmann–Scheffé theorem. – JohnRos Oct 04 '15 at 10:13
  • OLS is the minimum-variance unbiased estimator under the stated assumptions. You can show this by finding the Cramer-Rao lower bound; perhaps other methods will tell you the same, but the CRLB is the way you would usually go about this. – Repmat Jan 09 '17 at 17:27
  • Related: https://stats.stackexchange.com/questions/375098/showing-that-the-minimum-variance-estimator-is-the-ols-estimator?rq=1 – StubbornAtom Mar 22 '19 at 16:48

1 Answer


To establish a more general result, I refer to the lecture notes linked here.

Suppose we have the multiple linear regression model

$$y=X\beta + \varepsilon,$$

where the design matrix $X$ (with non-random entries) of order $n\times k$ has full column rank and $\beta=(\beta_1,\ldots,\beta_k)^T$ is the vector of regression coefficients.

Further assume that $\varepsilon \sim N_n(0,\sigma^2 I_n)$ with $\sigma^2$ unknown, so that $y\sim N_n(X\beta,\sigma^2 I_n)$.

Under this setting, we know that the OLS estimator of $\beta$ is $$\hat\beta=(X^T X)^{-1}X^T y\sim N_k\left(\beta,\sigma^2(X^T X)^{-1}\right)$$
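
As an aside (not from the linked notes), the formula can be checked numerically against a generic least-squares solver; $n$, $k$, $\beta$ and the noise below are arbitrary illustrative choices.

```python
# Check that (X'X)^{-1} X'y matches np.linalg.lstsq on toy data (assumed values).
import numpy as np

rng = np.random.default_rng(2)
n, k = 100, 3
X = rng.normal(size=(n, k))                     # full column rank with probability 1
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(0, 1.0, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)    # normal-equations solution
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_hat, beta_lstsq))        # True
```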

So, using the normal equations $X^TX\hat\beta=X^Ty$, $$(y-X\hat\beta)^T X(\hat\beta-\beta)=(y^TX-\hat\beta^T X^TX)(\hat\beta-\beta)=(y^TX-y^T X)(\hat\beta-\beta)=0$$

Hence,

\begin{align} \left\| y-X\beta \right\|^2 &=\left\|y-X\hat\beta+X\hat\beta-X\beta\right\|^2 \\&=\left\|y-X\hat\beta\right\|^2+\left\|X\hat\beta-X\beta\right\|^2 \\&=\left\|y-X\hat\beta\right\|^2+\left\|X\hat\beta\right\|^2+\left\|X\beta\right\|^2-2\beta^T X^TX\hat\beta \\&=\left\|y-X\hat\beta\right\|^2+\left\|X\hat\beta\right\|^2+\left\|X\beta\right\|^2-2\beta^T X^T y \end{align}
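
This orthogonal decomposition is easy to see numerically as well; the short sketch below, on arbitrary toy data (all values assumed), checks that the identity holds for a randomly chosen $\beta$.

```python
# Numerical illustration of ||y - Xb||^2 = ||y - X beta_hat||^2 + ||X beta_hat - Xb||^2,
# which follows from (y - X beta_hat)'X = 0. Toy data; all values are assumptions.
import numpy as np

rng = np.random.default_rng(3)
n, k = 50, 3
X = rng.normal(size=(n, k))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
b = rng.normal(size=k)                          # an arbitrary candidate beta

lhs = np.sum((y - X @ b) ** 2)
rhs = np.sum((y - X @ beta_hat) ** 2) + np.sum((X @ beta_hat - X @ b) ** 2)
print(np.isclose(lhs, rhs))                     # True
```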

The pdf of $y$ now looks like

\begin{align} f(y;\beta,\sigma^2)&=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[-\frac{1}{2\sigma^2}\left\| y-X\beta \right\|^2\right] \\\\&=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[\frac{1}{\sigma^2}\beta^T X^T y-\frac{1}{2\sigma^2}\left(\left\|y-X\hat\beta\right\|^2+\left\|X\hat\beta\right\|^2\right)-\frac{1}{2\sigma^2}\left\|X\beta\right\|^2\right] \end{align}

Noting that $X\hat\beta$ is a function of $X^T y$ and that the above density is a member of the exponential family, a complete sufficient statistic for $(\beta,\sigma^2)$ is given by $$T=\left(X^Ty,\left\|y-X\hat\beta\right\|^2\right)$$
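
To spell out the exponential-family structure behind this step (a sketch of the standard argument; the linked notes may phrase it differently), one can also write $$f(y;\beta,\sigma^2)=\frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left[\left(\frac{\beta}{\sigma^2}\right)^T X^T y-\frac{1}{2\sigma^2}\left\|y\right\|^2-\frac{1}{2\sigma^2}\left\|X\beta\right\|^2\right],$$ so the natural parameter $\left(\beta/\sigma^2,\,-1/(2\sigma^2)\right)$ ranges over an open subset of $\mathbb{R}^{k+1}$ and $\left(X^Ty,\left\|y\right\|^2\right)$ is complete sufficient; since $\left\|y\right\|^2=\left\|y-X\hat\beta\right\|^2+\left\|X\hat\beta\right\|^2$ with $\left\|X\hat\beta\right\|^2=y^TX(X^TX)^{-1}X^Ty$ a function of $X^Ty$, this statistic carries the same information as $T$.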

Now $\hat\beta$, being a function of $X^T y$, is also a function of $T$, and it is unbiased for $\beta$.

So, by the Lehmann–Scheffé theorem, $\hat\beta$ is the UMVUE of $\beta$.

StubbornAtom