4

It is well known that one can use the $\chi^2$ distribution to estimate the variance of a normal distribution (i.e., $N(0,\sigma^2)$).

Is there a particular reason for using the $\chi^2$ distribution? Or can we say that it is the "most accurate" way to estimate the variance of a normal distribution? Does this "most accurate" method even exist? If it exists and it is not the $\chi^2$ distribution, then what is it?

Thanks!

3 Answers

5

I'd say that it is not well known that one can use the $\chi^2$ distribution to estimate the variance of a sample from a $\mathcal{N}(0,\sigma^2)$ distribution. It is well known that if $X_1,\dots,X_n$ is a random sample from a $\mathcal{N}(\mu,\sigma^2)$ distribution:

  • if $\mu$ is not known, then $(n-1)S^2_n/\sigma^2$ has a $\chi^2$ distribution with $n-1$ degrees of freedom, where $S^2_n=\frac{1}{n-1}\sum_{i=1}^n(X_i-\bar{X})^2$ is the sample variance and $\bar{X}$ is the sample mean;
  • if $\mu$ is known, e.g. if one knows that $\mu=0$, then $nS^2_0/\sigma^2$ has a $\chi^2$ distribution with $n$ degrees of freedom, where $S^2_0=\frac{1}{n}\sum_{i=1}^n(X_i-\mu)^2$;

and that these facts can be used to show that $S^2_n$ (mean unknown) and $S^2_0$ (mean known) are unbiased estimators of $\sigma^2$.
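
Both facts are easy to check by Monte Carlo simulation. The following is just a sketch in NumPy/SciPy; the values $n=10$, $\mu=3$ and $\sigma=2$ are arbitrary illustration choices.

```
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, mu, sigma = 10, 3.0, 2.0   # arbitrary illustration values
reps = 100_000

x = rng.normal(mu, sigma, size=(reps, n))

# mu unknown: (n-1) * S_n^2 / sigma^2 should be chi^2 with n-1 df
s2_n = x.var(axis=1, ddof=1)
pivot_unknown = (n - 1) * s2_n / sigma**2
print(pivot_unknown.mean(), pivot_unknown.var())  # close to n-1 and 2(n-1)

# mu known: n * S_0^2 / sigma^2 should be chi^2 with n df
s2_0 = ((x - mu) ** 2).mean(axis=1)
pivot_known = n * s2_0 / sigma**2
print(pivot_known.mean(), pivot_known.var())      # close to n and 2n

# Kolmogorov-Smirnov comparison against the two chi^2 distributions
print(stats.kstest(pivot_unknown, stats.chi2(n - 1).cdf))
print(stats.kstest(pivot_known, stats.chi2(n).cdf))
```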

If $\mu$ is known, then $$Z_i=\frac{X_i-\mu}{\sqrt{\sigma^2}}\sim\mathcal{N}(0,1)$$ and $$\sum_{i=1}^n Z_i^2=\sum_{i=1}^n\frac{(X_i-\mu)^2}{\sigma^2}= \frac{n\left(\frac{1}{n}\sum_{i=1}^n(X_i-\mu)^2\right)}{\sigma^2}=\frac{nS^2_0}{\sigma^2}\sim\chi^2_n $$ by the definition of the $\chi^2$ distribution.

Since $\chi^2_n=\text{Gamma}\left(\frac{n}{2},\frac{1}{2}\right)$, and since $Y\sim\text{Gamma}(\nu,\lambda)$ with $a>0$ implies $aY\sim\text{Gamma}\left(\nu,\frac{\lambda}{a}\right)$ (rate parameterization), $$S^2_0=\frac{\sigma^2}{n}\left(\frac{nS^2_0}{\sigma^2}\right)\sim \text{Gamma}\left(\frac{n}{2},\frac{n}{2\sigma^2}\right). $$ Hence $E[S^2_0]=\frac{n/2}{n/(2\sigma^2)}=\sigma^2$, i.e. $S^2_0$ is an unbiased estimator.
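
As a numerical sanity check of the Gamma claim and of unbiasedness (again only a sketch; $n=5$ and $\sigma^2=4$ are arbitrary):

```
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, mu, sigma2 = 5, 0.0, 4.0   # arbitrary illustration values
reps = 200_000

x = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
s2_0 = ((x - mu) ** 2).mean(axis=1)

# Unbiasedness: averaging S_0^2 over many samples should give ~ sigma^2
print(s2_0.mean())  # close to 4.0

# Distribution: S_0^2 ~ Gamma(shape = n/2, rate = n / (2 sigma^2)),
# which in SciPy's shape/scale parameterization means scale = 2 sigma^2 / n
gamma_ref = stats.gamma(a=n / 2, scale=2 * sigma2 / n)
print(stats.kstest(s2_0, gamma_ref.cdf))
```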

The proof when $\mu$ is unknown is similar but a bit cumbersome because one must show that $\bar{X}$ and $S^2_n$ are independent random variables (see Casella and Berger, Statistical Inference, Theorem 5.3.1).
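
The independence of $\bar{X}$ and $S^2_n$ can at least be sanity-checked by simulation. Zero correlation is only a necessary consequence of independence, not a proof of it, but it is an easy thing to look at (a sketch with arbitrary values $n=8$, $\mu=1.5$, $\sigma=2$):

```
import numpy as np

rng = np.random.default_rng(2)
n, mu, sigma = 8, 1.5, 2.0   # arbitrary illustration values
reps = 200_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s2_n = x.var(axis=1, ddof=1)

# Under normality, Xbar and S_n^2 are independent, so their sample
# correlation should be close to zero (necessary, not sufficient).
print(np.corrcoef(xbar, s2_n)[0, 1])
```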

Sergio
  • 5,951
2

Recall the definition of the $\chi^2$-distribution with $n$ degrees of freedom. If $X_1,\dots,X_n$ are i.i.d. standard normal, then $$ Z = \sum_{i=1}^n X_i^2 $$ has a $\chi^2$-distribution with $n$ degrees of freedom. Thus it is natural to apply this distribution to the sample estimator of the variance if we assume a normal distribution.
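
A short simulation of this definition (just a sketch; $n=7$ is arbitrary):

```
import numpy as np

rng = np.random.default_rng(3)
n, reps = 7, 100_000   # arbitrary illustration values

# Sum of squares of n iid standard normals, repeated many times
z = rng.standard_normal((reps, n))
q = (z ** 2).sum(axis=1)

# A chi^2 distribution with n degrees of freedom has mean n and variance 2n
print(q.mean(), q.var())  # close to 7 and 14
```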

Richi W
  • 3,436
  • 1
    Thanks! However, what I really want to know is whether the $\chi^2$-distribution offers the best estimation? – user48855 Jun 02 '14 at 13:09
0

The other responses adequately explain the relationship between the sample variance of a normal distribution and $\chi^2$-distributions. However, from your latest comment, it seems like you're interested in the strength of the estimator.

To address this, I'll point out two things:

  • The $\chi^2$-distribution doesn't provide an estimator per se. What's happening is that the (suitably scaled) sample variance is $\chi^2$-distributed.
  • Now that we know the distribution, we can use it to construct confidence intervals. For example, given a sample of size $n$ with unknown mean, a 95% confidence interval for $\sigma^2$ is $$\left(\frac{(n-1)S^2_n}{\chi^2_{0.975,\,n-1}},\ \frac{(n-1)S^2_n}{\chi^2_{0.025,\,n-1}}\right),$$ where $\chi^2_{p,\,n-1}$ denotes the $p$-quantile of the $\chi^2_{n-1}$ distribution; a code sketch follows below.
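
Here is a small sketch of that interval in code (assuming the mean is unknown, hence $n-1$ degrees of freedom; the data vector below is made up purely for illustration):

```
import numpy as np
from scipy import stats

# Hypothetical data; any sample assumed to come from a normal distribution would do
x = np.array([4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.4])
n = len(x)
s2 = x.var(ddof=1)  # sample variance S_n^2

alpha = 0.05
lower = (n - 1) * s2 / stats.chi2.ppf(1 - alpha / 2, df=n - 1)
upper = (n - 1) * s2 / stats.chi2.ppf(alpha / 2, df=n - 1)
print(f"95% CI for sigma^2: ({lower:.3f}, {upper:.3f})")
```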

idnavid
  • 889