
Wikipedia currently says, in the introduction of this article:

http://en.wikipedia.org/wiki/Student%27s_t-distribution

"... then the t-distribution (for n-1) can be defined as the distribution of the location of the true mean, relative to the sample mean and divided by the sample standard deviation... In this way the t-distribution can be used to estimate how likely it is that the true mean lies in any given range."

Is this right? It does not seem right to me. How can we have a distribution on the true mean after obtaining a sample, without some sort of Bayesian prior? I understand that we can get a confidence interval for the true mean. But a distribution?

  • The use of the phrase 'relative to' there is critical to the meaning. Because of that, it doesn't refer to the population mean but to the difference between the population and sample mean. Beware, however - I see at least one error in that article. – Glen_b Oct 20 '13 at 06:22

1 Answer


You don't have a distribution on the true mean. You have a distribution on the difference between the true mean and the sample mean, scaled by the estimated standard error of the mean, which is built from the sample standard deviation (itself a random variable). The true mean is fixed.

Let $X \sim N(\mu,\sigma^2)$, and let $x_1, \dots, x_N$ be an i.i.d. sample of size $N$ from $X$. Let $\bar{X}$ denote the sample mean and $S^2$ the (unbiased) sample variance. Then

$$\frac{\bar{X}-\mu}{\sqrt{S^2/N}} \sim t_{N-1}$$

The relevant section of the wikipedia article you linked is http://en.wikipedia.org/wiki/Student%27s_t-distribution#Derivation
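A quick simulation can make the statement above concrete. The sketch below is only an illustration, not part of the Wikipedia derivation: the values of `mu`, `sigma`, the sample size of 10, the seed, and the helper names are all assumptions, and the critical value 2.262 for 9 degrees of freedom is hard-coded from standard t tables. It fixes the true mean, draws many samples, and checks how often the scaled difference falls inside the central 95% range of $t_{9}$.

```python
import math
import random

# Two-sided 95% critical value of Student's t with 9 degrees of freedom,
# hard-coded from standard tables (an assumption of this sketch).
T_CRIT_9DF = 2.262
Z_CRIT = 1.960  # corresponding standard normal value, for comparison


def t_statistic(sample, mu):
    """Return (sample mean - mu) / sqrt(S^2 / N) for one sample."""
    n = len(sample)
    xbar = sum(sample) / n
    # Unbiased sample variance: divide by N - 1.
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return (xbar - mu) / math.sqrt(s2 / n)


def fraction_within(crit, mu=3.0, sigma=2.0, n=10, reps=20000, seed=0):
    """Fraction of simulated samples whose |T| is at most crit.

    mu stays fixed throughout; only the samples (and hence the
    sample mean and sample variance) are random.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(t_statistic(sample, mu)) <= crit:
            hits += 1
    return hits / reps


print(fraction_within(T_CRIT_9DF))  # should be close to 0.95
print(fraction_within(Z_CRIT))      # lower: the t tails are heavier than normal
```

The randomness here lives entirely in the samples; the true mean never changes, which is the sense in which the t-distribution describes the scaled difference rather than the mean itself.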

David Marx
  • Thanks. What about the bit that says "the t-distribution can be used to estimate how likely it is that the true mean lies in any given range." This makes it sound like we should be able to take a sample, then for any range [A,B] we choose, calculate Prob(A<mu<B). – user31703 Oct 20 '13 at 20:46
  • You can, but it usually goes in the other direction. The typical process is to pick a probability (usually 90%, 95%, or 99%) and determine the (symmetrical) range about the observed sample mean which encompasses this probability of finding the true mean. That's what a confidence interval is. – David Marx Oct 21 '13 at 06:24
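The procedure described in the last comment might be sketched as follows. This is a minimal illustration under stated assumptions: the measurements are made up, the function name `t_confidence_interval` is hypothetical, and the 95% critical value 2.262 is hard-coded from standard t tables for 9 degrees of freedom, so it only applies to a sample of size 10.

```python
import math

# Two-sided 95% critical value of Student's t with 9 degrees of freedom,
# hard-coded from standard tables; valid only for samples of size 10.
T_CRIT_9DF = 2.262


def t_confidence_interval(sample, t_crit):
    """Symmetric interval about the sample mean: xbar +/- t_crit * sqrt(S^2 / N)."""
    n = len(sample)
    xbar = sum(sample) / n
    # Unbiased sample variance: divide by N - 1.
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    half_width = t_crit * math.sqrt(s2 / n)
    return xbar - half_width, xbar + half_width


# Made-up measurements, for illustration only.
sample = [4.9, 5.1, 5.3, 4.7, 5.0, 5.2, 4.8, 5.4, 5.1, 4.5]

low, high = t_confidence_interval(sample, T_CRIT_9DF)
print(f"95% confidence interval for the mean: ({low:.3f}, {high:.3f})")
```

In the frequentist reading, the 95% describes how often intervals built this way cover the fixed true mean across repeated samples, which matches the answer's point that the true mean itself is not random.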