
There is a question from a statistics book:

If four observations are taken from a normal distribution what is the probability that the difference between the observed mean $\bar x$ and the true mean $\mu$ will be less than five times the observed standard deviation, $s$?

The book's answer is: "About 0.002 since the chance that a $t$-variate with 3 d.f. will exceed 10.21 is 0.001". They use 10.21 because that is simply the nearest tabulated value in the book's table of $t$ exceedance probabilities. What I cannot figure out is why they look up a value of around 10, and why they give the answer 0.002. The question asks for the probability that the difference "...will be less than five times the observed standard deviation, $s$", so surely that probability should be substantial, since it covers the bulk of the mass in the center of the $t$-distribution curve. Is the book wrong, or am I overlooking something?

Also, first time posting on this forum, so I may not have the text editing right. Apologies in advance.

d.singh
  • Okay, I may as well get further clarification here. What do you mean by adjusted? Also, is $s$ in the $t$-statistic calculated from just your sample of $n$ (for a $t$-distribution with $n-1$ degrees of freedom), or is it meant to be calculated from the entire population from which you are drawing $n$ samples at a time? – d.singh Sep 20 '15 at 03:52
  • sorry, I turned that comment into an answer – Cliff AB Sep 20 '15 at 03:54

1 Answer


There seems to be a problem with the book's answer. If you're sure you've looked up everything correctly, I would email the author.

How the problem should be addressed:

As it is written here, I'm going to assume that we are interested in $P(\frac{\bar x - \mu}{s} < 5)$, (not $P(\frac{|\bar x -\mu|}{s } > 5)$ as the book apparently answers). Note that this isn't even quite right, as $s$, the square root of the unbiased estimator of the variance, is not the observed standard deviation. See the bottom for more details.

We use the fact that $\frac{\bar x - \mu}{s/\sqrt{n}} \sim t_{n-1}$. Because $n = 4$, we have $\sqrt{n} = 2$, so

$P(\frac{\bar x - \mu}{s} < 5) = $

$P(\frac{\bar x - \mu}{s } \times 2 < 5 \times 2) = $

$P(\frac{\bar x - \mu}{s/2} < 10) = $

$P(t_{3} < 10)$

I think you've got it from there.
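For anyone who wants to check the number without a table, here is a minimal sketch in plain Python. It assumes the standard closed-form CDF of a $t$-distribution with 3 degrees of freedom, and the function names `t3_cdf` and `simulate` are just illustrative. It also evaluates the two-sided version, $P(|t_3| < 10)$, which is what the book's "exceed 10.21" lookup seems to refer to, and cross-checks it by simulation.

```python
import math
import random


def t3_cdf(t: float) -> float:
    """CDF of Student's t with 3 degrees of freedom (closed form)."""
    u = t / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi


def simulate(n_trials: int = 100_000, n: int = 4, k: float = 5.0, seed: int = 0) -> float:
    """Monte Carlo estimate of P(|xbar - mu| < k * s) with mu = 0, sigma = 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(xs) / n
        # s uses the n - 1 divisor, as in the t-statistic
        s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
        if abs(xbar) < k * s:
            hits += 1
    return hits / n_trials


one_sided = t3_cdf(10.0)               # P(t_3 < 10)
two_sided = 2.0 * t3_cdf(10.0) - 1.0   # P(|t_3| < 10)

print(f"P(t_3 < 10)   = {one_sided:.4f}")
print(f"P(|t_3| < 10) = {two_sided:.4f}")
print(f"simulated     = {simulate():.4f}")
```

Both probabilities come out very close to 1 (roughly 0.999 one-sided and 0.998 two-sided), so the complement of the two-sided probability is about 0.002. That matches the figure the book quotes, but as the probability of the difference being *greater* than $5s$, not less.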

Now that we know how the problem should have been addressed, what was my complaint about $s$ and observed standard deviations? The observed standard deviation is, by definition, $\hat \sigma = \sqrt{\sum_{i = 1}^n \frac{(x_i - \bar x)^2}{n} }$

However, since $\hat \sigma^2$ is a biased estimator of $\sigma^2$, we often use

$s = \sqrt{\sum_{i = 1}^n \frac{(x_i - \bar x)^2}{n-1} }$

as $s^2$ is unbiased for $\sigma^2$. But note that this is not the observed standard deviation!
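A small illustration of the difference, using Python's standard `statistics` module, where `pstdev` divides by $n$ and `stdev` divides by $n-1$. The four data values are made up purely for the example.

```python
import math
import statistics

xs = [2.0, 4.0, 4.0, 6.0]  # hypothetical sample with n = 4
n = len(xs)

sigma_hat = statistics.pstdev(xs)  # divisor n: the "observed" standard deviation
s = statistics.stdev(xs)           # divisor n - 1: the s used in the t-statistic

print(f"sigma_hat = {sigma_hat:.4f}")
print(f"s         = {s:.4f}")
print(f"ratio     = {s / sigma_hat:.4f}  (sqrt(n / (n - 1)) = {math.sqrt(n / (n - 1)):.4f})")
```

For $n = 4$ the two differ by a factor of $\sqrt{4/3} \approx 1.15$, so which one the book means by "observed standard deviation" does change the cutoff that gets looked up.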

Cliff AB
  • Thanks for that. I was considering $P(-t^* < \frac{\bar x - \mu}{\frac{s}{\sqrt n}} < t^*)$ where $t^*$ is $5s$, which I reasoned is equivalent to $P(-5 < \frac{\bar x - \mu}{\frac{1}{\sqrt n}} < 5)$, since any change in $s$ will proportionally affect both the $t$-statistic in the inequality and the value $5s$ (or $t^*$). Is that a sound statement to make? – d.singh Sep 20 '15 at 04:14
  • Ah, I made a mistake in leaving out the $\sqrt n$. That explains my mix-up of 5 vs 10. Fixing that... – Cliff AB Sep 20 '15 at 04:25