I apologize if my questions are too naive; I'm just getting started with the subject.
Let's say that I have a population following a probability distribution with mean $\mu$ and standard deviation $\sigma$, and I want to estimate $\mu$. I therefore take a sample $X_1, \dots, X_n$ from the population and compute the average $\overline X$ of my observations. The central limit theorem says (roughly speaking) that, for $n$ sufficiently large, $\overline X$ approximately follows a normal distribution with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt n}$. Thus, with 95% probability, $\overline X$ falls in the interval $(\mu - 1.96 \frac{\sigma}{\sqrt n}, \mu + 1.96 \frac{\sigma}{\sqrt n})$, or equivalently $\mu$ falls in $(\overline X - 1.96 \frac{\sigma}{\sqrt n}, \overline X + 1.96 \frac{\sigma}{\sqrt n})$. This lets me build a 95% confidence interval for $\mu$, but only if I know the true standard deviation $\sigma$, and why should I?
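To convince myself of the known-$\sigma$ case, I wrote a quick simulation sketch (the normal population and the specific parameter values are arbitrary choices of mine, just for illustration):

```python
# Sanity check (a sketch): with sigma known, the interval
# xbar +/- 1.96 * sigma / sqrt(n) should cover mu about 95% of the time.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 2.0, 30, 100_000   # arbitrary choices

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
half = 1.96 * sigma / np.sqrt(n)              # known-sigma half-width
covered = (xbar - half <= mu) & (mu <= xbar + half)
print(covered.mean())                         # prints ~0.95
```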
I don't know $\sigma$, so I need to estimate it from the sample. I could take the plain sample standard deviation $\hat \sigma$ (the uncorrected formula, dividing by $n$), or the corrected one $\sqrt{\frac{n}{n-1}} \hat \sigma$ (I know the corrected version is still not an unbiased estimator of $\sigma$, even though its square is unbiased for $\sigma^2$, but its bias is smaller than that of $\hat \sigma$).
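To see the bias of the two estimators concretely, here is another small sketch (again with a normal population of my choosing; `ddof=0` gives $\hat \sigma$ and `ddof=1` the corrected version):

```python
# Bias check (a sketch): compare the uncorrected sample SD (ddof=0)
# with the corrected one (ddof=1) against the true sigma.
import numpy as np

rng = np.random.default_rng(1)
sigma, n, reps = 2.0, 10, 200_000             # arbitrary choices

samples = rng.normal(0.0, sigma, size=(reps, n))
sd_raw = samples.std(axis=1, ddof=0)          # hat-sigma
sd_corr = samples.std(axis=1, ddof=1)         # sqrt(n/(n-1)) * hat-sigma
print(sd_raw.mean(), sd_corr.mean())          # both below sigma = 2; the corrected one less so
```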
Here are my two questions:
- Which estimator of $\sigma$ is the correct one to use here? Why do people seem to use $\hat \sigma$ instead of the corrected version (this impression might be wrong; I'm a newbie)?
- In estimating $\mu$, we used an estimate of $\sigma$; let's call it $s$ (be it the uncorrected or the corrected one). Therefore, the interval $(\overline X - 1.96 \frac{s}{\sqrt n}, \overline X + 1.96 \frac{s}{\sqrt n})$ does not truly give me 95% confidence of covering $\mu$, because we are not accounting for the error in estimating $\sigma$. What is the true level of confidence? (See the simulation sketch after this list.)
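To make the second question concrete, here is a sketch that measures the empirical coverage of the naive interval (normal population and parameter values again chosen arbitrarily by me):

```python
# Empirical coverage of the naive interval xbar +/- 1.96 * s / sqrt(n),
# where s is estimated from the same sample (a sketch, my own setup).
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 10.0, 2.0, 10, 200_000   # small n to make the effect visible

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)               # corrected estimate (use ddof=0 for the uncorrected one)
half = 1.96 * s / np.sqrt(n)
covered = (xbar - half <= mu) & (mu <= xbar + half)
print(covered.mean())                         # noticeably below 0.95 (~0.92 for n = 10)
```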