
Given that:

  • I. The distribution of the means of large enough samples converges to a normal distribution (by the CLT).
  • II. The Standard Error of the Mean (SEM) estimated from a sample approaches the Standard Deviation (SD) of the sample means as the sample size increases (also, the SEM is lower for larger samples).

Does that mean that, if the SEM of a single sample is low, I can assume there is a 95% probability that the population mean lies approximately within sample mean ± 1.96*SEM?

Rationale behind the question:

  • A random sample ${z}$ of large enough size ${n}$ selected from a population ${p}$ has an SEM of $\dfrac{std({z})}{\sqrt{n}}$
  • Suppose the SEM is very low, so mean(${z}$) is close to mean(${p}$)
  • Let ${c}$ be the set of means of sufficiently many samples of size ${n}$ taken from ${p}$; then ${c}$ is approximately normally distributed (I.)
  • Recall that for a large enough sample the SEM approaches ${std(c)}$ (II.)
  • Since, as stated by the CLT, mean(${c}$) also approaches mean(${p}$), and mean(${p}$) is close to mean(${z}$) (low SEM), can we assume that there is approximately a 95% probability that ${mean(p)}$ lies within ${mean(z) \pm 1.96 \cdot SEM}$?

E.g.: Python dice simulation
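A minimal sketch of what such a dice simulation might look like (not the linked one). The fair six-sided die, the sample size `n = 400`, and the 2000 repeated samples are all assumptions chosen for illustration. It compares the SEM estimated from a single sample ${z}$ with the SD of the set of means ${c}$ and with the theoretical value $\sigma/\sqrt{n}$.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible


def roll_sample(n):
    """Return n rolls of a fair six-sided die (assumed population p)."""
    return [random.randint(1, 6) for _ in range(n)]


n = 400             # sample size (illustrative assumption)
num_samples = 2000  # number of repeated samples (illustrative assumption)

# One sample z: the SEM is estimated from the sample SD alone.
z = roll_sample(n)
sem = statistics.stdev(z) / n ** 0.5

# Many samples: c is the collection of their means.
c = [statistics.fmean(roll_sample(n)) for _ in range(num_samples)]
sd_of_means = statistics.stdev(c)

# Theoretical sigma / sqrt(n); a fair die has variance 35/12.
theoretical = (35 / 12) ** 0.5 / n ** 0.5

print(f"SEM from one sample: {sem:.4f}")
print(f"SD of sample means:  {sd_of_means:.4f}")
print(f"sigma / sqrt(n):     {theoretical:.4f}")
```

With these settings the three numbers should come out close to one another, which is the sense in which point II holds: the single-sample SEM is an estimate of the SD of the sample means.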

Relates to Calculating the confidence interval for the mean value from a sample

Disclaimer: This is my first post in this forum, so I would deeply appreciate any feedback that might improve the quality of my next questions.

  • Is the second point true? The standard error of the mean in a sample of size $n$ is $\sigma/\sqrt{n}$ where $\sigma$ is the population standard deviation. – dipetkov Aug 07 '22 at 23:20
  • Does the SEM use the population SD or the sample SD? I'm calculating the SEM with the standard deviation of the sample (at least this is what John Guttag does in his book) and assuming it approaches the standard deviation of the means (--> normal by CLT) – Gabriel Barberini Aug 08 '22 at 00:26
  • Your notation doesn't help to understand difficult concepts... A good place to start is the definition of a confidence interval. The correct interpretation of a 95% CI is not that there is 95% probability that the population parameter is inside the interval. But that if we were to repeat the same experiment/simulation many times, 95% of the CIs we construct will contain the population parameter. – dipetkov Aug 08 '22 at 10:01
  • To be fully honest, I'm having difficulty understanding the difference between that and a probability. I'll look for a stats book to make sure I grasp the concepts the right way. Thanks for the help anyway. – Gabriel Barberini Aug 08 '22 at 11:08
  • It's not the most intuitive concept. One specific CI (such as sample_mean $\pm$ 1.96 SEM) either contains or doesn't contain the population mean (this is a yes or no event). When you have many CIs, you can think of the proportion of yes events as a probability. I like the explanation in this free online book: Improving Your Statistical Inferences. – dipetkov Aug 08 '22 at 11:20
  • Thanks a lot! I'll have a look – Gabriel Barberini Aug 08 '22 at 11:23
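
The repeated-sampling interpretation described in the comments can be checked with a short simulation. This is only a sketch under stated assumptions: a fair six-sided die as the population (true mean 3.5), a sample size of 200, and 2000 repetitions, none of which come from the thread. Each repetition builds one interval sample_mean ± 1.96*SEM and records whether it contains the true mean.

```python
import random
import statistics

random.seed(1)   # fixed seed so the run is reproducible
TRUE_MEAN = 3.5  # population mean of a fair six-sided die (assumed population)


def interval_covers(n):
    """Draw one sample of size n and report whether its interval
    sample_mean +/- 1.96 * SEM contains the true mean."""
    z = [random.randint(1, 6) for _ in range(n)]
    m = statistics.fmean(z)
    sem = statistics.stdev(z) / n ** 0.5
    return m - 1.96 * sem <= TRUE_MEAN <= m + 1.96 * sem


n = 200        # sample size (illustrative assumption)
trials = 2000  # number of repeated experiments (illustrative assumption)

# Each trial is a yes/no event; the proportion of "yes" is the coverage.
coverage = sum(interval_covers(n) for _ in range(trials)) / trials
print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

Any single interval either contains 3.5 or it does not; the 95% figure describes the long-run proportion of intervals that do, which is what `coverage` estimates here.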
