
I am a little confused about the normality assumption in the t-test for the difference of two independent means. My reasoning is as follows:

  1. If the population distributions are non-normal, that's fine as long as the sampling distribution of the mean is normal in both populations for the given sample size.
  2. If the sampling distributions of the means are non-normal for the given sample sizes, we still get a valid t-test if the distribution of the difference in sample means is approximately normal.
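
As a rough sketch of why point 2 can happen, consider the skewness of the difference in sample means. The symbols here are assumptions introduced for illustration: $n_i$ is the sample size, $\sigma_i^2$ the variance, and $\kappa_{3,i}$ the third central moment of population $i$, with the two samples independent. Third cumulants add across independent terms and change sign under negation, so

$$
\operatorname{skew}\left(\bar{X}_1 - \bar{X}_2\right)
= \frac{\kappa_{3,1}/n_1^2 - \kappa_{3,2}/n_2^2}{\left(\sigma_1^2/n_1 + \sigma_2^2/n_2\right)^{3/2}}
$$

When both groups have the same distribution and the same sample size, the numerator is zero, so the difference is symmetric even if each sample mean is still skewed. This only describes the numerator of the t statistic; it says nothing by itself about the estimated standard error in the denominator.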

I've run many simulations, and it looks like even when the distribution of the sample mean is non-normal for a certain sample size (say 20000 observations per variant), I still get an approximately normal distribution of sample mean differences, even with sample sizes smaller than 20000 observations. Since the calculations are based on the approximately normal distribution of the sample mean differences, can't I assume that I still get a valid t-test?
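
A minimal sketch of the kind of simulation described above, not the exact setup I used. The exponential population, the small sample size `n=20`, the number of replications, and the function name `simulate` are all assumptions chosen so the example runs quickly. It compares the skewness of one group's sample means with the skewness of the mean differences, and also estimates the rejection rate of Welch's t-test when the null hypothesis is true.

```python
import numpy as np
from scipy import stats


def simulate(n=20, reps=20000, alpha=0.05, seed=0):
    """Simulate two independent groups drawn from the same skewed population.

    Assumption: an exponential population (skewness 2) stands in for
    whatever non-normal data the real experiment has.
    """
    rng = np.random.default_rng(seed)

    # Each row is one simulated experiment with n observations per group.
    a = rng.exponential(scale=1.0, size=(reps, n))
    b = rng.exponential(scale=1.0, size=(reps, n))

    mean_a = a.mean(axis=1)
    mean_b = b.mean(axis=1)
    diff = mean_a - mean_b

    # Welch's t-test for every simulated experiment (row-wise).
    _, p_values = stats.ttest_ind(a, b, axis=1, equal_var=False)

    return {
        "skew_mean_a": float(stats.skew(mean_a)),   # skewness of one sample mean
        "skew_diff": float(stats.skew(diff)),       # skewness of the difference
        "type1_rate": float(np.mean(p_values < alpha)),  # rejections under H0
    }


result = simulate()
print(result)
```

Under these assumptions the sample means stay visibly right-skewed while the differences are close to symmetric. The `type1_rate` entry is the more direct check of validity, since it reflects the whole t statistic and not only the numerator.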

Eugene Krall
  • Unless the third standardized absolute population moment $E\left[\left|\frac{X-\mu}{\sigma}\right|^3\right]$ is really large, you will have pretty accurate significance levels with such a large sample. The issue (if there is one) will tend to have more to do with power, since a t-test may have relatively low efficiency for a distribution that's distinctly non-normal. But unless your sample was made so large specifically to pick up a very small anticipated effect size, you're probably not lacking in power either, in which case you may not care. – Glen_b Sep 19 '23 at 04:59
