I am a little confused about the normality assumption in the t-test for the difference of two independent means. My reasoning is as follows:
- If the population distributions are non-normal, that's OK as long as the distribution of the sample mean is approximately normal for each population at the given sample size.
- If the distributions of the sample means are non-normal at the given sample sizes, we still get a valid t-test if the distribution of the difference in sample means is approximately normal.
I've run many simulations, and it looks like even when the distribution of the sample mean is non-normal at a certain sample size (say 20000 observations per variant), I still get an approximately normal distribution of sample mean differences, even with sample sizes smaller than 20000 observations. Since the calculations are performed on the (approximately normal) distribution of the sample mean differences, can't I assume that the t-test is still valid?
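
For illustration, a simulation of this kind could be sketched as below. This is only a minimal sketch under assumptions that are not part of my actual setup: an exponential population, 20 observations per variant, both variants drawn from the same distribution, and the helper names `sample_skewness` and `simulate_means` are all hypothetical.

```python
import numpy as np


def sample_skewness(x):
    """Biased sample skewness: third central moment over the cubed std. dev."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


def simulate_means(n, reps, seed=0):
    """Draw `reps` pairs of independent samples of size `n` from the same
    skewed population (exponential, an assumed choice) and return the
    sample means of each variant."""
    rng = np.random.default_rng(seed)
    means_a = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    means_b = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    return means_a, means_b


# Assumed settings: small n so that the sample mean is still visibly skewed.
means_a, means_b = simulate_means(n=20, reps=20000)
diffs = means_a - means_b

skew_a = sample_skewness(means_a)
skew_diff = sample_skewness(diffs)
print("skewness of sample means (variant A):", skew_a)
print("skewness of sample mean differences: ", skew_diff)
```

Note that in this sketch both variants share the same distribution and sample size, so the difference is symmetric by construction. Unequal sample sizes or differently shaped populations would be a stricter check, and symmetry alone says nothing about the tails.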