I have paired data with n=50. I am using a paired t-test to test the difference in quiz scores before and after an interval. My data for quiz scores after the interval is showing as not normally distributed (Shapiro-Wilk normality test, p-value = 5.833e-06). However, I need a normal distribution for the paired t-test.

I have tried a square transformation of the data, but it is still not normally distributed (Shapiro-Wilk p-value = 0.0003286). I have also tried subtracting each score from the max score and adding 1, which creates positively skewed data, and then applying a log transformation, but it is still not normally distributed (Shapiro-Wilk p-value = 0.01641).

I have read some sources online claiming that for some tests which assume normality, such as the paired t-test, it is still okay to go ahead with non-normal data, as the assumption relates to the residuals and not the actual data. I do not understand this on a deep enough level to make an educated decision. Should I go ahead with the paired t-test regardless of the non-normal data, or perform a non-parametric test such as Wilcoxon instead?
-
Welcome to CV. You definitely do not need anywhere near a Normal distribution for the paired t-test, because all that matters is the distribution of the differences; and even then, the test will be fine provided that distribution isn't terribly skewed. The next step, then, is to examine those differences. – whuber Apr 29 '23 at 18:01
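As a rough illustration of "examine those differences", here is a minimal Python sketch using only the standard library. The scores are made-up placeholder data, and the helper names (`paired_differences`, `sample_skewness`, `paired_t_statistic`) are hypothetical, not from any package. It shows that the paired t statistic is computed from the per-pair differences alone, so that is the distribution worth inspecting.

```python
import math
import statistics


def paired_differences(before, after):
    """Return after - before for each pair."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    return [a - b for b, a in zip(before, after)]


def sample_skewness(values):
    """Moment-based sample skewness, m3 / m2**1.5 (undefined if all values are equal)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def paired_t_statistic(before, after):
    """Paired t statistic: a one-sample t statistic on the differences."""
    d = paired_differences(before, after)
    return statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


# Hypothetical quiz scores, for illustration only (not the asker's data).
before = [6, 7, 5, 8, 9, 6, 7, 8, 5, 7]
after = [8, 9, 6, 10, 10, 7, 9, 10, 6, 10]

diffs = paired_differences(before, after)
print("differences:", diffs)
print("skewness of differences:", sample_skewness(diffs))
print("paired t statistic:", paired_t_statistic(before, after))
```

A histogram or normal Q-Q plot of `diffs` would be the natural next look; the skewness number is only a quick summary.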
-
With n=50 a permutation test will have virtually the same power as a t-test and does not care about normality. Why not use that instead? See, for example, here: https://stats.stackexchange.com/questions/64212/randomisation-permutation-test-for-paired-vectors-in-r – Michael Lew Apr 29 '23 at 21:24
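One common way to run a permutation test on paired data is to randomly flip the sign of each within-pair difference, which corresponds to swapping "before" and "after" within a pair. The sketch below is a Monte Carlo version in plain Python, not the R code from the linked question. The function name, the placeholder scores, the number of resamples, and the seed are all assumptions made for illustration.

```python
import random
import statistics


def sign_flip_permutation_test(before, after, n_resamples=10000, seed=0):
    """Two-sided Monte Carlo sign-flip permutation test for paired data.

    Under the null hypothesis that the labels within each pair are
    exchangeable, every difference is equally likely to be +d or -d.
    """
    diffs = [a - b for b, a in zip(before, after)]
    observed = abs(statistics.fmean(diffs))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        # Small tolerance guards against floating-point ties.
        if abs(statistics.fmean(flipped)) >= observed - 1e-12:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical quiz scores, for illustration only.
before = [6, 7, 5, 8, 9, 6, 7, 8, 5, 7]
after = [8, 9, 6, 10, 10, 7, 9, 10, 6, 10]
print("permutation p-value:", sign_flip_permutation_test(before, after))
```

The test statistic here is the mean difference; other statistics (for example the t statistic itself) could be substituted without changing the resampling logic.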
-
"My data for quiz scores after the interval is showing as not normally distributed." ... that's not relevant, since the derivation of the test would only assume normality of pair-differences under H0 (which is probably strictly false) and in large samples it's generally not going to need to be all that close anyway, at least if what you're mostly worried about is getting your chosen $\alpha$. Which is to say, even if the differences were non-normal it still may not be all that consequential, since they don't necessarily have to be normal under the alternative (where you likely are). – Glen_b Apr 30 '23 at 05:12
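The claim about the chosen $\alpha$ in large samples can be checked by simulation. The sketch below is only an illustration under stated assumptions: the differences are drawn from a deliberately skewed distribution with true mean zero (a unit exponential shifted by 1, chosen arbitrarily), n=50 matches the question, and the two-sided 5% critical value for 49 degrees of freedom is hard-coded as approximately 2.0096 rather than computed. The function name `simulated_type_one_error` is hypothetical.

```python
import math
import random

# Assumed: approximate two-sided 5% critical value of Student's t with 49 df.
T_CRIT_DF49 = 2.0096


def simulated_type_one_error(n=50, n_sims=4000, seed=1):
    """Estimate how often a nominal 5% paired t-test rejects a true H0
    when the pair-differences are skewed (Exponential(1) - 1, mean zero)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        d = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
        mean = sum(d) / n
        var = sum((x - mean) ** 2 for x in d) / (n - 1)  # sample variance
        t = mean / math.sqrt(var / n)
        if abs(t) > T_CRIT_DF49:
            rejections += 1
    return rejections / n_sims


print("estimated type I error rate:", simulated_type_one_error())
```

If the estimated rate lands near 0.05, the nominal level is being roughly maintained despite the skew; swapping in a distribution that resembles the actual differences would make the check more relevant to a particular data set.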