I am trying to determine the effectiveness of a learning intervention that we developed. We had two groups, control and experimental, and applied the intervention to the experimental group. Each group took a pretest and a posttest, and we have now tabulated the scores.
Here's what we have so far about the scores:
- the pretest and posttest scores are not normally distributed
- we were left with unequal sample sizes for the control and experimental groups, as some students unexpectedly dropped out of the activity
Based on what I have found on the Internet so far, here's what I think we can do:
- Check whether the pretest scores are similar across the two groups using the Mann-Whitney U test
- Do the same for the posttest scores
- Determine whether the increase from pretest to posttest scores (the gain score) is significant using the Wilcoxon signed-rank test
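To make the plan above concrete, here is a rough sketch using SciPy. The scores are randomly generated placeholders (not our data), the variable names are made up for illustration, and running the paired test separately within each group is my assumption about how the third step would be applied.

```python
import numpy as np
from scipy import stats

# Hypothetical placeholder scores; replace with the real tabulated scores.
rng = np.random.default_rng(0)
pre_ctrl = rng.integers(5, 20, size=18)
post_ctrl = pre_ctrl + rng.integers(-2, 4, size=18)
pre_exp = rng.integers(5, 20, size=25)
post_exp = pre_exp + rng.integers(0, 7, size=25)

# Steps 1 and 2: compare the two independent groups (unequal sizes are fine).
u_pre = stats.mannwhitneyu(pre_ctrl, pre_exp, alternative="two-sided")
u_post = stats.mannwhitneyu(post_ctrl, post_exp, alternative="two-sided")

# Step 3: paired pre/post comparison within each group.
# alternative="greater" tests whether posttest scores tend to exceed pretest scores.
w_ctrl = stats.wilcoxon(post_ctrl, pre_ctrl, alternative="greater")
w_exp = stats.wilcoxon(post_exp, pre_exp, alternative="greater")

print("pretest, control vs experimental: p =", u_pre.pvalue)
print("posttest, control vs experimental: p =", u_post.pvalue)
print("gain within control: p =", w_ctrl.pvalue)
print("gain within experimental: p =", w_exp.pvalue)
```

The Mann-Whitney U test treats the two samples as independent, while the Wilcoxon signed-rank test relies on each student's pretest and posttest being paired, so the arrays passed to `wilcoxon` must be in the same student order.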
I'd like to know:
- whether I have understood the above correctly, and
- which test would be appropriate for checking whether the gain scores obtained in the experimental group are significantly better than the gain scores obtained in the control group.
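One candidate I can think of for the second question is to compute each student's gain score and compare the two sets of gains with a one-sided Mann-Whitney U test. I am not sure this is the best choice, so treat the following as a sketch only; the scores are invented and `compare_gains` is a hypothetical helper name.

```python
import numpy as np
from scipy import stats

def compare_gains(pre_ctrl, post_ctrl, pre_exp, post_exp):
    """One-sided Mann-Whitney U test on gain scores (posttest minus pretest)."""
    gain_ctrl = np.asarray(post_ctrl) - np.asarray(pre_ctrl)
    gain_exp = np.asarray(post_exp) - np.asarray(pre_exp)
    # alternative="greater": experimental gains tend to be larger than control gains.
    result = stats.mannwhitneyu(gain_exp, gain_ctrl, alternative="greater")
    return gain_ctrl, gain_exp, result

# Hypothetical placeholder scores; replace with the real data.
gain_ctrl, gain_exp, result = compare_gains(
    pre_ctrl=[10, 12, 9, 14], post_ctrl=[11, 12, 10, 16],
    pre_exp=[11, 13, 8, 15, 10], post_exp=[16, 17, 14, 19, 17],
)
print("control gains:", gain_ctrl)
print("experimental gains:", gain_exp)
print("p-value:", result.pvalue)
```

With these made-up numbers every experimental gain exceeds every control gain, so the p-value comes out small; real data will of course be less clear-cut.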
EDIT:
- As suggested, I am adding a histogram and a QQ plot of the residuals of the gain score (posttest minus pretest score) to check normality (which I now understand applies to the residuals and not the data itself). I based the QQ plot on a couple of articles and YouTube videos; I hope I got it right.
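For reference, this is roughly how the QQ points can be computed with SciPy. The gain scores below are placeholders, and taking the residuals as each gain minus its own group's mean is my assumption about what "residuals" means for a two-group comparison.

```python
import numpy as np
from scipy import stats

# Hypothetical placeholder gain scores (posttest minus pretest); replace with real data.
gain_ctrl = np.array([1, 0, 1, 2, -1, 3, 0, 2], dtype=float)
gain_exp = np.array([5, 4, 6, 4, 7, 2, 3, 5, 6, 4], dtype=float)

# Residuals (assumed definition): each gain minus its group's mean gain, pooled.
residuals = np.concatenate([gain_ctrl - gain_ctrl.mean(),
                            gain_exp - gain_exp.mean()])

# probplot returns the QQ points and a least-squares line fitted through them.
(theoretical_q, ordered_resid), (slope, intercept, r) = stats.probplot(
    residuals, dist="norm"
)

# Shapiro-Wilk test as a numeric companion to the visual check.
shapiro_stat, shapiro_p = stats.shapiro(residuals)

print("QQ line correlation r =", r)
print("Shapiro-Wilk p =", shapiro_p)
```

Passing a Matplotlib axes object as `plot=ax` to `stats.probplot` draws the QQ plot directly; points lying close to the fitted line suggest the residuals are roughly normal.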


I also just found a discussion on the bootstrap, and it looks like I can use it to compare my control and treatment/experimental groups. Thanks for your help. I'll try to update my post with my results once I'm done.
– MG B Feb 10 '21 at 05:19
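A minimal sketch of what such a bootstrap comparison might look like, assuming a percentile bootstrap of the difference in mean gain scores with each group resampled separately. The gains are placeholders, and `bootstrap_mean_diff` and its parameters are hypothetical names, not from any library.

```python
import numpy as np

def bootstrap_mean_diff(gain_exp, gain_ctrl, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(gain_exp) - mean(gain_ctrl)."""
    rng = np.random.default_rng(seed)
    gain_exp = np.asarray(gain_exp, dtype=float)
    gain_ctrl = np.asarray(gain_ctrl, dtype=float)
    # Resample each group separately, with replacement, keeping its own size.
    exp_means = rng.choice(gain_exp, size=(n_boot, gain_exp.size)).mean(axis=1)
    ctrl_means = rng.choice(gain_ctrl, size=(n_boot, gain_ctrl.size)).mean(axis=1)
    diffs = exp_means - ctrl_means
    lower, upper = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    observed = gain_exp.mean() - gain_ctrl.mean()
    return observed, lower, upper

# Hypothetical placeholder gain scores; replace with the real data.
observed, lower, upper = bootstrap_mean_diff(
    gain_exp=[5, 4, 6, 4, 7], gain_ctrl=[1, 0, 1, 2]
)
print("observed difference in mean gain:", observed)
print("95% percentile interval:", lower, upper)
```

If the interval excludes zero, that points to a difference in mean gain between the groups. Resampling within each group keeps the unequal group sizes intact.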