Which statistical test to use with repeated measures & clustered observations?

Question

I am doing an experimental study for my research. I have a class with 24 students and six groups, where each group includes 4 students. I want to compare the performance of each group in two cases: first, the performance of groups when using my prototype, and second, the performance of groups without using my prototype. I used a team satisfaction questionnaire.

I am wondering which statistical test is suitable in my case, given that I have only 24 data points.

Mann-Whitney $U$-test
paired $t$-test
Wilcoxon test

Welcome to CV! I made a few edits to your question to clarify it. You can use the widgets at the top of the compose box to help with formatting - the way you use line breaks has no effect except if there are 2 line breaks in a row — Russ Lenth, Oct 25 '14 at 18:00
I don't understand the prototype part of this. Do you have 3 groups with your prototype and 3 without, or do you have 2 measures on each subject, or what? — Russ Lenth, Oct 25 '14 at 18:03
Your question isn't totally clear. I made some guesses & edited it for clarity. Please ensure that it represents your situation accurately. Do you have 2 questionnaire scores per student as your response (Y) value? Is your goal to see if using the prototype improves team satisfaction? Did everybody use the prototype 1st & no prototype the 2nd time? — gung - Reinstate Monica, Oct 25 '14 at 18:23
"Do you have 2 questionnaire scores per student as your response (Y) value?" Yes. "Did everybody use the prototype 1st & no prototype the 2nd time?" Yes. — flower, Oct 25 '14 at 18:28
My friend told me I cannot use a paired $t$-test because my sample is smaller than 30. Is that true? — flower, Oct 25 '14 at 18:31
You can use a paired $t$ test for any sample size, as long as it is appropriate. So your friend is confused about the rules for using them. However, I still don't understand the situation well enough to know whether it is appropriate here. — Russ Lenth, Oct 25 '14 at 22:08

score 2 · Answer 1 · answered Oct 28 '14 at 02:17

None of the tests you list will be valid. The reason is that your data are not independent, and all of those tests require independent observations (or, for the paired tests, independent pairs). Here, the ratings from students within the same group will be correlated with each other. As a result, you need to use some form of mixed effects / multilevel model.

The fact that your $N < 30$ is not the key issue. The Mann-Whitney $U$-test and the Wilcoxon signed-rank test are appropriate for smaller samples from non-normal populations (Mann-Whitney is for unpaired data and Wilcoxon is for paired data, so of the two you would use the latter), but neither addresses the within-group clustering.

If the population from which your data were drawn were normal, you could use a linear mixed effects model. If you aren't willing to assume normality (which seems prudent, given that these are ratings), you would probably do best to use a mixed effects ordinal logistic regression model. I don't know what software you use, but in R this can be done using the ordinal package. These are somewhat advanced models to fit; if you aren't familiar with them, you should work with a statistical consultant.
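
To make the structure concrete, here is one way the two models could be written down. This is only a sketch in my own notation (none of these symbols come from the question): $y_{ijk}$ is the rating from student $i$ in group $j$ under condition $k$, and $x_k$ is 1 with the prototype and 0 without. The linear version would be something like

$$
y_{ijk} = \beta_0 + \beta_1 x_k + u_j + v_{ij} + \varepsilon_{ijk}, \qquad
u_j \sim N(0, \sigma_u^2), \quad v_{ij} \sim N(0, \sigma_v^2), \quad \varepsilon_{ijk} \sim N(0, \sigma^2),
$$

where $u_j$ is a random intercept for each of the 6 groups, $v_{ij}$ is a random intercept for each of the 24 students, and $\beta_1$ is the prototype effect you care about. The ordinal analogue keeps the same linear predictor but models cumulative probabilities of the rating categories $c$:

$$
\operatorname{logit} P(y_{ijk} \le c) = \theta_c - \left(\beta_1 x_k + u_j + v_{ij}\right),
$$

with ordered thresholds $\theta_c$ in place of a single intercept. In the ordinal package, models of this kind are fit with `clmm()`. Assuming hypothetical column names, the call might look something like `clmm(rating ~ condition + (1 | group) + (1 | student), data = d)`, with `rating` stored as an ordered factor and student IDs unique across groups.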

On a separate note, be aware that your prototype is confounded with order: everybody used the prototype first and went without it second. If you conclude that the before and after differ, you unfortunately cannot tell whether that is because of the prototype or simply because these things change over time.
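
If you were able to rerun the study, one common way to separate the prototype from order (not what was done here) is to counterbalance: randomly assign half of the groups to use the prototype first and the other half to use it second. A minimal sketch of such an assignment follows; the group labels, the function name `counterbalance`, and the seed are all made up for illustration.

```python
import random


def counterbalance(teams, seed=0):
    """Randomly split teams into two equally sized order groups.

    Returns a dict mapping each team to the order in which it
    would experience the two conditions.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(teams)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2

    orders = {}
    for team in shuffled[:half]:
        orders[team] = ("prototype", "no prototype")
    for team in shuffled[half:]:
        orders[team] = ("no prototype", "prototype")
    return orders


# Six hypothetical group labels, matching the six groups in the question.
teams = [f"group{j}" for j in range(1, 7)]
assignment = counterbalance(teams)

for team in teams:
    print(team, "->", " then ".join(assignment[team]))
```

With order varied across groups, an order term could be added to the model alongside the prototype effect, so the two are no longer indistinguishable.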
