My Data: Respondents were asked to evaluate the quality of two products on a scale of 0-10. There were 12 criteria that constituted the grading scheme, and I would like to analyze the scores overall (as an average of the scores across all the criteria) and per criterion.
Each respondent assessed both products. There were 25 respondents in total, meaning 25 scores for each of the two products.
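To make the layout concrete, here is a minimal sketch of how the paired differences could be formed, both overall (the average across criteria) and per criterion. The scores are randomly generated placeholders rather than my actual data, and the names `scores_a`, `scores_b`, and `paired_differences` are hypothetical.

```python
import random
from statistics import mean

N_RESPONDENTS = 25
N_CRITERIA = 12

rng = random.Random(0)
# Placeholder data: scores_x[r][c] is respondent r's 0-10 score on criterion c.
scores_a = [[rng.randint(0, 10) for _ in range(N_CRITERIA)] for _ in range(N_RESPONDENTS)]
scores_b = [[rng.randint(0, 10) for _ in range(N_CRITERIA)] for _ in range(N_RESPONDENTS)]

def paired_differences(a, b):
    """Return (overall, per_criterion) differences, product A minus product B."""
    # Overall: one difference per respondent, using the average across criteria.
    overall = [mean(ra) - mean(rb) for ra, rb in zip(a, b)]
    # Per criterion: one list of respondent-level differences for each criterion.
    per_criterion = [[ra[c] - rb[c] for ra, rb in zip(a, b)] for c in range(len(a[0]))]
    return overall, per_criterion

overall_diff, criterion_diffs = paired_differences(scores_a, scores_b)
print(len(overall_diff), len(criterion_diffs), len(criterion_diffs[0]))  # 25 12 25
```

Each analysis, overall or per criterion, then works on a list of 25 within-respondent differences.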
Objective: To see if one product is better than the other.
Possibilities: The statistical test should be one that finds group differences while accounting for paired data since the scores for each product are linked by the respondent (because the respondents in both cases are the same). My data is definitely non-normal.
I initially considered the Wilcoxon signed-rank test, although I could not derive p-values (which my collaborators would prefer) due to ties.
I then considered permutation-based tests and, after some reading, I am also wondering whether the Kolmogorov-Smirnov (K-S) test might be appropriate for my goals.
I have experimented with both the median and the mean difference as the test statistic for my permutation test. Whilst most of the p-values are comparable between the two approaches (roughly two-thirds, I would guess), some are vastly different (differing by between 0.1 and 0.5).
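For reference, the kind of paired permutation test I mean can be sketched as a sign-flip test: under the null hypothesis that the two products are interchangeable, each respondent's difference is equally likely to be positive or negative. This is a minimal sketch rather than my exact code. The function name `sign_flip_test`, its parameters, and the `diffs` values are all hypothetical and for illustration only.

```python
import random
from statistics import mean, median

def sign_flip_test(diffs, stat=mean, n_perm=10000, seed=0):
    """Two-sided Monte Carlo p-value for a paired (sign-flip) permutation test."""
    rng = random.Random(seed)
    observed = abs(stat(diffs))
    hits = 0
    for _ in range(n_perm):
        # Randomly swap the product labels within each respondent.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(stat(flipped)) >= observed:
            hits += 1
    # The add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical paired differences (product A minus product B), 25 respondents.
diffs = [2, 0, 1, -1, 3, 0, 0, 1, 2, -2, 0, 1, 1,
         4, 0, -1, 2, 0, 1, 0, 3, -1, 0, 1, 2]

p_mean = sign_flip_test(diffs, stat=mean)
p_median = sign_flip_test(diffs, stat=median)
print(f"mean statistic:   p = {p_mean:.4f}")
print(f"median statistic: p = {p_median:.4f}")
```

Flipping signs within respondents, rather than shuffling scores across respondents, is what keeps the pairing intact. Only the `stat` argument changes between the two versions of the test.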
My objective is simply to know if one product is better than the other. Since all of these approaches compare the two groups but in different ways, I am not sure which would be preferred in my circumstances. I think presenting the results of multiple tests gives a fairer all-round picture, but in order to do that I will also have to understand how the differences in the conclusions of the tests arise, i.e., what causes them.
Questions
Does one test - either one I've mentioned or some other one - make more sense than the others in my circumstances?
If I opt for the permutation test, does either the median or the mean make more sense in my scenario?
I will probably end up presenting both and either use one as the main statistic with the other as a sensitivity analysis or simply present both and discuss the results of both statistics combined. In this case, though, I have to make sure I can describe the results adequately. Hence:
a) If there are large differences between the median and the mean when I use the permutation test, how can I understand what these differences tell me?
b) I know that it is evidence that the median and mean are at different locations, but what is the deeper meaning of this in relation to my objectives, and how should it influence how I report the results?