I have a set of scores, each ranging from -12 to +12, for the categories of a single variable. Two populations each have a score for every category of that variable. The scores look something like this:
| Category | Population One | Population Two |
|---|---|---|
| Red | +7 | -3 |
| Orange | +1 | -2 |
| Yellow | 0 | +5 |
| Purple | +2 | -1 |
| Green | -10 | +1 |
The scores for the two populations are very different within each category (e.g., Green has a strong negative association for Population One and a positive association for Population Two). Additionally, the categories clearly differ from one another within a population (e.g., Green scores very negatively compared to Red in Population One).
Is there a way to statistically "prove" whether the differences are significant?
Is there then a way to "prove" which pairs are the most significantly different (e.g., that Green differs the most between Population One and Population Two)?
Note that the scores for a given population will always sum to zero or near zero, so comparing the population means would be uninformative.
Is it justifiable to report significance based on the standard deviation of each population's scores? For example: "The score for Green in Population One is two standard deviations below zero, so it is the most significant score," or "Any score less than one standard deviation from zero was not considered significant."
If this approach is justifiable, should I calculate the spread around the median rather than the mean, given that the mean will be zero or near zero? (A rough sketch of what I have in mind is below.)
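To make that concrete, here is a rough sketch of the kind of calculation I mean, using Python/NumPy and the example scores from the table above (the one-standard-deviation cutoff is just an arbitrary placeholder, not something I have justified):

```python
import numpy as np

# Example category scores for Population One (from the table above)
pop_one = {"Red": 7, "Orange": 1, "Yellow": 0, "Purple": 2, "Green": -10}
scores = np.array(list(pop_one.values()), dtype=float)

# Spread measured around the mean (the usual sample standard deviation)
sd_mean = scores.std(ddof=1)

# Spread measured around the median instead, since the mean is forced to ~0
spread_median = np.sqrt(np.mean((scores - np.median(scores)) ** 2))

for cat, s in pop_one.items():
    z = s / sd_mean
    flag = "significant?" if abs(z) > 1 else "not significant?"
    print(f"{cat}: {z:+.2f} SDs from zero -> {flag}")
```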
It is feasible to get a score for each individual in each population and then take the "population score" to be the mean of the individual scores. Would that make it easier to run statistical tests for significance? (A sketch of what I imagine is below.)
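If I did collect individual-level scores, I imagine doing something like the following for each category (a rough sketch with simulated placeholder data; I am assuming SciPy's two-sample t-test and Mann-Whitney U test are the sorts of tests one would reach for):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated placeholder data: individual Green scores in each population
green_pop_one = rng.normal(loc=-10, scale=3, size=50)
green_pop_two = rng.normal(loc=1, scale=3, size=60)

# Welch two-sample t-test comparing the two populations for this category
t_stat, p_t = stats.ttest_ind(green_pop_one, green_pop_two, equal_var=False)
print(f"Green t-test: t = {t_stat:.2f}, p = {p_t:.3g}")

# Non-parametric alternative if the individual scores are not roughly normal
u_stat, p_u = stats.mannwhitneyu(green_pop_one, green_pop_two)
print(f"Green Mann-Whitney: U = {u_stat:.1f}, p = {p_u:.3g}")
```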
Thank you in advance for any advice!