I have test scores for two groups, A and B. Group A consists of 186 individuals, whereas group B has only 100. The test scores range from 1 to 12, and because group A has more people, its total score is naturally higher than group B's. How can I standardize the test scores so that I can compare the groups? I thought about dividing group A's total score by 186 and group B's by 100. Would that work?

Edit:

The scores are divided into 7 categories, such as agitation, irritability, and sleep. Here is an example of one.

    > table(status, sleep_scores)
    status   0   1   2   3   4   6   8   9
     0     170   3   3   7   0   1   1   1
     1      82   5   3   6   3   1   0   0

Status is a binary variable: 0 means that the participant does not have the disease, and 1 means that they do. So in the table above, 170 of the normal controls (people who do not have the disease) scored 0 on the test (the lower the score, the better the performance). 3 controls scored 1, whereas in the disease group 5 people scored 1. I have 7 sets of these scores, and they are all fairly similar (most people score 0, and only a few score above 0). I would like to determine which group is doing worse, and in which category (sleep, agitation, etc.) they are doing worse.

Adrian

1 Answer

Dividing each group's total by its size gives the group mean, so that form of standardization amounts to a comparison of means. This is a very common thing to do; however, the scores need to be something for which an average carries meaning.

If you're interested in seeing which is bigger on average (or whether it's plausible that both are random samples from population distributions with the same mean), then this standardization would (likely) not only be reasonable, but pretty much necessary.

So it's probably a good place to start.
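As a rough illustration of that starting point, here is a minimal Python sketch that computes each group's mean from a frequency table. The counts are copied from the sleep-score table in the question; the names `control`, `disease`, and `group_mean` are made up for this example, and the result only carries meaning if averaging these scores is sensible in the first place.

```python
# Frequency tables copied from the sleep-score table in the question:
# score -> number of participants with that score.
control = {0: 170, 1: 3, 2: 3, 3: 7, 4: 0, 6: 1, 8: 1, 9: 1}  # status 0
disease = {0: 82, 1: 5, 2: 3, 3: 6, 4: 3, 6: 1, 8: 0, 9: 0}   # status 1


def group_mean(freq):
    """Return total score divided by group size for a score -> count table."""
    n = sum(freq.values())
    total = sum(score * count for score, count in freq.items())
    return total / n


control_mean = group_mean(control)
disease_mean = group_mean(disease)

print(f"control: n={sum(control.values())}, mean={control_mean:.3f}")
print(f"disease: n={sum(disease.values())}, mean={disease_mean:.3f}")
```

Dividing by the group size is what removes the effect of group A simply having more people.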

If you're interested in comparing some other statistic - or even the whole distribution, rather than just the means, you might do something else.

If you just want to make some visual comparison, there are a number of suitable choices you might consider, but since the values are integers over a small range, you would probably find something like a histogram suitable, though there are other choices.
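For a comparison of the whole distributions, one simple option is to convert the counts to within-group proportions, so that the different group sizes no longer matter. The sketch below does this for the sleep-score table from the question and prints the two columns side by side; the variable and function names are illustrative only.

```python
# Frequency tables copied from the sleep-score table in the question:
# score -> number of participants with that score.
control = {0: 170, 1: 3, 2: 3, 3: 7, 4: 0, 6: 1, 8: 1, 9: 1}  # status 0
disease = {0: 82, 1: 5, 2: 3, 3: 6, 4: 3, 6: 1, 8: 0, 9: 0}   # status 1


def proportions(freq):
    """Convert a score -> count table into score -> share of the group."""
    n = sum(freq.values())
    return {score: count / n for score, count in sorted(freq.items())}


control_prop = proportions(control)
disease_prop = proportions(disease)

# Both tables list the same score values, so one loop covers both columns.
print(f"{'score':>5} {'control':>8} {'disease':>8}")
for score in control_prop:
    print(f"{score:>5} {control_prop[score]:>8.3f} {disease_prop[score]:>8.3f}")
```

These proportions are the bar heights a relative-frequency histogram of each group would show.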

If you can explain more about what you're trying to achieve we may be able to say more.

Glen_b
  • hmm, I see. I edited my main post with a little more information. Would you take a look at that? – Adrian Aug 03 '14 at 17:04
  • The information in your edit would substantially alter my advice, since the scores don't sound like they're interval scaled. Indeed some parts of your discussion make them sound like they may not even be fully ordinal. Please provide more information about the scale in your post. – Glen_b Aug 03 '14 at 22:57
  • The scores are integers ranging from 0 to 12. What exactly do you mean by scale? – Adrian Aug 04 '14 at 00:03
  • What do the scores mean? Is the distance between 0 and 1 really the same as the distance between 3 and 4 or 8 and 9? See also here. – Glen_b Aug 04 '14 at 00:05
  • Thanks for the link. From what I've read I think the scores are ordinal. – Adrian Aug 04 '14 at 03:56
  • If they are ordinal, then it probably doesn't really make sense to compare means, but that depends on how the scores are generated. If they are sums of component parts (like Likert scales composed of several questions, say) then they were already assumed to be interval. – Glen_b Aug 04 '14 at 08:19
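If the scores are treated as purely ordinal, one summary that uses only their ordering is the probability that a randomly chosen participant from the disease group scores higher than a randomly chosen control, counting ties as one half. This is not something proposed in the thread; it is a swapped-in illustration (the quantity underlying the Mann-Whitney U statistic), sketched here on the sleep-score table from the question with made-up names.

```python
# Frequency tables copied from the sleep-score table in the question:
# score -> number of participants with that score.
control = {0: 170, 1: 3, 2: 3, 3: 7, 4: 0, 6: 1, 8: 1, 9: 1}  # status 0
disease = {0: 82, 1: 5, 2: 3, 3: 6, 4: 3, 6: 1, 8: 0, 9: 0}   # status 1


def prob_higher(freq_y, freq_x):
    """Estimate P(Y > X) + 0.5 * P(Y == X) from two score -> count tables.

    Only the ordering of the scores is used, never their differences.
    """
    greater = 0  # (y, x) pairs where the Y participant scores higher
    ties = 0     # (y, x) pairs where both participants score the same
    for y, count_y in freq_y.items():
        for x, count_x in freq_x.items():
            if y > x:
                greater += count_y * count_x
            elif y == x:
                ties += count_y * count_x
    n_pairs = sum(freq_y.values()) * sum(freq_x.values())
    return (greater + 0.5 * ties) / n_pairs


p = prob_higher(disease, control)
print(f"P(disease score > control score), ties counted as half: {p:.3f}")
```

A value of 0.5 would mean neither group tends to score higher; values above 0.5 mean the disease group tends to score higher (worse). This is a descriptive number, not a significance test.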