
I have a set of final exam grades for an entire year of students, and I need to calculate quintiles from them. How should I go about it?

Also, is the range from the arithmetic average to the top of the set smaller than the range from the third quintile to the fifth quintile?
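As a minimal sketch of the calculation being asked about, the snippet below computes the four quintile cut points with Python's standard `statistics.quantiles` and then checks which of the five intervals the arithmetic mean lands in. The `grades` list and the helper `interval_of` are hypothetical, made up for illustration, and the two `method` values are just the two conventions the standard library happens to offer; other software may use different rules.

```python
import statistics

# Hypothetical grades on a 0-100 scale (illustrative, not from the post).
grades = [0, 90, 92, 94, 95, 96, 97, 98, 99, 100]

# Four cut points split the sorted data into five groups.
# The two built-in conventions interpolate differently, so the
# cut points need not agree.
cuts_exclusive = statistics.quantiles(grades, n=5, method="exclusive")
cuts_inclusive = statistics.quantiles(grades, n=5, method="inclusive")

mean = statistics.mean(grades)


def interval_of(value, cuts):
    """Return 1..5: the quintile interval that `value` falls in."""
    # Count how many cut points lie strictly below the value.
    return 1 + sum(value > c for c in cuts)


mean_interval = interval_of(mean, cuts_inclusive)

print("exclusive cut points:", cuts_exclusive)
print("inclusive cut points:", cuts_inclusive)
print("mean:", mean, "falls in interval", mean_interval)
```

With skewed data like this, the mean does not have to sit near the middle interval, so the comparison in the second question depends on the shape of the grade distribution.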

Glen_b
ppp
  • I'm not sure I understand the last paragraph. Do you mean the range of the scores within quantiles? – Patrick Coulombe Oct 25 '15 at 16:17
  • @PatrickCoulombe I mean that, if we put all our grades in a histogram and split that histogram into quintiles, and also into the rightmost-to-average and leftmost-to-average parts, will the third, fourth and fifth quintiles always have more members than the rightmost-to-average set? Sorry for my English (and for my near-total ignorance of statistics in general). – ppp Oct 25 '15 at 16:30
  • 3
    1. Each quintile will have the same number of members (assuming the total population is a multiple of 5). 2. The arithmetic average could be in any quintile. Unless you have a known distribution and a large sample size, you cannot know in advance into which quintile the arithmetic mean will fall. For example, say you have a small class where most students did extremely well and one student scored a zero. It is quite possible that your average will fall in the bottom quintile. – C8H10N4O2 Oct 25 '15 at 20:09
  • @C8H10N4O2 what if it's more of a log-normal distribution? – ppp Oct 25 '15 at 20:17
  • 3
    Strictly, the quintiles are the values such that 20, 40, 60 and 80% of values are smaller. It is common to extend the term to the intervals those values define. Even if the number of values is a multiple of 5, you need a convention on how the quintiles are defined: for a group of 100, that might be to use the average of the 20th and 21st smallest values, and so on. For a sample size that is not a multiple of 5, you need a convention even more, and several have been suggested. Good statistical software will always have a dedicated command, function or routine, but it might be under a name such as quantile, percentile or centile. – Nick Cox Oct 25 '15 at 23:29
  • @PatoSáinz The quick solution is to rank your exam scores and group them into 5 buckets. If you've indicated how your final exam is graded, I missed it. Are the scores scaled from 0 to 100? Given ties and bunching around a few values, I don't think it's likely that you will end up with equally sized quintiles. For instance, if you have a bunch of exams clustered around 84, 85 and 86, would you put them in separate buckets? Also, your question about the arithmetic mean is better answered using the median rather than the mean, since the two are only guaranteed to agree when the scores are symmetrically distributed (bell-shaped, for example). – user78229 Oct 26 '15 at 11:41
  • @PatoSáinz Also, good joke with your CV handle! – user78229 Oct 26 '15 at 11:44
  • @DJohnson Good advice generally, but you're indulging the common confusion between quintiles and the intervals they define. Also, at this level "size" can be ambiguous, as between the frequency of values in an interval and its width. So, one ambiguity can feed another. – Nick Cox Oct 26 '15 at 11:45
  • @NickCox Interesting comment. Would you say more about the "common confusion?" – user78229 Oct 26 '15 at 11:47
  • The point was made in my previous comment. The first quintile is defined by 20% of values being lower (lots of small print about the precise rules). Some then take the interval with that as upper limit as being also the first quintile. In some fields it is worse: people want to classify values in quintile bins or groups, which are then labelled by their upper limits, i.e. the individual values are thrown away in subsequent analysis. – Nick Cox Oct 26 '15 at 12:43
  • @NickCox There is no shortage of dumb stuff that people do with data, e.g., and to your point, throwing away individual values in favor of retaining quintile assignments. As analysts, there's only so much stat policing that can be indulged or, for that matter, that the "great unwashed" will sustain. My point was really about relaxing strictly assigned boundaries -- however defined, with all the caveats and nuances noted in this thread -- that would otherwise split "bunched" scores (84s, 85s and 86s) across separate quintiles. – user78229 Oct 26 '15 at 18:27
  • 1
    @DJohnson I think we do agree. My own view is that if quantiles are also tied values (e.g. 42 is a quantile and there are several values of 42), then any quantile-based binning must assign all 42s to the same bin even if the price is now unequal numbers in bins that "should" have equal numbers. People using my favourite software are often puzzled by this and don't see that the alternative of assigning some 42s to one bin and the others to another is quite arbitrary, especially for comparing with other variables. – Nick Cox Oct 26 '15 at 18:33
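
The tie-handling rule discussed in the last few comments can be sketched as follows. This is an illustration under stated assumptions, not any particular package's behaviour: the `scores` list is invented, `quintile_bin` is a hypothetical helper, and the cut points use the standard library's "inclusive" convention, which is only one of several in use. Because the bin depends on the value alone, all tied scores land in the same bin, at the price of unequal bin counts.

```python
from bisect import bisect_left
from collections import Counter
import statistics

# Hypothetical scores with heavy bunching at 84-86 (illustrative only).
scores = [60, 70, 84, 84, 84, 84, 85, 85, 85, 86, 86, 86, 86, 90, 95]

# Four cut points under one possible convention.
cuts = statistics.quantiles(scores, n=5, method="inclusive")


def quintile_bin(score, cuts):
    """Return a bin 1..5 that depends only on the value, so ties never split."""
    # bisect_left counts the cut points strictly below the score;
    # a score equal to a cut point therefore stays in the lower bin.
    return bisect_left(cuts, score) + 1


# Every distinct score maps to exactly one bin.
bin_of_score = {s: quintile_bin(s, cuts) for s in sorted(set(scores))}

# Bin sizes are no longer forced to be equal (a bin may even be empty).
counts = Counter(quintile_bin(s, cuts) for s in scores)

print("cut points:", cuts)
print("bin of each distinct score:", bin_of_score)
print("bin sizes:", [counts[b] for b in range(1, 6)])
```

Sending a score that equals a cut point to the lower bin is itself a choice; the opposite choice (`bisect_right`) is equally defensible, as long as it is applied consistently.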