
The sum of the mean and standard deviation of a non-normal distribution can exceed the value of the largest sample. For a good explanation of why, see "Can mean plus one standard deviation exceed maximum value?".

My related "going-the-other-way" question is this: if a set of 15,000 samples, all of them ages between 9 and 20 that meet a certain criterion, and so with a mean plainly no smaller than 9, is claimed to have a standard deviation of 16, what can one conclude about the distribution of those samples?

This is not a hypothetical example; it's taken from a paper published in the 1990s that I'm trying to understand and possibly find flaws in.

jsbox
  • 6
    If the range of the random variable is 20 - 9 = 11, then it's impossible to have a standard deviation of 16, which is bigger than the range. – Amin Shn Jan 31 '23 at 19:32
  • 3
    Is the standard deviation really of age, or is it of some other variable? As @AminShn points out (+1), you can't get a standard deviation of age that large - 5.5 is the max you can achieve - with that range of data. – jbowman Jan 31 '23 at 19:42
  • 1
    @Amin great observation! In fact, the standard deviation of a bounded random variable is always less than or equal to half the range of that variable. (Also, this comment is obsolete because jbowman +1 was faster than me :)) – Maximilian Janisch Jan 31 '23 at 19:44
  • 1
    @MaximilianJanisch - I always knew that the speed typing class I took in high school would pay off someday! – jbowman Jan 31 '23 at 19:59
  • 1
    "possibly find flaws in" . . . I'm making a note here: Huge success. – Glen_b Jan 31 '23 at 21:28
  • 1
    I wonder whether the "standard deviation" is really the variance. For ages 9 - 20 rounded to a whole number and having nearly a uniform distribution, the variance ought to be close to 12, which is not far off from 16. – whuber Jan 31 '23 at 22:13
  • I'm referring to this paper, in particular Table 1, Line 3, which says that during an 8-year period in Finland, 15,266 females under the age of 20 had an abortion. According to the caption, the parenthesized figures next to the absolute counts are (SD) ("standard of deviation"), and there's a "(16.3)" next to the 15,266 figure. Other (SD)s in the line below also seem implausible. My question used an age range of 9 to 20, since girls under age 9 presumably can't be included. – jsbox Feb 01 '23 at 22:32
  • 1
    Ah, I figured it out. The parenthesized numbers are percentages of the top number in the column, not (SD)s. One caption says percentages are parenthesized, the other caption implies standard of deviation is parenthesized. – jsbox Feb 01 '23 at 22:51
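
The two numerical claims in the comments above (a maximum standard deviation of 5.5, and a variance near 12 for roughly uniform whole-number ages) can be checked with a rough sketch like the one below. It assumes the 9 to 20 age range used in the question and the population (divide-by-n) definitions of variance and standard deviation; the variable names are purely illustrative and nothing here comes from the paper itself.

```python
import statistics

# Age bounds assumed in the question (not taken from the paper).
lo, hi = 9, 20

# For values confined to [lo, hi], the population standard deviation
# can be at most half the range (Popoviciu's inequality on variances).
max_sd = (hi - lo) / 2

# The bound is reached by the most spread-out data set possible:
# half of the 15,000 samples at each endpoint.
extreme = [lo, hi] * 7500
extreme_sd = statistics.pstdev(extreme)

# Population variance of whole-number ages spread evenly over 9..20,
# the near-uniform case raised in the comments.
uniform_ages = list(range(lo, hi + 1))
uniform_var = statistics.pvariance(uniform_ages)

print("largest possible SD:", max_sd)
print("SD with half the samples at each endpoint:", extreme_sd)
print("variance of uniform whole-number ages:", uniform_var)
```

Under these assumptions the largest achievable standard deviation is 5.5, so a reported value of 16 cannot be a standard deviation of age, while the uniform-case variance of 143/12 (about 11.92) matches the "close to 12" estimate.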
