In the article "Bayes or not Bayes, is this the question?", in the paragraph before Figure 2, the author says:

Let us suppose that we want to investigate whether the sex ratio in hypothetical mice population is 1:1. We can create two experimental designs. In the first experiment, we can randomly select a mouse until the first male is chosen. The result in this experiment is the total number of mice chosen. In the second experiment, we can randomly select exactly seven mice. The result of this experiment will be the number of male and female mice in a sample of seven. Let us suppose that the result was FFFFFFM. If we do not know what experimental design was used, this result is useless. In the first experiment, the P is 0.031, but in the second experiment, the P value is 0.227.

To check this, I assume a null hypothesis of equal probability of male and female mice. Writing $M$ for the number of trials needed to obtain the first success (counting the success trial itself), the first value should be the probability that $M \ge 7$, which requires the first $m-1 = 6$ draws to be female. From the geometric CDF, with $m = 7$:

$$ P_1 = P(M \ge m) = 0.5^{m-1} = 0.5^{6} \approx 0.0156 $$

Thus the probability of needing $\ge 7$ trials under the null hypothesis is about 0.0156 (1.56%).

The second should be a binomial probability of $k$ successes in $N$ trials, with $k=0,1$ and $N=7$, given by

$$ P_2 = \sum_{k=0}^{1} \binom{N}{k} 0.5^k 0.5^{N-k} = \binom{7}{1} 0.5^7 + \binom{7}{0} 0.5^7 \approx 0.063 $$

where the probabilities are $p=q=0.5$ under the null hypothesis. Thus the probability of drawing $\le 1$ male mouse in a sample of 7 under the null hypothesis is about 0.063 (6.3%).

But these are far from the P-values in the article (0.0156 instead of 0.031, 0.063 instead of 0.227). Are these probabilities not the same as the author's intended p-values?
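
A minimal sketch for checking the two one-sided tail probabilities numerically, assuming $p = 0.5$ under the null hypothesis. The function names `geometric_tail` and `binomial_lower_tail` are my own, chosen for illustration; only the standard library is used.

```python
from math import comb

p = 0.5  # null hypothesis: male and female are equally likely


def geometric_tail(m, p):
    """P(first male appears on trial m or later).

    This is the probability that the first m - 1 draws are all female.
    """
    return (1 - p) ** (m - 1)


def binomial_lower_tail(k, n, p):
    """P(at most k males among n independent draws)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Design 1: sample until the first male; observed 7 draws in total.
p1 = geometric_tail(7, p)

# Design 2: sample exactly 7 mice; observed 1 male.
p2 = binomial_lower_tail(1, 7, p)

print(f"geometric tail P(M >= 7) = {p1}")
print(f"binomial tail  P(K <= 1) = {p2}")
```

Both quantities are one-sided: each counts only outcomes with as few or fewer males than observed.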

  • Although there are problems with the quotation, there are errors in your calculations, too. The p-value is not the chance of the outcome. It is the chance an outcome as or more "extreme" than the outcome, as explained at https://stats.stackexchange.com/questions/31. Also, in the second case the alternate hypothesis is two-sided rather than one-sided. – whuber Aug 26 '22 at 15:46
  • @whuber Thanks. For the first (geometric distribution), I think I'm in agreement with your note, since I added the cases of $m=7$ and $m>7$, the latter being negligible. I also checked against the geometric CDF to be sure. For the second (binomial), I'm a little new to hypothesis testing, so bear with me, but is it two-sided because it could be more or less than 1 success? I also added the probability of no successes to get the CDF for that one, so the total is then the probability of getting less than or equal to 1 success, which should be 6.3%, right? – Sam Gallagher Aug 26 '22 at 20:09
  • The hypothesis is that the M:F ratio is 1:1. The alternative could go in either direction. The p-value for observing one M and 6 Fs is the same as the p-value for observing one F and 6 Ms; both equal $1/8$ when you use the best possible test, which is based on the absolute difference between the number of M's and number of F's out of the 7 observations. – whuber Aug 26 '22 at 21:40
  • @whuber I've made some edits, but I don't fully understand the two-sided alternative hypothesis yet. The null hypothesis is equal probabilities of male and female, does the p-value (probability of obtaining a result "at least as extreme" as observed) depend on the alternative hypotheses as well as the null hypothesis? Because of what it means to be "more extreme"? – Sam Gallagher Aug 28 '22 at 14:56
  • Yes, the p-value depends on the alternate hypothesis. I explain this (in detail) at https://stats.stackexchange.com/a/130772/919. – whuber Aug 28 '22 at 15:00
  • @whuber From what I understand, because the expectation is 3.5 males, anything above or below is evidence against the null hypothesis, so without an explicit alternative (e.g. that the population favors females), I should take a p-value that includes both tails, even if we only observe deviation in one direction. Btw, your linked answer was a great read, but it makes it sound like Neyman-Pearson testing is the only option. Will N-P give different results than Fisher's tests of significance in this case, or am I safe to stick to Fisher's hypothesis tests? – Sam Gallagher Sep 01 '22 at 12:43
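
The two-sided p-value of $1/8$ mentioned in the comments can be reproduced with a short sketch. It assumes the test statistic is the absolute difference between the number of males and females, as described above, and that $p = 0.5$ under the null hypothesis. The function name `two_sided_binomial_p` is hypothetical, introduced only for this illustration.

```python
from math import comb


def two_sided_binomial_p(males, n, p=0.5):
    """Two-sided p-value using |#males - #females| as the test statistic.

    Sums the null probabilities of every outcome whose absolute difference
    is at least as large as the observed one. The symmetric statistic is
    only a natural choice when p = 0.5.
    """
    observed = abs(males - (n - males))
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if abs(k - (n - k)) >= observed
    )


# One male and six females out of seven mice.
p_two_sided = two_sided_binomial_p(1, 7)
print(p_two_sided)  # 0.125, i.e. 1/8
```

The outcomes counted are 0, 1, 6, and 7 males, so the result is twice the one-sided tail $P(K \le 1) = 8/128$.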
