Ten tosses of a coin. Test $H_0: p = 1/2$ against $H_a: p \ne 1/2.$ Comment at the start: there is not much information in only ten tosses of a coin, so in order to reject $H_0$ we will have to observe very few heads (0 or 1) or very many (9 or 10).
Normal approximation: Under $H_0,$ the number $X$ of heads seen in $n = 10$ independent tosses has $X \sim\mathsf{Binom}(n=10,p=1/2),$ which
has $\mu = E(X) = np = 5$ and $\sigma = \sqrt{np(1-p)} = \sqrt{2.5} = 1.581139.$ Then $Z = \frac{X-\mu}{\sigma} \stackrel{aprx}{\sim} \mathsf{Norm}(0,1).$
So we reject $H_0$ at about the 5% level by
rejecting for $|Z| \ge 1.96.$
Example: Suppose you observe $x = 3$ heads in $n = 10$ tosses.
Then $|Z|=\left|\frac{3-5}{1.5811}\right| = |-1.265| < 1.96,$ so you do not
have sufficient evidence to reject $H_0$ at the 5% level of significance.
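The arithmetic of the approximate test can be reproduced in R. This is only a minimal sketch of the computation described above; the variable names (`p0`, `crit`, `p_approx`, and so on) are chosen here for illustration, and no continuity correction is applied, in keeping with the text.

```r
# Approximate (normal) two-sided test of H0: p = 1/2 with n = 10 tosses.
n  <- 10
p0 <- 0.5
x  <- 3                            # observed number of heads

mu    <- n * p0                    # mean under H0: 5
sigma <- sqrt(n * p0 * (1 - p0))   # SD under H0: sqrt(2.5)
z     <- (x - mu) / sigma          # about -1.265
crit  <- qnorm(0.975)              # two-sided 5% critical value, about 1.96

reject   <- abs(z) >= crit         # FALSE for x = 3
p_approx <- 2 * pnorm(-abs(z))     # approximate two-sided P-value
```

Because `abs(z)` is below `crit`, `reject` is `FALSE`, which matches the conclusion above.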
Exact binomial test. This two-sided test rejects $H_0$ when
$X$ is sufficiently far from the expected value $\mu=5$ under $H_0.$
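For an observed value $x < n/2,$ the P-value is $P(X \le x)+P(X \ge n-x)$; by the symmetry of $\mathsf{Binom}(10, 1/2),$ an observed $x > n/2$ is handled by swapping $x$ and $n-x.$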
Example: Same as above: $x = 3.$
We seek $P(X \le 3) + P(X \ge 7) = 0.3438 > 0.05 = 5\%,$ so we do
not reject $H_0$ at the 5% level of significance.
sum(dbinom(c(0:3, 7:10), 10, .5))
[1] 0.34375
This exact binomial test is implemented in R as 'binom.test', which gives the same P-value $0.3438$ that we obtained from the binomial distribution above.
binom.test(x=3, n=10, p=.5)
Exact binomial test
data: 3 and 10
number of successes = 3, number of trials = 10, p-value = 0.3438
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
0.06673951 0.65245285
sample estimates:
probability of success
0.3
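As a rough check that the tail-sum P-value and `binom.test` agree beyond the single case $x = 3,$ one might compare them for every $x$ below $n/2.$ This is a sketch under the assumption $p = 1/2$ (where the two tails are symmetric); the names `p_tail` and `p_r` are illustrative.

```r
# Compare P(X <= x) + P(X >= n - x) with the P-value reported by binom.test,
# for x = 0, ..., 4 under H0: p = 1/2.
n <- 10
x <- 0:4

p_tail <- sapply(x, function(k) {
  # P(X >= n - k) equals P(X > n - k - 1), the upper tail from pbinom
  pbinom(k, n, 0.5) + pbinom(n - k - 1, n, 0.5, lower.tail = FALSE)
})
p_r <- sapply(x, function(k) binom.test(k, n, p = 0.5)$p.value)

all.equal(p_tail, p_r)
```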
Notes: (a) Back to the comment at the beginning. For the exact
binomial test, the rejection region for $H_0: p=.5$ against
$H_a: p \ne .5$ is to observe $0,1,9,$ or $10$ heads, so
this is really a test at about the 2% level. (Including $2$ and $8$ in the rejection region would make it a test at about the 11% level.) Because of the discreteness of binomial distributions, a straightforward test at the 5% level is not
possible.
sum(dbinom(c(0,1,9,10), 10, .5))
[1] 0.02148437
sum(dbinom(c(0,1,2,8,9,10), 10, .5))
[1] 0.109375
With this rejection region, the power to detect that a coin with P(H) = 0.3 is biased
is only about 15%.
sum(dbinom(c(0,1,9,10), 10, .3))
[1] 0.149452
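The same calculation can be repeated for other values of P(H) to see how the power changes. The following is a small sketch; the helper `power_at` and the grid `p_alt` are hypothetical names introduced only for this illustration.

```r
# Power of the exact test with rejection region {0, 1, 9, 10}
# at several true values of p = P(H).
rej <- c(0, 1, 9, 10)
power_at <- function(p, n = 10) sum(dbinom(rej, n, p))

p_alt <- c(0.1, 0.2, 0.3, 0.4, 0.5)
pw <- sapply(p_alt, power_at)
round(pw, 4)
```

At $p = 0.5$ the value returned is the significance level (about 2%), and at $p = 0.3$ it is the power of about 15% computed above.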
(b) There are two difficulties with an approximate normal test in this situation. (i) $n = 10$ is not quite large enough
to guarantee a good approximation to binomial probabilities.
(ii) The approximate test may make it appear that a test at the 5% level is possible, but with $n=10, p=0.5$ the only attainable values of $|Z|$ near
1.96 are 1.897 (for $x = 2$ or $8$) and 2.530 (for $x = 1$ or $9$). Rejecting for $|Z| \ge 1.96$ therefore means rejecting for $X \in \{0,1,9,10\},$ so the actual significance level
is closer to 2% than 5%.
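Point (ii) can be checked by listing every value $Z$ can take when $n = 10.$ This is a sketch only, with illustrative variable names (`rej_x`, `alpha_actual`):

```r
# All attainable values of Z under H0 and the actual level of the
# rule "reject when |Z| >= 1.96".
n  <- 10
p0 <- 0.5
x  <- 0:n
z  <- (x - n * p0) / sqrt(n * p0 * (1 - p0))

sort(unique(round(abs(z), 3)))               # distinct attainable |Z| values

rej_x <- x[abs(z) >= qnorm(0.975)]           # x-values that lead to rejection
alpha_actual <- sum(dbinom(rej_x, n, p0))    # true level of the approximate test
```

Here `rej_x` contains 0, 1, 9, and 10, so `alpha_actual` equals the 0.02148437 found for the exact test.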