I understand that as n approaches infinity, the binomial distribution approaches a Poisson distribution. What about the normal distribution? I googled and found that if n approaches infinity and p and q are not close to either 0 or 1, then the binomial follows a normal distribution. But if p is large, np should also be infinite. So it seems to me that the normal distribution can only approximate the binomial distribution when n, p, and q are all finite. Is my reasoning correct?
-
"if p is large np should also be infinite"—why do you believe that? – Arya McCarthy Apr 03 '22 at 23:02
-
let's say p=0.1, if n is infinite then n*0.1 = infinite – Ahmed Abdullah Apr 03 '22 at 23:06
-
What if $p$ approaches zero as $n$ approaches infinity? – Dave Apr 03 '22 at 23:09
-
Nobody's saying that $n$ is infinite. It's approaching infinity, in the limit. – Arya McCarthy Apr 03 '22 at 23:13
-
@Dave In that case, what is the difference between the Poisson and the normal? – Ahmed Abdullah Apr 03 '22 at 23:16
-
You might want to look at https://stats.stackexchange.com/questions/519029/does-a-binomial-converge-to-poisson-or-normal – Henry Apr 03 '22 at 23:34
-
@Henry I have looked at the question you mentioned. If np is small but not infinitesimal (smaller than 4), the binomial distribution is asymmetrical. Even if n is very, very large, I don't understand how it will look like a bell curve. So my question is: why does the CLT fail here? – Ahmed Abdullah Apr 04 '22 at 02:00
-
Also relevant ... https://stats.stackexchange.com/questions/90422/poisson-vs-binomial-for-rare-events – Glen_b Apr 04 '22 at 06:01
-
Your reasoning is fundamentally flawed. "It" has multiple meanings here, because you are asking about two (radically) different sequences of random variables. In the first case the properties of those variables change with $n,$ whereas in the second case (Central Limit Theorem) the properties remain fixed. The moral here is that it's crucial to learn the context and assumptions that lead to any general conclusion. – whuber Apr 04 '22 at 19:33
1 Answer
Traditionally, normal approximations have been used to get serviceable answers to such problems as finding $P(X \le 7),$ where $X \sim \mathsf{Binom}(n = 30, p = 1/6).$ Extensive tables of binomial CDFs are relatively rare, and using the binomial PMF formula to find the necessary eight terms seems excessively tedious.
Several rules of thumb have been proposed to limit the use of normal approximations to situations in which something like two-place accuracy is possible. One of the most common guidelines is that $\min[np, n(1-p)] \ge 5,$ which helps to ensure that the approximating normal distribution puts most of its probability on $(0,n).$ It is not always mentioned that approximations tend to be best if $p$ is near $1/2,$ so that the binomial distribution is nearly symmetrical.
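The rule of thumb above is easy to check mechanically. The following is only a minimal sketch: `normal_approx_ok` and its `threshold` argument are hypothetical names introduced here for illustration, and the small tolerance is an assumption added to avoid floating-point trouble when $np$ sits exactly on the boundary, as it does in the example with $n = 30,\, p = 1/6.$

```r
# Hypothetical helper (not part of base R or of the original answer):
# checks the common guideline min(np, n(1-p)) >= 5.
normal_approx_ok <- function(n, p, threshold = 5) {
  # Tiny tolerance so that boundary cases such as 30 * (1/6) are not
  # rejected because of floating-point rounding.
  min(n * p, n * (1 - p)) >= threshold - 1e-9
}

normal_approx_ok(30, 1/6)    # np = 5, n(1-p) = 25: meets the guideline
normal_approx_ok(30, 0.05)   # np = 1.5: does not meet the guideline
```

Passing the check says nothing about symmetry, so a case that barely meets the guideline with $p$ far from $1/2$ may still be approximated only roughly.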
The normal approximation for the specific problem above is as follows:
$$P(X \le 7) = P(X < 7.5) = P\left( \frac{X-np} { \sqrt{np(1-p)} } < \frac{7.5 - 5}{ \sqrt{25/6} } = 1.225\right)\\ \approx P(Z < 1.225) = 0.8897,$$
where $Z$ is standard normal, the first step is known as a continuity correction, and the last would require interpolation, using most printed standard normal CDF tables.
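With software, no table interpolation is needed. As a minimal sketch (the variable names are chosen here for illustration), the continuity-corrected normal approximation can be evaluated with `pnorm`, either on the standardized value or directly on the original scale:

```r
n <- 30; p <- 1/6
mu    <- n * p                      # binomial mean, 5
sigma <- sqrt(n * p * (1 - p))      # binomial SD, sqrt(25/6)

# Standardize the continuity-corrected cutoff 7.5, then use the
# standard normal CDF.
z <- (7.5 - mu) / sigma
approx_z <- pnorm(z)

# Equivalent: pass the mean and SD to pnorm directly.
approx_direct <- pnorm(7.5, mu, sigma)

round(c(z = z, approx = approx_z), 4)
```

Both forms should agree with the value $0.8897$ obtained above, up to rounding.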
In R, the exact probability is easy to compute: $P(X \le 7) = 0.8863.$
pbinom(7, 30, 1/6)
[1] 0.8863132
In the figure below, the exact probability is the sum of the heights of the vertical bars to the left of the dotted vertical line. The normal approximation is the area under the density curve to the left of that line.
R code for figure:
x = 0:30; pdf = dbinom(x, 30, 1/6)
hdr = "BINOM(30, 1/6) with Normal Approx"
plot(x, pdf, type="h", lwd=3, col="blue", main=hdr)
abline(h=0, col="green2")
abline(v=0, col="green2")
abline(v=7.5, col="orange", lwd=2, lty="dotted")
curve(dnorm(x, 5, sqrt(25/6)), add=TRUE, lwd=2, col="brown")
Note: A Poisson approximation to a binomial probability is often useful when $n$ is large, $p$ is small, and the Poisson mean $\lambda = np$ is of moderate size. In the current example, $n$ is not large enough for a good Poisson approximation: $P(Y \le 7) = 0.8666,$ where $Y\sim\mathsf{Pois}(\lambda = 5),$ compared with the exact binomial value $0.8863.$
ppois(7, 5)
[1] 0.8666283
k = 0:7; sum(exp(-5)*5^k/factorial(k))
[1] 0.8666283
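The two approximations correspond to two different ways of letting $n$ grow, which can be illustrated numerically. The sketch below is only an illustration under assumptions chosen here: the grids of $n$ values are arbitrary, and `max_norm_err` is a hypothetical helper, not a standard function.

```r
# Regime 1: n grows while the mean np = 5 stays fixed, so p = 5/n shrinks.
# The binomial CDF at 7 should move toward the Poisson(5) value.
n_vals   <- c(30, 100, 1000, 10000)
pois_err <- abs(pbinom(7, n_vals, 5 / n_vals) - ppois(7, 5))
round(pois_err, 4)

# Regime 2: n grows while p = 1/6 stays fixed, so np grows without bound.
# max_norm_err (hypothetical helper) returns the largest gap between the
# binomial CDF and the continuity-corrected normal CDF over x = 0, ..., n.
max_norm_err <- function(n, p) {
  x <- 0:n
  max(abs(pbinom(x, n, p) - pnorm(x + 0.5, n * p, sqrt(n * p * (1 - p)))))
}
norm_err <- sapply(c(30, 300, 3000), max_norm_err, p = 1/6)
round(norm_err, 4)
```

In both regimes the error should shrink as $n$ increases, but for different sequences of binomial distributions: fixed $np$ for the Poisson limit, fixed $p$ for the normal limit.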
