
Let X denote the number of successes in n independent trials, where p is the probability of success on each trial.

Then X ~ B(n, p), with EX = np and VarX = npq, where q = 1 - p.

If n is very large (e.g. n > 20000), it also seems reasonable to take the expected number of successes, λ = np, and model X as Poisson(λ).

Doing so, we find EX = VarX = λ.

The two models provide the same expected value but different variances. I understand that the Binomial approach is the correct one.

The question is:

Is the Poisson model the wrong approach for this problem?

or:

Why does the Poisson approach give an "inflated" estimate of the variance?

1 Answer


What matters isn't whether $n$ is large; it's whether $p$ is small.

The Binomial approach is correct, but the Poisson is a good approximation when $p$ is small. Consider the variances: if $p$ is small, then $q=1-p\approx 1$, so $\lambda=np\approx npq$. The approximation gets better as $p$ gets smaller. For example, if $p=0.001$ then $q=0.999$, so the Poisson variance $np$ exceeds the Binomial variance $npq$ by only 0.1%.
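
To make the variance comparison concrete, here is a minimal sketch that tabulates both variances as $p$ shrinks. The value $n=20000$ is borrowed from the question, and the helper name `variances` is purely illustrative.

```python
# Compare the Binomial variance npq with the Poisson variance np as p shrinks.
n = 20000  # number of trials, borrowed from the question's example


def variances(n, p):
    """Return (Binomial variance npq, Poisson variance np)."""
    q = 1.0 - p
    return n * p * q, n * p


for p in (0.5, 0.1, 0.01, 0.001):
    var_binom, var_pois = variances(n, p)
    # The relative gap (np - npq) / np simplifies to p, whatever n is.
    rel_gap = (var_pois - var_binom) / var_pois
    print(f"p={p:<6} binomial={var_binom:10.3f} "
          f"poisson={var_pois:10.3f} relative gap={rel_gap:.4f}")
```

The relative gap equals $p$ and does not depend on $n$, which is why the Poisson variance always looks slightly inflated and why that inflation vanishes only as $p$ gets small.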

The same general result is true for a lot of other properties apart from the variance, but the proof is more complicated.
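
One such property, given here as a standard textbook fact rather than as part of the original answer, is the probability mass function itself. If $n\to\infty$ while $np=\lambda$ is held fixed, then for each fixed $k$

$$\binom{n}{k}p^k(1-p)^{n-k} \;\longrightarrow\; \frac{e^{-\lambda}\lambda^k}{k!},$$

because $\binom{n}{k}p^k\to\lambda^k/k!$ and $(1-\lambda/n)^{n-k}\to e^{-\lambda}$.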

Thomas Lumley
  • The summary at the outset is incorrect. It doesn't really matter whether $p$ is small: it matters, as your subsequent analysis shows, whether $np$ is small. – whuber Jun 22 '22 at 12:05
  • No, if $p$ is small and $n=1$, the approximation of Bernoulli($p$) by Poisson($p$) is excellent. The total variation distance is bounded by $\min\{p,np^2\}$. – Thomas Lumley Jun 23 '22 at 04:35
  • That's because in that case $np \approx p.$ The point is not whether it works, but the reasoning behind why it works. (The key concept is that the Poisson approximates Binomial distributions asymptotically as $n$ grows large and $np$ remains bounded.) Informing readers about such basic principles enables them to correctly identify and apply a solution in future problems. – whuber Jun 23 '22 at 12:14
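
The total variation bound mentioned in the comments, $\min\{p, np^2\}$, can be checked numerically. The following is a rough sketch using only the Python standard library. The function names are illustrative and not from the thread, the distance is taken as half the sum of absolute differences between the two probability mass functions, and $0<p<1$ is assumed.

```python
import math


def binom_pmf(k, n, p):
    """Binomial(n, p) probability of k successes, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def poisson_pmf(k, lam):
    """Poisson(lam) probability of the value k, computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def total_variation(n, p):
    """Total variation distance between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p
    # Absolute differences over the Binomial support 0..n.
    diff = sum(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
               for k in range(n + 1))
    # Poisson mass above n has no Binomial counterpart.
    tail = max(0.0, 1.0 - sum(poisson_pmf(k, lam) for k in range(n + 1)))
    return 0.5 * (diff + tail)


for n, p in [(1, 0.01), (20, 0.3), (1000, 0.001)]:
    tv = total_variation(n, p)
    bound = min(p, n * p * p)
    print(f"n={n:<5} p={p:<6} np={n * p:<6.3g} TV={tv:.6f} bound={bound:.6f}")
```

The three cases cover both points raised in the comments: $n=1$ with small $p$, a moderate $n$ with $p$ that is not small, and a large $n$ with bounded $np$.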