2

I understand that if I sum an ever larger number of iid random variables with finite variance, the standardised sum converges in distribution to a standard normal. This is the central limit theorem (CLT).

I also understand that the sum of independent Poisson random variables is again Poisson, with a rate parameter equal to the sum of the rate parameters of the summands.

Prima facie, these two facts seem to contradict each other. However, in the CLT we add iid variables whose distribution is fixed a priori. The example case here is the sum of $n$ iid Bernoulli($p$) variables. For the CLT, $n \to \infty$ while $p$ remains constant. For the Poisson limit, $n \to \infty$ and $p \to 0$ in such a way that $np \to \lambda$.

Real-life examples, however, are more like snapshots that do not tell us how the limits are approached. For example, I might have 20,000 samples with $p(X=1) = 0.00005$, or 100 samples with $p(X=1) = 0.1$, and so on.
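One way to make the two snapshots concrete is to measure how far the exact Binomial($n$, $p$) distribution is from each candidate approximation. The sketch below is only an illustration, not a rule for choosing a model: it computes the total variation distance to Poisson($np$) and to a normal with matching mean and variance. The helper names (`binom_pmf`, `tv_distances`, and so on) are made up for this example, and binning the normal onto the integers $0, \dots, n$ with a continuity correction is an assumption of the sketch.

```python
import math

def binom_pmf(k, n, p):
    # Binomial pmf computed on the log scale to avoid overflow for large n.
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def poisson_pmf(k, lam):
    # Poisson pmf, also on the log scale.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_bin_mass(k, n, mu, sigma):
    # Assumption of this sketch: the normal is binned onto 0..n with a
    # continuity correction, and both tails are folded into the end bins.
    lower = 0.0 if k == 0 else normal_cdf((k - 0.5 - mu) / sigma)
    upper = 1.0 if k == n else normal_cdf((k + 0.5 - mu) / sigma)
    return upper - lower

def tv_distances(n, p):
    """Return (TV to Poisson(np), TV to binned normal) for Binomial(n, p)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    diff_pois = diff_norm = pois_mass = 0.0
    for k in range(n + 1):
        b = binom_pmf(k, n, p)
        q = poisson_pmf(k, mu)
        diff_pois += abs(b - q)
        pois_mass += q
        diff_norm += abs(b - normal_bin_mass(k, n, mu, sigma))
    # Poisson mass above n has no binomial counterpart, so it counts in full.
    diff_pois += max(0.0, 1.0 - pois_mass)
    return 0.5 * diff_pois, 0.5 * diff_norm

# The two snapshots from the question.
for n, p in [(20000, 0.00005), (100, 0.1)]:
    tv_pois, tv_norm = tv_distances(n, p)
    print(f"n={n}, p={p}: TV to Poisson = {tv_pois:.5f}, TV to normal = {tv_norm:.5f}")
```

A smaller distance means the approximation reproduces the binomial probabilities more closely, so the output gives a rough sense of which approximation fits a given $(n, p)$ pair better.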

My question is: if all we have is a value of $n$ and $p$, when should we model the system as Poisson, and when is it more prudent to model it as normal?

2 Answers

3

The apparent contradiction is not there, because a Poisson random variable with high mean can be closely approximated by a normal distribution! Just try some simulations and see for yourself ... and/or see Normal approximation to the Poisson distribution.
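As an alternative to simulation, the quality of the approximation can be computed exactly. The sketch below measures the largest gap between the Poisson($\lambda$) CDF and the CDF of a normal with mean $\lambda$ and variance $\lambda$, evaluated at $k + 0.5$ (a continuity correction). The function name `max_cdf_gap`, the truncation point, and the choice of rates are assumptions made for this illustration.

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def max_cdf_gap(lam):
    """Largest |P(X <= k) - Phi((k + 0.5 - lam) / sqrt(lam))| for X ~ Poisson(lam)."""
    sigma = math.sqrt(lam)
    # Assumption: truncating about 10 standard deviations above the mean
    # leaves only a negligible tail for the rates used here.
    k_max = int(lam + 10.0 * sigma + 10.0)
    cdf = 0.0
    gap = 0.0
    for k in range(k_max + 1):
        # Poisson pmf on the log scale to stay stable for large lam.
        cdf += math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        gap = max(gap, abs(cdf - normal_cdf((k + 0.5 - lam) / sigma)))
    return gap

for lam in [1, 10, 100, 1000]:
    print(f"lambda={lam}: max CDF gap = {max_cdf_gap(lam):.4f}")
```

The gap shrinks as the rate grows, which is the sense in which a Poisson with a high mean is close to a normal.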

As for the modeling question, see for instance Goodness of fit and which model to choose linear regression or Poisson

0

If your data are counts, choose a model appropriate for count data, such as Poisson regression. If the outcome is continuous, use a model appropriate for that, such as linear regression. Linear regression is strictly the wrong model for counts (it can predict negative counts), but it can nonetheless be useful in some situations because of its simplicity.
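The point about negative predictions can be seen on a small example. The sketch below fits an ordinary least squares line to made-up count data (the values in `ys` are hypothetical and chosen only for illustration) and prints the fitted values; `fit_line` is a helper written for this example, not a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical counts that decay towards zero as x grows.
xs = list(range(10))
ys = [20, 12, 7, 5, 3, 2, 1, 1, 0, 0]

intercept, slope = fit_line(xs, ys)
fitted = [intercept + slope * x for x in xs]

for x, y, f in zip(xs, ys, fitted):
    print(f"x={x}: observed count={y}, linear fit={f:.2f}")
```

Although every observed count is non-negative, the fitted line drops below zero at the largest values of `x`. A Poisson regression with a log link avoids this, because its predicted mean is always positive.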

Tim