
I've always wondered how good a 'fit' the Poisson distribution is to the events we observe in reality. I've almost always seen it used to model the occurrence of events (for example, the arrival of cars at a parking garage, or the number of messages sent/received by computer hosts on a network).

We usually model such events with the Poisson distribution. Is the distribution just a good first approximation to how things happen in reality? If I observe the number of cars/day or messages/day in the two examples above and compare them with counts drawn from the distribution, how much do they differ? How good an approximation is the Poisson? (Is it an approximation at all?) What is the 'magic' behind the Poisson that makes it get things right (intuitively speaking :)?

PhD
    There are some good starting points if you google "derivation of Poisson distribution"; they show how the Poisson is magically derived from the binomial distribution when n is large and the chance of an event is small. From there it starts to make sense to use it to model count events. The question, I guess, is how well real count events match that smooth extension of the binomial situation. – Peter Ellis Nov 20 '12 at 03:40
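The binomial-to-Poisson limit mentioned in the comment above can be checked numerically. This is only a rough sketch: the mean `lam = 3.0`, the values of `n`, and the helper names are illustrative assumptions, not anything from the thread. It holds the mean n·p fixed and measures the largest pointwise gap between the two probability mass functions as n grows.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def max_abs_gap(n, lam=3.0, k_max=20):
    """Largest |binomial - Poisson| pmf difference over k = 0..min(k_max, n),
    with the binomial success probability chosen so that n * p = lam."""
    p = lam / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(min(k_max, n) + 1)
    )

# Illustrative values of n; the gap should shrink as n grows and p = lam / n shrinks.
for n in (10, 100, 1000, 10000):
    print(n, max_abs_gap(n))
```

The gap shrinking with n is what makes "many independent opportunities, each with a small chance" a natural story for Poisson counts. Whether real data follow that story is a separate question.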

4 Answers


One example I can speak for is supermarket sales of Consumer Packaged Goods (CPG). These are also count events - the supermarket may sell 0 units a day, or 1, or 2 and so on, so the Poisson distribution seems like a good first fit.

However, the underlying binomial setup that @PeterEllis notes does not hold. Yes, we may be able to model the number of customers with a binomial, but some customers will buy 1 unit, some will buy 2 units, and some will load their pantries and buy 10 units.

The result will usually be overdispersed (the variance exceeds the mean), so that a negative binomial distribution fits much better than a Poisson one. (Occasionally, we may even see underdispersion for very fast-moving items like milk.)
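To make the overdispersion point concrete, here is a small simulation sketch. Everything numeric in it is a made-up assumption for illustration (4 buyers per day on average, basket sizes of 1, 2 or 10 units with the weights shown), and the Poisson sampler is Knuth's simple multiplication method rather than anything a real forecasting system would use. It compares the variance-to-mean ratio of the number of buyers with that of the units sold.

```python
import math
import random
import statistics

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method
    (adequate for small lam, which is all this sketch needs)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def dispersion_index(counts):
    """Variance-to-mean ratio; close to 1 for Poisson-like counts."""
    return statistics.pvariance(counts) / statistics.fmean(counts)

rng = random.Random(0)
days = 5000
buyers_per_day = 4.0               # hypothetical average number of buyers
basket_sizes = [1, 2, 10]          # hypothetical units per buyer
basket_weights = [0.7, 0.25, 0.05]

# Number of buyers per day: Poisson by construction.
buyers = [sample_poisson(buyers_per_day, rng) for _ in range(days)]
# Units sold per day: each buyer takes a random basket size.
units = [sum(rng.choices(basket_sizes, basket_weights, k=b)) for b in buyers]

print("buyers dispersion:", dispersion_index(buyers))
print("units dispersion: ", dispersion_index(units))
```

Under these assumptions the buyer counts stay near a ratio of 1, while the unit counts come out clearly above 1, which is the pattern that pushes one towards a negative binomial or similar model.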

Stephan Kolassa
    +1. Just thought it was worth mentioning that the Poisson is a special case of the Negative Binomial and that one way of deriving the negative binomial is as a mixture of many different Poisson distributions with different means. – David J. Harris Nov 20 '12 at 21:15

If the things being counted are independent of each other and the rate is constant (or follows a model, as in Poisson regression), then the Poisson distribution will generally hold quite well. Examples like cars arriving at a garage tend to work fairly well over periods of time in which the rate is fairly constant; pooling rush hour and the middle of the night for a garage frequented by 9-to-5 workers would not work well. What time you arrive at the garage will have little or no influence on what time I arrive. There are exceptions, however: if two people arrange to meet at a given time, they are likely to arrive closer together, and if one follows the other, they will be even closer. Also, things like a nearby traffic light could cause clumps in the arrivals that would not match a Poisson.
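The constant-rate condition can be illustrated with a quick simulation. This is a sketch under invented numbers: arrivals are generated with independent exponential gaps, the "constant" garage sees 11 cars per hour throughout, and the "mixed" garage alternates between a hypothetical rush-hour rate of 20 and a night rate of 2 (same overall average). The function and variable names are mine.

```python
import random
import statistics

def count_arrivals(rate, rng):
    """Count arrivals in one unit-length window when the gaps between
    arrivals are independent exponentials with the given rate."""
    t, n = rng.expovariate(rate), 0
    while t < 1.0:
        n += 1
        t += rng.expovariate(rate)
    return n

def dispersion_index(counts):
    """Variance-to-mean ratio; close to 1 for Poisson-like counts."""
    return statistics.pvariance(counts) / statistics.fmean(counts)

rng = random.Random(1)
windows = 5000

# Constant rate in every window (hypothetical: 11 cars per hour).
constant = [count_arrivals(11.0, rng) for _ in range(windows)]
# Rush hour and night pooled together (hypothetical rates 20 and 2).
mixed = [count_arrivals(20.0 if i % 2 == 0 else 2.0, rng) for i in range(windows)]

print("constant-rate dispersion:", dispersion_index(constant))
print("pooled-rate dispersion:  ", dispersion_index(mixed))
```

With a constant rate the hourly counts have variance roughly equal to their mean, as a Poisson should. Pooling the two regimes keeps the same average but inflates the variance, so a single Poisson no longer describes the counts.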

If you want to compare a specific dataset to see if the Poisson is a good match then you can use a hanging rootogram.
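For readers who have not met a hanging rootogram: the bars of sqrt(observed frequency) hang down from the curve of sqrt(expected frequency), so the bottom of each bar sits at sqrt(expected) minus sqrt(observed), and a good fit keeps those values near zero. Below is a minimal text-only sketch of the numbers behind such a plot, assuming the Poisson mean is estimated by the sample mean. The function name and the sample data are hypothetical, and a real analysis would normally use a plotting package.

```python
import math
from collections import Counter

def hanging_rootogram(counts):
    """Return rows (k, observed, expected, sqrt(expected) - sqrt(observed))
    for a Poisson fit whose mean is the sample mean of `counts`."""
    n = len(counts)
    lam = sum(counts) / n                  # maximum-likelihood Poisson mean
    observed = Counter(counts)
    rows = []
    for k in range(max(counts) + 1):
        expected = n * math.exp(-lam) * lam**k / math.factorial(k)
        gap = math.sqrt(expected) - math.sqrt(observed[k])
        rows.append((k, observed[k], expected, gap))
    return rows

# Hypothetical daily counts, made up purely to exercise the function.
sample = [0, 1, 1, 2, 2, 2, 3, 3, 4, 7]
for k, obs, exp_k, gap in hanging_rootogram(sample):
    print(f"{k:2d}  obs={obs:3d}  exp={exp_k:6.2f}  sqrt(exp)-sqrt(obs)={gap:+.2f}")
```

Negative gaps mark counts that occur more often than the fitted Poisson expects, and positive gaps mark counts that occur less often. Working on the square-root scale keeps the small frequencies in the tails visible.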

Greg Snow