5

I have typically encountered gamma distributions used to model the response time after a certain event. As far as my statistics goes, that is their natural place. However, in a recent piece of work, I found that a gamma distribution models the number of order lines within a store order perfectly. Why is that?

Definitions for the general reader: when a customer places an order for multiple items of different sorts, say 2 pairs of socks of the same brand/color and 1 shirt, the order has two order lines, one for the 2 pairs of socks and one for the 1 shirt. A store typically places an order with tens, hundreds or even thousands of order lines to replenish its stock.

famargar
  • 851
  • 1
    Welcome to Cross Validated! As the number of order lines is discrete, a continuous Gamma random variable can hardly model it perfectly. – Scortchi - Reinstate Monica Jan 03 '17 at 16:34
  • Hi Scortchi. I clearly abused the word "perfectly" when describing the quality of the model. However, the gamma distribution does interpolate very well between the probabilities of the discrete values my variable takes, just as the normal distribution interpolates very well between the ball counts in each bin of a Galton machine. I wonder if my business problem can be translated into a more general problem where the Gamma is expected to work well. – famargar Jan 03 '17 at 16:40
  • 2
    The relevant chapter in McCullagh and Nelder's Generalized Linear Models is called "Models for data with constant coefficient of variation", so I suppose that is what you have. Of their three examples, two are times, as you mention, but the other is the cost of insurance claims in GBP. – mdewey Jan 03 '17 at 17:08
  • 2
    Well, suppose the true distribution were a negative binomial ... – Scortchi - Reinstate Monica Jan 03 '17 at 17:20

1 Answer

7

According to Wikipedia, "the negative binomial distribution is sometimes considered the discrete analogue of the Gamma distribution". (See also comment by Scortchi.)

It has similar interpretations to the Gamma distribution in terms of "waiting times".


Note that for a Gamma distribution with shape parameter $\alpha$ and rate parameter $\beta$, the mean and variance are

$$\mu=\frac{\alpha}{\beta}\,,\,\sigma^2=\frac{\alpha}{\beta^2}$$

while for a negative binomial distribution with success probability $p$ and number of failures $r$ (that is, counting the successes observed before the $r$-th failure), the mean and variance are

$$\mu=\frac{pr}{1-p}\,,\,\sigma^2=\frac{pr}{(1-p)^2}$$

Equating the two means and the two variances gives $\beta=\mu/\sigma^2$ and $\alpha=\beta\mu$, so

$$\alpha\approx{pr}\,,\,\beta\approx{1-p}$$

(Note, however, that this may not be a great approximation for all parameter values, so it would be better to estimate the negative binomial parameters from your data directly.)
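To see how close the moment-matched Gamma density sits to the negative binomial probabilities, here is a minimal Python sketch using only the standard library. The helper names (`nb_pmf`, `gamma_pdf`, `matched_gamma`) and the values $r=20$, $p=0.9$ are assumptions chosen for illustration; they do not come from the question's data.

```python
import math

def nb_pmf(k, r, p):
    """P(X = k): k successes before the r-th failure, success probability p."""
    log_coef = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coef + k * math.log(p) + r * math.log1p(-p))

def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and rate beta, for x > 0."""
    return math.exp(alpha * math.log(beta) + (alpha - 1) * math.log(x)
                    - beta * x - math.lgamma(alpha))

def matched_gamma(r, p):
    """Moment-match a Gamma to the negative binomial: alpha = p*r, beta = 1 - p."""
    return p * r, 1.0 - p

# Illustrative (hypothetical) parameter values, not fitted to any data.
r, p = 20, 0.9
alpha, beta = matched_gamma(r, p)

nb_mean = p * r / (1 - p)
nb_var = p * r / (1 - p) ** 2
print("means:    ", alpha / beta, nb_mean)
print("variances:", alpha / beta ** 2, nb_var)

# Compare the discrete probabilities with the continuous density at a few counts.
for k in (60, 120, 180, 240, 300):
    print(k, nb_pmf(k, r, p), gamma_pdf(k, alpha, beta))
```

The first two printed lines check that the means and variances agree. The loop shows how the Gamma density compares with the negative binomial probabilities at individual counts, which is the sense in which the Gamma "interpolates" a discrete variable.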

GeoMatt22
  • 12,950