I know the probability function for the beta distribution is $$p(x=k)=\frac{\prod_{i=1}^{k-1}(1-u+(i-1)\theta)}{\prod_{i=1}^{k}(1+(i-1)\theta)}$$ However I am unsure of how to derive the formula for the likelihood function of a data set with a beta geometric distribution. Can someone explain please?
If you have a dataset $(x_1,\ldots,x_n)$ the likelihood is $$\prod_{i=1}^n p(X=x_i)$$ – Xi'an Feb 17 '15 at 07:42
1 Answer
The formula you quote is close to the pmf of the beta-geometric distribution (not the beta distribution), which is a geometric distribution whose success probability is not constant but instead follows a beta distribution. The formula uses a reparametrization of the beta parameters $(\alpha, \beta)$ into $u = \frac{\alpha}{\alpha+\beta}$, interpreted as the mean parameter, and $\theta = \frac{1}{\alpha+\beta}$, a shape parameter.
The formula would then be:
$$P(X=k)=\frac{u\prod_{i=1}^{k-1}(1-u+(i-1)\theta)}{\prod_{i=1}^{k}(1+(i-1)\theta)}$$
Note that it is not the same as yours. For further reference you can check "Parameter estimation of beta-geometric model with applications to human fecundability data" by Singh and Pudir.
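As a quick sanity check of the reparametrization, here is a minimal Python sketch (stdlib only, function names are my own) that compares the product form above with the Beta-function form of the pmf, assuming the support starts at $k = 1$ so that $P(X=1) = \alpha/(\alpha+\beta) = u$:

```python
import math


def pmf_beta_form(k, alpha, beta):
    """P(X = k) = B(alpha + 1, beta + k - 1) / B(alpha, beta), k = 1, 2, ..."""
    log_b = lambda a, b: math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp(log_b(alpha + 1, beta + k - 1) - log_b(alpha, beta))


def pmf_product_form(k, u, theta):
    """Same pmf in the (u, theta) parametrization: u = a/(a+b), theta = 1/(a+b)."""
    num = u * math.prod(1 - u + (i - 1) * theta for i in range(1, k))
    den = math.prod(1 + (i - 1) * theta for i in range(1, k + 1))
    return num / den


alpha, beta = 2.0, 3.0
u, theta = alpha / (alpha + beta), 1.0 / (alpha + beta)
# the two forms agree on the first few support points
for k in range(1, 10):
    assert abs(pmf_beta_form(k, alpha, beta) - pmf_product_form(k, u, theta)) < 1e-12
```

The agreement follows by expanding the Beta functions into products of Gamma-function ratios and dividing each factor by $\alpha+\beta$.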
The same paper describes how this is obtained, but it does not spell out all the reasoning, so I will fill in the steps here to give you a complete answer.
In order to build the likelihood function, they do not use the $(u, \theta)$ parametrization, but the $(\alpha, \beta)$ one. The formula they use is:
$$ P(X=k) = \frac{B(\alpha+1,\,\beta+k-1)}{B(\alpha,\beta)}$$
In order to get the likelihood function you simply regard $\alpha$ and $\beta$ as the unknown variables and $x$ as fixed at its observed value. So it is the same as:
$$L(\alpha,\beta\mid x)=\frac{B(\alpha+1,\,x+\beta-1)}{B(\alpha,\beta)}$$
Given a dataset $\mathcal{D} = \{x_1,x_2,\ldots,x_n\}$ of observations, your question is how to combine the values of the dataset in the likelihood function. The usual assumption in this situation is that the observations were drawn independently from the same distribution (i.i.d.). Independence means that the joint probability of the dataset is the product of the individual probabilities.
So
$$ L(\alpha,\beta\mid\mathcal{D}) = \prod_{i=1}^{n}\frac{B(\alpha+1,\, x_i+\beta-1)}{B(\alpha,\beta)}$$
The maximum likelihood approach finds the parameter values at which the likelihood function attains its maximum. Because it is more tractable, you can take the natural logarithm of the likelihood function, since that turns the products and fractions into sums and differences and usually makes the formulas simpler (it works because the natural logarithm is a monotonically increasing function of its argument, so it preserves the location of the maximum).
$$ LL(\alpha,\beta\mid\mathcal{D}) = \sum_{i=1}^{n}\log B(\alpha+1,\,x_i+\beta-1) - n \log B(\alpha,\beta) $$
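This log-likelihood is easy to evaluate directly. A minimal stdlib-only sketch (my own function names, using `math.lgamma` for the log-Beta function and assuming the support convention $k = 1, 2, \ldots$ as above); the final assertion confirms that exponentiating the log-likelihood recovers the product of the individual pmf values:

```python
import math


def log_beta(a, b):
    # log B(a, b) via log-gamma, numerically stable for large arguments
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def pmf(k, alpha, beta):
    # P(X = k) = B(alpha + 1, k + beta - 1) / B(alpha, beta), k = 1, 2, ...
    return math.exp(log_beta(alpha + 1, k + beta - 1) - log_beta(alpha, beta))


def log_likelihood(alpha, beta, data):
    # sum_i log B(alpha + 1, x_i + beta - 1)  -  n log B(alpha, beta)
    return (sum(log_beta(alpha + 1, x + beta - 1) for x in data)
            - len(data) * log_beta(alpha, beta))


data = [1, 2, 2, 4, 1, 3]
ll = log_likelihood(2.0, 3.0, data)
assert abs(math.exp(ll) - math.prod(pmf(x, 2.0, 3.0) for x in data)) < 1e-12
```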
A necessary condition for a maximum is that the first partial derivatives are zero. So the next step is to take the partial derivatives with respect to $\alpha$ and $\beta$, set both equal to $0$, and solve the resulting system of equations for the parameter estimates. This system has no closed-form solution, so in practice it is solved numerically.
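Since the score equations must be solved numerically anyway, an alternative is to maximize the log-likelihood directly. The sketch below (stdlib only; the simulation setup and the crude grid-refinement optimizer are my own illustration, not the method used in the paper) simulates beta-geometric data and finds approximate maximum likelihood estimates:

```python
import math
import random


def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def neg_log_lik(alpha, beta, data):
    # negative log-likelihood; guard against invalid parameter values
    if alpha <= 0 or beta <= 0:
        return float("inf")
    return -(sum(log_beta(alpha + 1, x + beta - 1) for x in data)
             - len(data) * log_beta(alpha, beta))


# simulate beta-geometric data: p_i ~ Beta(alpha, beta), x_i = trials to first success
random.seed(0)
true_alpha, true_beta = 2.0, 3.0
data = []
for _ in range(500):
    p = random.betavariate(true_alpha, true_beta)
    k = 1
    while random.random() > p:
        k += 1
    data.append(k)

# crude alternating grid refinement in place of solving the score equations
best = (1.0, 1.0)
step = 1.0
for _ in range(30):
    a0, b0 = best
    candidates = [(a0 + da * step, b0 + db * step)
                  for da in (-1, 0, 1) for db in (-1, 0, 1)]
    best = min(candidates, key=lambda ab: neg_log_lik(ab[0], ab[1], data))
    if best == (a0, b0):
        step /= 2  # no neighbor improves: shrink the search step

alpha_hat, beta_hat = best
```

In practice you would use a proper optimizer (e.g. Newton-Raphson on the score equations, as papers on this model typically do), but the grid refinement keeps the example dependency-free.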
Hope it helps.