Hypoexponential distribution is stuck because of $\lambda_i \neq \lambda_j$?

Question

The definition of the hypoexponential distribution (HD) requires that:

$$f(x)=\sum_i^d \left(\prod_{j=1,i\neq j}^{d}\frac{\lambda_j}{\lambda_j-\lambda_i}\right)\lambda_i e^{(-\lambda_ix)},\quad x>0 $$ is well-defined, which imposes that all $\lambda_i \neq \lambda_j$. As such, it seems to impose a significant limitation on the underlying exponentially distributed $X_i$'s in that if, say $X_i$ and $X_j$, have $\lambda_i=\lambda_j$, the distribution is no longer HD, neither can be Erlang, which requires that all $\lambda_i$ are the same!

I'm sure I'm missing something!

A second question: I tried to plot the HD using the syntax below for $\lambda = .1:20$ but not sure if it's the right syntax. Maybe you have experience this this R package ir can suggest another one for working with HD (generalized Erlang)?

library(sdprisk)
x <- seq(0.1, 20, 0.05)
y <- dhypoexp(x, rate = r, log = FALSE)
plot(x, y)

Take the limit as $\lambda_i\to\lambda_j$ to obtain a formula. You can do this using L'Hopital's Rule. Note, too, that because this distribution originates as a sum of independent Exponential distributions and those, in turn, are Gamma$(1)$ distributions, the approach through characteristic functions and partial fractions given at https://stats.stackexchange.com/questions/72479 will solve this problem. — whuber, Mar 02 '23 at 15:05
This function dhypoexp(x, rate = r, log = FALSE) seems ok, but maybe you can explain why you believe it is wrong. (btw, your example is missing how you define the rate r) — Sextus Empiricus, Mar 02 '23 at 16:18

Xi'an · Answer 1 · 2023-03-04T15:25:56.050

The distribution in question is a mixture of Exponential distributions $\mathcal E(\lambda_i)$ with specified weight, i.e. its density is $$f(x) = \sum_{i=1}^d p_i\,\lambda_i\exp\{-\lambda_i x\}\quad x\ge 0\tag{1}$$ with $$p_i=\prod_{j=1\\ j\ne i}^d\dfrac{\lambda_j}{\lambda_i-\lambda_j}\quad i=1,\ldots,d$$ There is therefore a single realisation produced by this density (rather than a sequence of $x_i$'s).

Note that, while $$\sum_{i=1}^d p_i= \sum_{i=1}^d \prod_{j=1\\ j\ne i}^d\dfrac{\lambda_j}{\lambda_i-\lambda_j}=1$$ (by the Cauchy determinant formula), the weights can be negative, which makes (1) a signed mixture.

Now, if $\lambda_1\approx\lambda_2$ (wlog), starting with the case $d=2$ leads to the Gamma $\mathcal G(2,\lambda_1)$ distribution: $$\lim_{\epsilon\to 0}\lambda_1(\lambda_1+\epsilon)\dfrac{e^{-\lambda_1x}-e^{-[\lambda_1+\epsilon]x}}{\epsilon}=\lambda_1^2xe^{-\lambda_1x}$$ $-$by L'Hospital's rule$-$as expected since this is the distribution of the sum two iid Exponential $\mathcal E(\lambda_1)$ random variates.

In the general case $d>2$, the undefined term in the density $$\lim_{\epsilon\to 0}\dfrac{e^{-\lambda_1x}}{\epsilon}\underbrace{\prod_{j=3}^d\dfrac{1}{\lambda_j-\lambda_1}}_\rho -\dfrac{e^{-[\lambda_1+\epsilon]x}}{\epsilon}\prod_{j=3}^d\dfrac{1}{\lambda_j-\lambda_1-\epsilon}$$ is equal to $$\lim_{\epsilon\to 0}\frac{\rho e^{-\lambda_1x}}{\epsilon\prod_{j=3}^d[\lambda_j-\lambda_1-\epsilon]} \left\{\prod_{j=3}^d[\lambda_j-\lambda_1-\epsilon]-e^{-\epsilon x}\rho^{-1}\right\} =\rho e^{-\lambda_1x}\left\{ x - \sum_{j=3}^d (\lambda_j-\lambda_1)^{-1}\right\}$$ by L'Hospital's rule.

The question concerns the case where this formula breaks down because one or more denominators are undefined. — whuber, Mar 02 '23 at 15:16

Sextus Empiricus · Accepted Answer · 2023-03-03T09:17:43.943

The definition of the hypoexponential distribution (HD) requires that:

$$f(x)=\sum_i^d \left(\prod_{j=1,i\neq j}^{d}\frac{\lambda_j}{\lambda_j-\lambda_i}\right)\lambda_i e^{(-\lambda_ix)},\quad x>0 $$

that expression is only a special case when $\lambda_i \neq \lambda_j$. It is not a requirement for the hypoexponential distribution in general.

The more general hypo-exponential distribution can be expressed as a phase-type distribution

$$f(x) = -\boldsymbol{\alpha}e^{x \boldsymbol{\Theta}} \boldsymbol{\Theta} \mathbf{1} $$

With $$\boldsymbol{\alpha} = (1,0,0,\dots,0,0)$$ and $$\boldsymbol{\Theta} = \begin{bmatrix} -\lambda_1 & \lambda_1 & 0 & \dots & 0 & 0 \\ 0 & -\lambda_2 & \lambda_2 & \dots & 0 & 0 \\ \vdots & \ddots & \ddots & \ddots & \ddots & \vdots\\ 0 & 0 & \ddots & -\lambda_{d-2} & \lambda_{d-2} & 0\\ 0 & 0 & \dots & 0 & -\lambda_{d-1} & \lambda_{d-1}\\ 0 & 0 & \dots & 0 & 0 & -\lambda_{d}\\ \end{bmatrix}$$

That involves the exponentiation of a matrix. And that can be approximated with a sum using a Taylor series.

You can see the matrix as modelling a sort of Markov chain process with non-discrete time steps. A sum of exponential distributed variables is like waiting for several consecutive transitions whose waiting time is each exponential distributed. Those transitions relate to the Markov chain.

Not only when $\lambda_i \neq \lambda_j$ but also when $\lambda_i = \lambda_j$ then the formula can give problems as demonstrated in this question: Why my cdf of the convolution of n exponential distribution is not in the range(0,1)?

Not only when $\lambda_i \neq \lambda_j$ but also when $\lambda_i \approx \lambda_j$, then the formula can give problems as demonstrated in this question: Why my cdf of the convolution of n exponential distribution is not in the range(0,1)? — Sextus Empiricus, Mar 02 '23 at 15:56
Indeed, that happened. Please define $\bf\Theta$ and $\bf\alpha$ for this case. — Oliver Amundsen, Mar 02 '23 at 15:57
And with $\bf{\alpha} = [\alpha_1, \alpha_2, \ldots \alpha_d]$ where each element is the participation of each $X_i$ in the hypoexponential, correct? e.g. $\alpha_i = 1/d$ — Oliver Amundsen, Mar 02 '23 at 18:02
@OliverAmundsen "the participation of each $X_i$" I am not sure what that means. For a hypo exponential distribution we start in the state 1 so $\boldsymbol{\alpha} = (1,0,0,\dots,0,0)$, the first $\alpha_1 =1$ and the others are zero. In terms of transitioning, the sum of exponential distributions is the waiting time until we have transitioned from the initial state to the final state. The total waiting time is the sum of waiting times for each individual transition. — Sextus Empiricus, Mar 02 '23 at 18:14

Hypoexponential distribution is stuck because of $\lambda_i \neq \lambda_j$?

2 Answers2