
Let $X_1, X_2, ..., X_n$, $n=1,2,...$, be independent discrete random variables.

I need to find the distribution of their sum:

$p(k) =P(X_1 + X_2 + ... + X_n = k), k=0, 1, 2, ... $

I solved the case $n=3$, where the random variables all take the same values $x=\{0,1,2\}$ with probabilities $p =\{0.85, 0.1, 0.05\}$, by applying the convolution twice.

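x <- x0 <- 0:2                 # values of the random variable
p <- p0 <- c(0.85, 0.1, 0.05)  # corresponding probabilities

n <- 2  # number of convolutions; two convolutions give the sum of three variables

for (i in 1:n) {
  p1 <- outer(p0, p)           # joint probabilities of (new variable, running sum)
  d1 <- outer(x0, x, "+")      # value of the sum for each pair
  z1 <- tapply(p1, d1, sum)    # add up the probabilities of equal sums
  p <- z1
  x <- as.integer(names(z1))
}
z1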

       0        1        2        3        4        5        6
0.614125 0.216750 0.133875 0.026500 0.007875 0.000750 0.000125

Question. I am looking for a function or package for obtaining the density of a sum of independent discrete random variables.

Nick
  • What do you know about the distributions of the variables besides that they are discrete? – Tim Mar 25 '22 at 08:03
  • They are independent. Each one is just a small table of values and probabilities. – Nick Mar 25 '22 at 08:06
  • The probability function (not density!) of the sum is the convolution of their probability functions. When the values lie on a lattice (e.g., they are all integers), this is efficiently computed using the Fast Fourier Transform, which is implemented in R as fft. Look also at the convolve function. See, inter alia, https://stats.stackexchange.com/questions/41251, https://stats.stackexchange.com/questions/41247, https://stats.stackexchange.com/questions/191193, and https://stats.stackexchange.com/questions/5347. – whuber Mar 25 '22 at 14:43
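
As a rough illustration of the convolution approach mentioned in the comment above, here is a minimal sketch for the example in the question using base R's `convolve`. It assumes every variable takes values on the lattice 0, 1, 2, ..., so that the position in the probability vector encodes the value. The helper name `pmf_sum_fft` is made up for this example.

```r
# Sketch only: assumes each pmf is a vector whose i-th entry is P(X = i - 1).
# pmf_sum_fft is a hypothetical helper name, not part of any package.
pmf_sum_fft <- function(pmfs) {
  # convolve(a, rev(b), type = "open") is the usual convolution (see ?convolve).
  # It is computed with fft(), so expect tiny rounding errors in the result.
  Reduce(function(a, b) convolve(a, rev(b), type = "open"), pmfs)
}

p0 <- c(0.85, 0.1, 0.05)                 # P(X = 0), P(X = 1), P(X = 2)
p_sum <- pmf_sum_fft(rep(list(p0), 3))   # three i.i.d. copies, as in the question
names(p_sum) <- seq_along(p_sum) - 1     # values 0..6 of the sum
round(p_sum, 6)
```

Passing a list of different probability vectors instead of `rep(list(p0), 3)` would handle variables that are not identically distributed, as long as they share the same lattice starting at 0.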

1 Answer


There are many kinds of discrete distributions. The result is simple for some of them: for example, with independent $X_i \sim \mathcal{Pois}(\lambda_i)$ we know that $\sum_i X_i \sim \mathcal{Pois}(\sum_i \lambda_i)$. For others, it might be more complicated.
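
The Poisson case can be checked numerically. The sketch below is not a proof: it convolves two Poisson probability functions truncated at an arbitrary point `K` and compares the result with the Poisson probability function for the summed rate. The rates and `K` are illustrative values chosen for this example.

```r
# Numerical check (not a proof): convolve two Poisson pmfs on 0..K.
lambda1 <- 1.5   # arbitrary illustrative rates
lambda2 <- 2.5
K <- 60          # largest value of the sum that is checked

p1 <- dpois(0:K, lambda1)
p2 <- dpois(0:K, lambda2)

# P(X1 + X2 = k) = sum_{j = 0}^{k} P(X1 = j) * P(X2 = k - j)
p_conv <- sapply(0:K, function(k) sum(p1[1:(k + 1)] * p2[(k + 1):1]))

# Largest absolute difference from the Pois(lambda1 + lambda2) pmf
max(abs(p_conv - dpois(0:K, lambda1 + lambda2)))
```

For each k up to `K` the sum only involves terms inside the truncated range, so any difference should be down to floating-point rounding.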

You seem to be referring to the categorical distribution. With two categories it is a Bernoulli distribution, and the sum of i.i.d. Bernoulli variables follows a binomial distribution. If they are independent but not identically distributed, it quickly gets pretty nasty and results in a Poisson-binomial distribution. Because of that, in many cases we would rather approximate the distribution than calculate the probabilities directly. The sum of i.i.d. categorical variables (encoded as indicator vectors) results in a multinomial distribution, and if they are not identically distributed it is a Poisson-multinomial distribution (see e.g. Lin, Wang, and Hong, 2022), which again is pretty complicated. Fortunately, there seems to be an R package implementing it that exposes both direct and approximate methods for calculating it (I have never used it, so don't take this as a recommendation).
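
For a small number of variables, the Poisson-binomial probabilities can still be computed directly by repeated convolution. Below is a minimal sketch in base R. The helper name `poisbinom_pmf` and the probabilities in `q` are made up for illustration, and this is not the method used by the package mentioned above.

```r
# Sketch only: pmf of a sum of independent Bernoulli(q_i) variables with
# different q_i (a Poisson-binomial distribution), by repeated convolution.
# poisbinom_pmf is a hypothetical helper name, not a function from any package.
poisbinom_pmf <- function(q) {
  pmf <- 1                                # pmf of the empty sum: P(S = 0) = 1
  for (qi in q) {
    # Adding X_i either keeps the sum (X_i = 0) or shifts it by one (X_i = 1)
    pmf <- c(pmf * (1 - qi), 0) + c(0, pmf * qi)
  }
  names(pmf) <- seq_along(pmf) - 1        # values 0..length(q)
  pmf
}

q <- c(0.1, 0.4, 0.7)                     # arbitrary illustrative probabilities
poisbinom_pmf(q)

# Sanity check: with equal probabilities it reduces to the binomial pmf
all.equal(unname(poisbinom_pmf(rep(0.3, 5))), dbinom(0:5, 5, 0.3))
```

Each step costs time proportional to the current length of the vector, so the whole computation is quadratic in the number of variables. That is fine for small problems, and approximations become attractive for large ones.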

Tim