19

Let $x_i$ be independent Bernoulli random variables with success probabilities $p_i$. That is, $x_i=1$ with probability $p_i$ and $x_i=0$ with probability $1-p_i$.

Is there a closed expression or an approximate formula for the distribution of the sum $\sum_i x_i$?

whuber
a06e
  • 4
    If the $p_i$ are very small, you can use Poisson approximation. Let $X_i\sim \mbox{Be}(p_i)$ be independent and let $Y\sim\mbox{Po}(\lambda)$ with $\lambda=\sum_{i=1}^np_i$. In a classic paper by Hodges and Le Cam it is shown that $|\mbox{P}(\sum_{i=1}^n X_i\leq x)-\mbox{P}(Y\leq x)|=3\cdot (\max_{1\leq i\leq n}p_i)^{1/3}.$ If the $p_i$ are all close to 0, this difference is small. – MånsT Apr 15 '14 at 13:15
  • 1
    In addition to the duplicate, solutions appear at http://stats.stackexchange.com/questions/41247 (computational methods) and http://stats.stackexchange.com/questions/5347 (approximations for large numbers of variables). – whuber Apr 15 '14 at 16:59
  • 1
    @MånsT Hodges and Le Cam's result you state is incorrect. The equality is less than or equal to!!. – Chamberlain Mbah Oct 19 '17 at 20:38
  • @Chamberlain: you are absolutely right! I can't edit it now though, as my comment is too old. – MånsT Oct 20 '17 at 06:23
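
The corrected Hodges and Le Cam bound from the comments above can be checked numerically. The following is a minimal sketch, not a reference implementation: the helper names `exact_pmf` and `poisson_cdf_gap` and the probabilities in `probs` are illustrative assumptions. It computes the largest gap between the exact CDF of the sum and the Poisson CDF with $\lambda=\sum_i p_i$, then compares it with $3\cdot(\max_i p_i)^{1/3}$.

```python
import math

def exact_pmf(probs):
    """Exact pmf of a sum of independent Bernoulli(p_i), built one variable at a time."""
    pmf = [1.0]  # the empty sum equals 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this variable is 0
            nxt[k + 1] += mass * p       # this variable is 1
        pmf = nxt
    return pmf

def poisson_cdf_gap(probs):
    """Largest |P(sum <= k) - P(Y <= k)| over k = 0..n, with Y ~ Poisson(sum of p_i)."""
    lam = sum(probs)
    exact_cdf = 0.0
    pois_cdf = 0.0
    pois_term = math.exp(-lam)  # P(Y = 0)
    gap = 0.0
    for k, q in enumerate(exact_pmf(probs)):
        exact_cdf += q
        pois_cdf += pois_term
        gap = max(gap, abs(exact_cdf - pois_cdf))
        pois_term *= lam / (k + 1)  # P(Y = k + 1) from P(Y = k)
    return gap

# Illustrative (assumed) small success probabilities, 100 variables in total.
probs = [0.01, 0.02, 0.005, 0.03, 0.015] * 20
gap = poisson_cdf_gap(probs)
bound = 3.0 * max(probs) ** (1.0 / 3.0)
print(f"observed gap = {gap:.4f}, bound = {bound:.4f}")
```

Checking $k=0,\dots,n$ is enough because beyond $n$ the exact CDF equals 1 and the Poisson CDF only moves closer to 1. In this example the observed gap sits far below the bound, which suggests the bound is conservative for such inputs.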

2 Answers

19

Yes, in fact, the distribution is known as the Poisson binomial distribution, which is a generalization of the binomial distribution. The distribution's mean and variance are intuitive and are given by

$$ \begin{align} E\left[\sum_i x_i\right] &= \sum_i E[x_i] = \sum_i p_i\\ V\left[\sum_i x_i\right] &= \sum_i V[x_i] = \sum_i p_i(1-p_i). \end{align} $$

The mean follows from linearity of expectation, which holds with or without independence. The variance of the sum equals the sum of the variances because the $x_i$ are independent, and each $x_i$ has variance $p_i(1-p_i)$.
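
For a moderate number of variables, the full distribution can also be computed exactly by adding one variable at a time. This is a minimal sketch under stated assumptions: the function name `poisson_binomial_pmf` and the example probabilities are illustrative, and the quadratic-time recursion is only one of several ways to evaluate the distribution. It also confirms the mean and variance formulas above on a small example.

```python
def poisson_binomial_pmf(probs):
    """Return [P(S=0), ..., P(S=n)] for S = sum of independent Bernoulli(p_i)."""
    pmf = [1.0]  # distribution of the empty sum: S = 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # x_i = 0 leaves the count unchanged
            nxt[k + 1] += mass * p       # x_i = 1 raises the count by one
        pmf = nxt
    return pmf

# Illustrative (assumed) success probabilities.
probs = [0.1, 0.5, 0.9]
pmf = poisson_binomial_pmf(probs)

# Moments computed from the pmf, to compare with the closed-form expressions.
mean = sum(k * q for k, q in enumerate(pmf))
var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))

print(pmf)
print(mean, sum(probs))                      # both should agree
print(var, sum(p * (1 - p) for p in probs))  # both should agree
```

The recursion takes on the order of $n^2$ operations for $n$ variables, so it is practical for hundreds or a few thousand terms. Larger problems may call for the computational methods linked in the comments on the question.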

ramhiser
  • 2
    And if you sum up "enough" of those 0/1 random variables, you can approximate the resulting poisson binomial with a normal distribution with the same mean and variance. https://math.stackexchange.com/questions/2375886/approximating-poisson-binomial-distribution-with-normal-distribution – qkhhly Oct 28 '20 at 22:48
-1

I'm not aware of a closed-form formula. If $n$ is large, you can apply the Central Limit Theorem and approximate the distribution of the sum with a normal distribution whose mean is $\sum_i p_i$ and whose variance is $\sum_i p_i(1-p_i)$.

  • 5
    The CLT often fails (that is, does not even apply) in this circumstance unless the $p_i$ remain away from $0$ and $1$. – whuber Apr 15 '14 at 17:01
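
The caveat in the comment above can be illustrated numerically. This is a rough sketch, assuming two illustrative sets of probabilities and a continuity-corrected normal CDF; the helper names `exact_cdf` and `normal_cdf_gap` are hypothetical. It compares the exact CDF of the sum with the normal approximation when all $p_i=0.5$ and when all $p_i=0.01$, using $n=100$ in both cases.

```python
import math

def exact_cdf(probs):
    """Exact CDF values P(S <= k), k = 0..n, for a sum of independent Bernoulli(p_i)."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    cdf, total = [], 0.0
    for q in pmf:
        total += q
        cdf.append(total)
    return cdf

def normal_cdf_gap(probs):
    """Largest gap between the exact CDF and a continuity-corrected normal CDF."""
    mu = sum(probs)
    sigma = math.sqrt(sum(p * (1.0 - p) for p in probs))
    gap = 0.0
    for k, c in enumerate(exact_cdf(probs)):
        z = (k + 0.5 - mu) / sigma                       # continuity correction
        phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
        gap = max(gap, abs(c - phi))
    return gap

# Illustrative (assumed) inputs: same n, very different success probabilities.
gap_balanced = normal_cdf_gap([0.5] * 100)   # p_i well away from 0 and 1
gap_rare = normal_cdf_gap([0.01] * 100)      # p_i close to 0, total variance below 1
print(f"p_i = 0.5:  max CDF gap = {gap_balanced:.4f}")
print(f"p_i = 0.01: max CDF gap = {gap_rare:.4f}")
```

With $p_i=0.01$ the total variance stays below 1 and the sum is strongly skewed, so the normal curve fits poorly even though $n=100$. In that regime the Poisson approximation mentioned in the comments on the question is likely the better choice.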