Sum of values with different probabilities

Question

Suppose I have the following linear expression: $S = x_1 + x_2 + \dots + x_n$, in which each $x_i$ can only take the values -2, -1, 0, 1, 2, with probabilities 0.1, 0.2, 0.2, 0.25, 0.25 respectively. I want to know two things:

1 - Is there an analytical way to express how many combinations of $x_i$ make $S \geq k$ given the value of $n$?

2 - What is the probability $p(S \geq k)$?

Let's say the probabilities are 0.1, 0.2, 0.2, 0.25, 0.25 for the respective values -2, -1, 0, 1, 2 and $k=30$. — donut, Apr 21 '22 at 16:32
I am confused about what you are asking. If you are interested in counting the number of ways that $S \geq k$ given the possible levels of $x_i$, you do not need probabilities at all! If you are interested in the probability that $S \geq k$ then yes, you will certainly need a probability model. Please clarify what you are asking. — Galen, Apr 21 '22 at 16:35
For basic concepts see https://stats.stackexchange.com/questions/331973. For practical calculations see, for instance, our posts using the convolve function. Finally, once $n$ is greater than $8,$ approximately, consider a Normal approximation. — whuber, Apr 21 '22 at 17:31
For what it may be worth (e.g., checking an analytic answer), simple simulation for specific $n, k$ gives approximate answers. In R, the code set.seed(2022); n = 60; k = 30; s = replicate(10^6, sum(sample(-2:2, n, rep=T, c(2,4,4,5,5)))); mean(s >= k) returns $0.202463$, i.e. about $0.202$. A Normal approximation also gives about $0.202$. — BruceET, Apr 21 '22 at 21:51
@Bruce Convolution (via the FFT) will be more efficient than simulation until $n$ gets quite large: apart from floating point roundoff error, it gives exact answers, too. — whuber, Apr 22 '22 at 16:48

score 2 · Answer 1 · answered Jul 21 '23 at 03:00

The solution (assuming independence, which was not stated) can be found by numerical convolution; the R code below does this. When summing $N$ independent copies, the sum $S$ takes integer values from $-2N$ to $2N$.
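In symbols, one convolution step can be written as follows. This is a sketch under the same independence assumption; $p_v = P(x_i = v)$ and $S_N$ (the sum of $N$ terms) are notation introduced here, not taken from the question.

$$P(S_N = s) = \sum_{v=-2}^{2} p_v \, P(S_{N-1} = s - v), \qquad P(S_1 = s) = p_s,$$

with $P(S_{N-1} = s - v) = 0$ whenever $s - v$ lies outside $-2(N-1), \dots, 2(N-1)$. The tail probability asked for is then

$$P(S_N \geq k) = \sum_{s=k}^{2N} P(S_N = s).$$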

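p <- c(.1, .2, .2, .25, .25)  # P(x = -2), P(x = -1), ..., P(x = 2)
# Distribution of the sum of N independent copies, by repeated convolution.
conv_N <- function(x, N) {
  inter <- x
  for (j in seq_len(N - 1)) {
    # convolve(a, rev(b), type = "open") is the usual convolution of a and b
    inter <- convolve(inter, rev(x), type = "open")
  }
  return(inter)
} # end conv_N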
Example with N=5:
p_5 <- conv_N(p, 5) 
x_5 <- -10:10
cbind(x_5, p_5) |> round(3)
      x_5   p_5
 [1,] -10 0.000
 [2,]  -9 0.000
 [3,]  -8 0.001
 [4,]  -7 0.002
 [5,]  -6 0.005
 [6,]  -5 0.011
 [7,]  -4 0.022
 [8,]  -3 0.038
 [9,]  -2 0.060
[10,]  -1 0.085
[11,]   0 0.109
[12,]   1 0.126
[13,]   2 0.132
[14,]   3 0.124
[15,]   4 0.105
[16,]   5 0.079
[17,]   6 0.052
[18,]   7 0.029
[19,]   8 0.014
[20,]   9 0.005
[21,]  10 0.001
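To get from the convolved distribution to the two quantities in the question, one option is to sum the entries whose support value is at least $k$. The following is a minimal sketch, not part of the original answer: the helper names tail_prob, count_ge and normal_tail are hypothetical, independence is assumed, and the counting helper assumes $n$ is small enough that the FFT-based counts round to the correct integers.

```r
# Per-term probabilities for the values -2, -1, 0, 1, 2 (from the question).
p <- c(.1, .2, .2, .25, .25)

# Repeated convolution, restated from the answer so this block runs on its own.
conv_N <- function(x, N) {
  inter <- x
  for (j in seq_len(N - 1)) {
    inter <- convolve(inter, rev(x), type = "open")
  }
  inter
}

# Hypothetical helper: P(S >= k) for the sum of n independent terms.
tail_prob <- function(p, n, k) {
  s <- (-2 * n):(2 * n)          # support of S, in the same order as the pmf
  sum(conv_N(p, n)[s >= k])
}

# Hypothetical helper: number of value combinations (x_1, ..., x_n) with S >= k.
# Convolving a vector of ones counts sequences instead of weighting them.
# Assumption: n is small enough that the FFT-based counts round correctly.
count_ge <- function(n, k) {
  s <- (-2 * n):(2 * n)
  counts <- round(conv_N(rep(1, 5), n))
  sum(counts[s >= k])
}

# Normal approximation with continuity correction, as a rough cross-check.
normal_tail <- function(p, n, k) {
  v <- -2:2
  mu <- sum(v * p)
  sigma2 <- sum(v^2 * p) - mu^2
  pnorm(k - 0.5, mean = n * mu, sd = sqrt(n * sigma2), lower.tail = FALSE)
}

tail_prob(p, 60, 30)    # exact up to floating-point round-off
normal_tail(p, 60, 30)  # approximation
count_ge(5, 0)          # combinations of five terms with a nonnegative sum
```

For $n = 60$ and $k = 30$, the simulation quoted in the comments gave about $0.202$, so both probability calls should land close to that value. As noted in the comments, the count does not depend on the probabilities at all, which is why count_ge takes only $n$ and $k$.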
