
If I roll a 6-sided die x times (or roll x 6-sided dice at once), what is the probability that the sum of the result is greater than another y rolls (or one roll of y dice)?

  • Parts of the solution are given at https://stats.stackexchange.com/questions/291549, https://stats.stackexchange.com/questions/392943, and https://stats.stackexchange.com/questions/3614. https://stats.stackexchange.com/a/116913/919 includes software to (easily and naturally) compute the answer. Once x and y grow large (around 10 or so), Normal approximations work well (provided some care is taken also to estimate the chance of a tie). – whuber Oct 09 '21 at 15:08
  • Note that for fair six-sided dice, if $V$ is the outcome on one die, $7-V$ has the same probability distribution as $V$. You can use this to get all the dice terms on one side of the inequality. Let $S_x$ and $T_y$ be the totals for the x and y dice respectively. Then $P(S_x>T_y) = P(S_x>7y-T_y) = P(S_x+T_y>7y)$. That is, it's the same as the probability that the total on $x+y$ dice exceeds $7y$. So for example, the chance that a sum on 3 dice exceeds a sum on 2 dice is the same as the chance that a sum on 5 dice exceeds 14. ... ctd – Glen_b Oct 09 '21 at 23:35
  • ... ctd ... Now the probabilities of sums on small numbers of dice are easy to generate (e.g. in a spreadsheet, such as described here: https://stats.stackexchange.com/questions/3614/how-to-easily-determine-the-results-distribution-for-multiple-dice/3625#3625). This is fairly quick and easy up to about 10 total dice or so. Beyond that a normal approximation will usually be adequate, but some of the methods whuber points to are better if you need good accuracy further into the tails. – Glen_b Oct 09 '21 at 23:37
  • Thank you that helped me very much. I guess using Troll software to determine the probability distribution of sum of dice is the best way to go. Very interesting that probability of x dice exceeds y dice is the same as x+y dice exceeds 7y, I would never have guessed that – Fabian Allendorf Oct 10 '21 at 11:27
  • You misread Glen_b's comment: please distinguish "$T_y$" from "$y.$" – whuber Mar 27 '24 at 16:38
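
The identity in Glen_b's comment can be checked by brute force for the 3-versus-2 example. This is only a minimal sketch: the names `rolls`, `Sx`, `Ty`, `p_direct` and `p_shifted` are made up for illustration, and it simply enumerates all $6^5$ equally likely outcomes.

```r
# Enumerate every outcome of five fair six-sided dice (6^5 = 7776 rows).
rolls <- expand.grid(rep(list(1:6), 5))

Sx <- rowSums(rolls[, 1:3])  # total on the x = 3 dice
Ty <- rowSums(rolls[, 4:5])  # total on the y = 2 dice

# Direct comparison: P(S_x > T_y)
p_direct <- mean(Sx > Ty)

# Glen_b's reformulation: P(total on x + y dice > 7y), with y = 2
p_shifted <- mean(Sx + Ty > 7 * 2)

print(c(direct = p_direct, shifted = p_shifted))
```

The two events are not the same set of outcomes; they only have the same probability, because replacing each of the y dice by 7 minus its value leaves the distribution unchanged.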

1 Answer

To give an answer, here is some R code that does the calculation both exactly and with a normal approximation with continuity correction.

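# Column d of the returned matrix holds the distribution of the sum of d dice:
# probmatrix[k, d] = P(sum of d dice = k).
probdice <- function(numberdice, sides=6){
  probmatrix <- matrix(0, ncol=numberdice, nrow=numberdice*sides)
  probmatrix[1:sides, 1] <- 1 / sides
  # seq_len(numberdice)[-1] is empty when numberdice is 1 (2:1 would count down)
  for (d in seq_len(numberdice)[-1]){
    for (s in 1:sides){
      # add face s to every possible total of the first d-1 dice
      probmatrix[(d-1+s):((d-1)*sides+s), d] <- 
          probmatrix[(d-1+s):((d-1)*sides+s), d] + 
          probmatrix[(d-1):((d-1)*sides), d-1] / sides
    }
  }
  return(probmatrix)
}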
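
For comparison, the same recursion can be written with a single vector instead of a matrix. This is a sketch under my own naming, not part of the answer: `dice_sum_pmf` is a hypothetical helper that keeps only the distribution for the current number of dice.

```r
# Hypothetical helper (not from the answer above): distribution of the sum of
# n fair dice, built by adding one die at a time.
# Entry k of the result is P(sum = k); entries below n are zero.
dice_sum_pmf <- function(n, sides = 6) {
  pmf <- rep(1 / sides, sides)            # one die: totals 1..sides
  for (d in seq_len(n - 1)) {             # empty loop when n is 1
    new_pmf <- numeric(length(pmf) + sides)
    for (s in 1:sides) {
      idx <- seq_along(pmf) + s           # old total k moves to k + s
      new_pmf[idx] <- new_pmf[idx] + pmf / sides
    }
    pmf <- new_pmf
  }
  pmf
}

two_dice <- dice_sum_pmf(2)
print(two_dice[7])           # chance that two dice total 7
print(sum(dice_sum_pmf(5)))  # probabilities should add up to 1
```

Keeping only one vector uses less memory than the full matrix when just the final column is needed.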

probXgtY <- function(X, Y, sides=6){
  distXplusY <- probdice(X+Y, sides)[, X+Y]
  totals <- 1:((X+Y)*sides)
  # sum(X) > sum(Y) has the same probability as the total on X+Y dice
  # exceeding (sides+1)*Y, which is 7*Y for six-sided dice
  threshold <- (sides+1) * Y
  XgtY <- c(sum(distXplusY[totals > threshold]),
            sum(distXplusY[totals == threshold]),
            sum(distXplusY[totals < threshold]))
  names(XgtY) <- c("P[sum(X)>sum(Y)]", "P[sum(X)=sum(Y)]", "P[sum(X)<sum(Y)]")
  return(XgtY)
}

pnormalapprox <- function(X, Y, sides=6){
  meandiff <- (X-Y) * (sides+1) / 2
  sddiff <- sqrt((X+Y) * (sides^2-1) / 12)
  PXgtY <- c(1 - pnorm(1/2, meandiff, sddiff),
             pnorm(1/2, meandiff, sddiff) - pnorm(-1/2, meandiff, sddiff),
             pnorm(-1/2, meandiff, sddiff))
  names(PXgtY) <- c("P[sum(X)>sum(Y)]","P[sum(X)=sum(Y)]","P[sum(X)<sum(Y)]")
  return(PXgtY)
}
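
The quantities in `pnormalapprox` come from the mean and variance of a single fair die. As a sketch of the reasoning, write $s$ for the number of sides, $D$ for the difference between the two totals, and $\Phi$ for the standard normal distribution function (these symbols are my notation, not the answer's):

$$E[\text{one die}] = \frac{s+1}{2}, \qquad \operatorname{Var}[\text{one die}] = \frac{s^2-1}{12}$$

$$\mu = E[D] = (X-Y)\,\frac{s+1}{2}, \qquad \sigma^2 = \operatorname{Var}[D] = (X+Y)\,\frac{s^2-1}{12}$$

Since $D$ only takes integer values, the continuity correction puts the cut points at $\pm\tfrac12$:

$$P(D>0) \approx 1-\Phi\!\left(\frac{\tfrac12-\mu}{\sigma}\right), \qquad P(D=0) \approx \Phi\!\left(\frac{\tfrac12-\mu}{\sigma}\right)-\Phi\!\left(\frac{-\tfrac12-\mu}{\sigma}\right), \qquad P(D<0) \approx \Phi\!\left(\frac{-\tfrac12-\mu}{\sigma}\right)$$

The variances add rather than subtract because the two sets of dice are independent.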

Trying with $X=51$ dice and $Y=49$ dice gives these probabilities. The normal approximation with continuity correction is close.

probXgtY(51,49)
# P[sum(X)>sum(Y)] P[sum(X)=sum(Y)] P[sum(X)<sum(Y)] 
#       0.64805704       0.02145462       0.33048834

pnormalapprox(51,49)
# P[sum(X)>sum(Y)] P[sum(X)=sum(Y)] P[sum(X)<sum(Y)] 
#       0.64825034       0.02147506       0.33027460

With small numbers of dice, the normal approximation is not bad, for example with $X=3,Y=2$:

probXgtY(3,2)
# P[sum(X)>sum(Y)] P[sum(X)=sum(Y)] P[sum(X)<sum(Y)] 
#       0.77854938       0.06944444       0.15200617

pnormalapprox(3,2)
# P[sum(X)>sum(Y)] P[sum(X)=sum(Y)] P[sum(X)<sum(Y)] 
#       0.78394450       0.06860851       0.14744699
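
A quick simulation is another way to sanity-check the $X=3,Y=2$ figures. This is a Monte Carlo sketch, a different technique from the exact and normal calculations in this answer; the seed, the number of simulated rolls and the helper `sum_dice` are arbitrary choices of mine.

```r
set.seed(1)     # arbitrary seed, only so the run is repeatable
n_sim <- 1e5    # number of simulated comparisons (an arbitrary choice)

# Hypothetical helper: n_sim simulated totals of k fair six-sided dice
sum_dice <- function(k) {
  rowSums(matrix(sample(1:6, n_sim * k, replace = TRUE), nrow = n_sim))
}

Sx <- sum_dice(3)   # totals on 3 dice
Ty <- sum_dice(2)   # totals on 2 dice

# Estimated probabilities of win, tie and loss for the 3-dice side
est <- c(gt = mean(Sx > Ty), eq = mean(Sx == Ty), lt = mean(Sx < Ty))
print(est)
```

With this many rolls the estimates should land within a few thousandths of the exact values, which is enough to catch a gross error but not to judge the normal approximation.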

In more extreme cases, in the tail of the distribution, the normal approximation is good in absolute terms but can be poor in relative terms, for example with $X=40,Y=80$:

probXgtY(40,80)
# P[sum(X)>sum(Y)] P[sum(X)=sum(Y)] P[sum(X)<sum(Y)] 
#     6.444190e-15     3.694881e-15     1.000000e+00

pnormalapprox(40,80)
# P[sum(X)>sum(Y)] P[sum(X)=sum(Y)] P[sum(X)<sum(Y)] 
#     2.953193e-14     1.487699e-14     1.000000e+00
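
To put numbers on "absolute" versus "relative", here is a small sketch that just does arithmetic on the values printed for $X=40,Y=80$; the variable names are mine.

```r
# Values copied from the output for X = 40, Y = 80
exact         <- c(gt = 6.444190e-15, eq = 3.694881e-15)
normal_approx <- c(gt = 2.953193e-14, eq = 1.487699e-14)

abs_err <- normal_approx - exact   # absolute error: tiny
ratio   <- normal_approx / exact   # relative size: far from 1

print(abs_err)
print(ratio)
```

The absolute errors are of the order of $10^{-14}$, but the approximation overstates these tail probabilities by a factor of roughly 4 to 4.6.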

Henry
  • Alternatively, you could use the code I posted at https://stats.stackexchange.com/a/116913/919 to compute (say) mean(d(40,6) > d(80, 6)) or mean(d(40,6) == d(80, 6)) or whatever: that's fairly convenient. – whuber Mar 27 '24 at 16:45