The following answer is a bit boring but seems to be the only one to date that contains the genuinely exact answer! Normal approximation or simulation or even just crunching the exact answer numerically to a reasonable level of accuracy, which doesn't take long, are probably the better way to go - but if you want the "mathematical" way of getting the exact answer, then:
Let $X$ denote the sum of dots we see in $100$ die rolls, with probability mass function $p_X(x)$.
Let $Y$ denote the number of heads in $600$ coin flips, with probability mass function $p_Y(y)$.
We seek $P(X > Y) = P(X - Y > 0) = P(D > 0)$ where $D = X - Y$ is the difference between sum of dots and number of heads.
Let $Z = -Y$, with probability mass function $p_Z(z) = p_Y(-z)$. Then the difference $D = X - Y$ can be rewritten as a sum $D = X + Z$ which means, since $X$ and $Z$ are independent, we can find the probability mass function of $D$ by taking the discrete convolution of the PMFs of $X$ and $Z$:
$$p_D(d) = \Pr(X + Z = d) = \sum_{k =-\infty}^{\infty} \Pr(X = k \cap Z = d - k) = \sum_{k =-\infty}^{\infty} p_X(k) p_Z(d-k) $$
In practice the sum only needs to be done over values of $k$ for which the probabilities are non-zero, of course. The idea here is exactly what @IlmariKaronen has done, I just wanted to write up the mathematical basis for it.
Now I haven't said how to find the PMF of $X$, which is left as an exercise, but note that if $X_1, X_2, \dots, X_{100}$ are the number of dots on each of 100 independent dice rolls, each with discrete uniform PMFs on $\{1, 2, 3, 4, 5, 6\}$, then $X = X_1 + X_2 + \dots + X_{100}$ and so...
# Store the PMFs of variables as dataframes with "value" and "prob" columns.
# Important the values are consecutive and ascending for consistency when convolving,
# so include intermediate values with probability 0 if needed!
# Function to check if dataframe conforms to above definition of PMF
# Use message_intro to explain what check is failing
is.pmf <- function(x, message_intro = "") {
if(!is.data.frame(x)) {stop(paste0(message_intro, "Not a dataframe"))}
if(!nrow(x) > 0) {stop(paste0(message_intro, "Dataframe has no rows"))}
if(!"value" %in% colnames(x)) {stop(paste0(message_intro, "No 'value' column"))}
if(!"prob" %in% colnames(x)) {stop(paste0(message_intro, "No 'prob' column"))}
if(!is.numeric(x$value)) {stop(paste0(message_intro, "'value' column not numeric"))}
if(!all(is.finite(x$value))) {stop(paste0(message_intro, "Does 'value' contain NA, Inf, NaN etc?"))}
if(!all(diff(x$value) == 1)) {stop(paste0(message_intro, "'value' not consecutive and ascending"))}
if(!is.numeric(x$prob)) {stop(paste0(message_intro, "'prob' column not numeric"))}
if(!all(is.finite(x$prob))) {stop(paste0(message_intro, "Does 'prob' contain NA, Inf, NaN etc?"))}
if(!all.equal(sum(x$prob), 1)) {stop(paste0(message_intro, "'prob' column does not sum to 1"))}
return(TRUE)
}
# Function to convolve PMFs of x and y
# Note that to convolve in R we need to reverse the second vector
# name1 and name2 are used in error reporting for the two inputs
convolve.pmf <- function(x, y, name1 = "x", name2 = "y") {
is.pmf(x, message_intro = paste0("Checking ", name1, " is valid PMF: "))
is.pmf(y, message_intro = paste0("Checking ", name2, " is valid PMF: "))
x_plus_y <- data.frame(
value = seq(from = min(x$value) + min(y$value),
to = max(x$value) + max(y$value),
by = 1),
prob = convolve(x$prob, rev(y$prob), type = "open")
)
return(x_plus_y)
}
# Let x_i be the score on individual dice throw i
# Note PMF of x_i is the same for each i=1 to i=100)
x_i <- data.frame(
value = 1:6,
prob = rep(1/6, 6)
)
# Let t_i be the total of x_1, x_2, ..., x_i
# We'll store the PMFs of t_1, t_2... in a list
t_i <- list()
t_i[[1]] <- x_i #t_1 is just x_1 so has same PMF
# PMF of t_i is convolution of PMFs of t_(i-1) and x_i
for (i in 2:100) {
t_i[[i]] <- convolve.pmf(t_i[[i-1]], x_i,
name1 = paste0("t_i[[", i-1, "]]"), name2 = "x_i")
}
# Let x be the sum of the scores of all 100 independent dice rolls
x <- t_i[[100]]
is.pmf(x, message_intro = "Checking x is valid PMF: ")
# Let y be the number of heads in 600 coin flips, so has Binomial(600, 0.5) distribution:
y <- data.frame(value = 0:600)
y$prob <- dbinom(y$value, size = 600, prob = 0.5)
is.pmf(y, message_intro = "Checking y is valid PMF: ")
# Let z be the negative of y (note we reverse the order to keep the values ascending)
z <- data.frame(value = -rev(y$value), prob = rev(y$prob))
is.pmf(z, message_intro = "Checking z is valid PMF: ")
# Let d be the difference, d = x - y = x + z
d <- convolve.pmf(x, z, name1 = "x", name2 = "z")
is.pmf(d, message_intro = "Checking d is valid PMF: ")
# Prob(X > Y) = Prob(D > 0)
sum(d[d$value > 0, "prob"])
# [1] 0.9907902
Try it online!
Not that it matters practically if you're just after reasonable accuracy, since the above code runs in a fraction of a second anyway, but there is a shortcut to do the convolutions for the sum of 100 independent identically distributed variables: since 100 = 64 + 32 + 4 when expressed as the sum of powers of 2, you can keep convolving your intermediate answers with themselves as much as possible. Writing the subtotals for the first $i$ dice rolls as $T_i = \sum_{k=1}^{k=i}X_k$ we can obtain the PMFs of $T_2 = X_1 + X_2$, $T_4 = T_2 + T_2'$ (where $T_2'$ is independent of $T_2$ but has the same PMF), and similarly $T_8 = T_4 + T_4'$, $T_{16} = T_8 + T_8'$, $T_{32} = T_{16} + T_{16}'$ and $T_{64} = T_{32} + T_{32}'$. We need two more convolutions to find the total score of all 100 dice as the sum of three independent variables, $X = T_{100} = ( T_{64} + T_{32}'' ) + T_4''$, and a final convolution for $D = X + Z$. So I think you only need nine convolutions in all - and for the final one, you can just restrict yourself to the parts of the convolution giving a positive value for $D$. Or if it's less hassle, the parts that give the non-positive values for $D$ and then take the complement. Provided you pick the most efficient way, I reckon that means your worst case is effectively eight-and-a-half convolutions. EDIT: and as @whuber suggests, this isn't necessarily optimal either!
Using the nine-convolution method I identified, with the gmp package so I could work with bigq objects and writing a not-at-all-optimised loop to do the convolutions (since R's built-in method doesn't deal with bigq inputs), it took just a couple of seconds to work out the exact simplified fraction:
1342994286789364913259466589226414913145071640552263974478047652925028002001448330257335942966819418087658458889485712017471984746983053946540181650207455490497876104509955761041797420425037042000821811370562452822223052224332163891926447848261758144860052289/1355477899826721990460331878897812400287035152117007099242967137806414779868504848322476153909567683818236244909105993544861767898849017476783551366983047536680132501682168520276732248143444078295080865383592365060506205489222306287318639217916612944423026688
which does indeed round to 0.9907902. Now for the exact answer, I wouldn't have wanted to do that with too many more convolutions, I could feel the gears of my laptop starting to creak!
"Everyone knows" the sum of dots we should see in 100 dice rolls and that isn't going to happen; half the reason dice games exist.
When I was about 12, a teacher got the class to throw hundreds of dice and the result was very clear.
Numbers two and five were twice as likely as statistics said they should be. Before you deny that, try it!
Wait, though… Nos two and five? Don't you know several dice games that depend on sevens? Isn't that to say, on twos and fives?
– Robbie Goodwin Aug 27 '20 at 21:05