
A six-sided die is rolled 100 times. Using the normal approximation, find the probability that the face showing six turns up between 15 and 20 times. Find the probability that the sum of the face values of the 100 trials is less than 300.

For the first part of the question, I did the following:

$P(15 \le X \le 20) = \sum_{15 \le i \le 20} C(100,i)(\frac{1}{6})^i(\frac{5}{6})^{100-i}$

Where X is the number of sixes rolled. My answer was about 0.56.

I have no idea how to do the second part. I know I have to do something like

$P(Y<300|N=100)$

Where Y is the sum and N is the number of times rolled. But I don't know the probability of the sum so I'm stuck.

Jeromy Anglim
  • 44,984
styfle
  • 917
  • Your method for the first part didn't use the normal approximation. If you re-do it using the normal approximation, it shouldn't be difficult to apply the same method to the second part. – mark999 May 28 '11 at 23:57
  • @mark999 Yes I recognize this. I don't know how to use normal approximation. – styfle May 29 '11 at 00:01
  • For the first part, if $X$ is binomial with parameters $n=100$ and $p = 1/6$, then $X$ is approximately normally distributed with mean $np$ and variance $np(1-p)$. You may want to use a "continuity correction": find the probability that the normal random variable is between 14.5 and 20.5. I think I was incorrect when I said that the second part is a simple application of the same method. – mark999 May 29 '11 at 00:11
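
A minimal sketch of the normal approximation described in the comment above, next to the exact binomial sum from the question. The variable names (`mu`, `sigma`, `p_normal`, `p_exact`) are only illustrative and do not come from the thread.

```r
# Number of sixes in 100 rolls: X ~ Binomial(n = 100, p = 1/6)
n <- 100
p <- 1 / 6

# Normal approximation: mean np, variance np(1 - p)
mu <- n * p
sigma <- sqrt(n * p * (1 - p))

# Continuity correction: P(15 <= X <= 20) is approximated by P(14.5 < Z < 20.5)
p_normal <- pnorm(20.5, mean = mu, sd = sigma) - pnorm(14.5, mean = mu, sd = sigma)

# Exact binomial probability, for comparison
p_exact <- sum(dbinom(15:20, size = n, prob = p))

print(c(normal = p_normal, exact = p_exact))
```

Both values should come out close to the 0.56 reported in the question.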

2 Answers


By the CLT, the sum of $n$ i.i.d. random variables with finite variance is, for large $n$, approximately distributed as:

$$ \sum_{i=1}^nX_i \sim N\left(\mu =n\cdot\mu_{X_i},\sigma^2 = n\cdot\sigma^2_{X_i}\right) $$

The mean of a single die roll ($X_i$) is 3.5 and the variance is 35/12.

That should help you find the answer.
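
For reference, the two numbers above can be worked out directly from the PMF of a fair die, using $\operatorname{Var}(X) = E(X^2) - E(X)^2$:

$$ E(X_i) = \frac{1+2+3+4+5+6}{6} = \frac{21}{6} = 3.5, \qquad E(X_i^2) = \frac{1+4+9+16+25+36}{6} = \frac{91}{6} $$

$$ \sigma^2_{X_i} = \frac{91}{6} - \left(\frac{7}{2}\right)^2 = \frac{182}{12} - \frac{147}{12} = \frac{35}{12} $$

With $n = 100$ the sum then has mean $100 \cdot 3.5 = 350$ and variance $100 \cdot \frac{35}{12} = \frac{3500}{12} \approx 291.67$, so its standard deviation is about $17.08$.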

mpiktas
  • 35,099
Glen
  • 7,250
  • Where did you get Var(X_i) = 35/12? – styfle May 29 '11 at 00:18
  • You may also want to use the continuity correction here, which corresponds to finding the probability that the sum is less than 299.5. – Glen May 29 '11 at 00:21
  • For the variance calculation I would suggest writing out the PMF and using the formula Var(X)=E(X^2)-E(X)^2. – Glen May 29 '11 at 00:23
  • When using pnorm in R, I tried pnorm(300, 350, sqrt(35/12)) and got 1.017822e-188 as my answer. That doesn't seem right. – styfle May 29 '11 at 00:28
  • Your SD is not correct; you forgot the n. It should be sqrt(100*(35/12)). – Glen May 29 '11 at 01:46
  • @Glen I got 0.384849 which seems much more reasonable. Thanks. – styfle May 29 '11 at 02:47
  • With a mean of 350 and a SD of $\approx 17$, at 300 you're about three SDs away from the mean, so the answer should be in the ‰ range. – Thies Heidecke May 29 '11 at 03:21
  • D'oh, I multiplied the SD by 100, not the variance! Ok let's try that again. I got 0.001707396. That seems really small. – styfle May 29 '11 at 07:32
  • I don't know where you got $0.384849$ from, as the "z-statistic" is about $-2.9$, which is about a $0.1\%$ chance of being less than this. – probabilityislogic May 29 '11 at 07:35
  • @styfle - I was writing my comment as you wrote yours. What you may not be aware of is the factorials tend to "pile up" very quickly. In fact they often grow so fast $n!\approx O(n^{n})$, that you can often approximate a sum of "choose" functions by the largest one. You could also convince yourself of this by running some simulations. – probabilityislogic May 29 '11 at 07:38
  • I got 125/9 as variance from binomial distribution. – ralu May 31 '11 at 05:15
  • @ralu, the number that comes up on the throw of one die is not a binomial distribution – Glen May 31 '11 at 19:12
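
The steps discussed in these comments can be put together in a short sketch: the variance of one die from its PMF, the standard deviation of the sum (multiplying the variance, not the SD, by $n$), and the resulting `pnorm` calls. Variable names such as `var_one` and `sd_sum` are made up for illustration.

```r
# PMF of one fair six-sided die
faces <- 1:6
pmf <- rep(1 / 6, 6)

# Var(X) = E(X^2) - E(X)^2
mu_one <- sum(faces * pmf)                 # 3.5
var_one <- sum(faces^2 * pmf) - mu_one^2   # 35/12

# Sum of n independent rolls: means and variances add
n <- 100
mu_sum <- n * mu_one          # 350
sd_sum <- sqrt(n * var_one)   # sqrt(100 * 35/12), not 100 * sqrt(35/12)

# z-statistic and normal tail probabilities, without and with continuity correction
z <- (300 - mu_sum) / sd_sum
p_plain <- pnorm(300, mean = mu_sum, sd = sd_sum)
p_cc <- pnorm(299.5, mean = mu_sum, sd = sd_sum)

print(c(z = z, p_plain = p_plain, p_cc = p_cc))
```

The value of `p_plain` should agree with the 0.001707396 reported in the comments above.
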

In the comments to Glen's answer you seem to have used a normal approximation pnorm(300, 350, sqrt(3500/12)) to get 0.001707396. This is not a bad answer, though you can do better.

If you used the continuity correction, pnorm(299.5, 350, sqrt(3500/12)), you would get 0.001553355. I suspect this is what was being asked for.

It is in fact possible to calculate this more precisely. The following R code does so (yes, I know it has for loops).

sides <- 6
throws <- 100

## p[j, i] is the probability of a total of exactly (j - sides) after (i - 1) throws
p <- matrix(0, nrow = sides * (throws + 1), ncol = throws + 1)
p[sides, 1] <- 1  # probability 1 of a score of 0 after 0 throws

for (i in 2:(throws + 1)) {
  for (j in (sides + 1):(sides * (throws + 1))) {
    ## average the six totals that could have preceded the latest roll
    p[j, i] <- sum(p[(j - sides):(j - 1), i - 1]) / sides
  }
}
sum(p[1:(299 + sides), throws + 1])  # P(total <= 299) after 100 throws

This gives the result 0.001505810.

The normal approximation with continuity correction is within 0.00005, which looks good, though the relative error is about 3%, which looks slightly less impressive; this often happens using the normal approximation in the tail of the distribution.
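
As a cross-check, the same exact probability can be sketched by convolving the PMF one throw at a time, keeping only a single vector instead of a full matrix. This is an alternative formulation, not the code above; the function `exact_sum_pmf` and its arguments are hypothetical names introduced here.

```r
# Exact PMF of the total of `throws` fair dice, built up one throw at a time.
# (Hypothetical helper, not from the original answer.)
exact_sum_pmf <- function(throws, sides = 6) {
  pmf <- 1  # before any throw the total is 0 with probability 1
  for (k in seq_len(throws)) {
    nxt <- numeric(length(pmf) + sides - 1)
    idx <- seq_along(pmf)
    for (face in seq_len(sides)) {
      shifted <- idx + face - 1
      nxt[shifted] <- nxt[shifted] + pmf / sides  # each face has probability 1/sides
    }
    pmf <- nxt
  }
  pmf  # pmf[m] is P(total = throws + m - 1)
}

pmf100 <- exact_sum_pmf(100)
totals <- 100:600                       # possible totals after 100 throws
p_exact <- sum(pmf100[totals <= 299])   # P(total < 300)
p_cc <- pnorm(299.5, 350, sqrt(3500 / 12))

print(c(exact = p_exact, normal_cc = p_cc, rel_error = (p_cc - p_exact) / p_exact))
```

If the sketch is right, `p_exact` should match the 0.001505810 quoted above and `rel_error` should be roughly 3%.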

Henry
  • 39,459
  • I'm not afraid of for loops, it's everything else that throws me off :). What is this continuity correction everyone is talking about? According to Wikipedia, you add 1/2, not subtract it. I also don't understand the purpose of your code. You made a matrix that is 101 columns wide by 606 rows? – styfle May 29 '11 at 22:41
  • In the normal approximation you are treating the score as a continuous variable. So it might perhaps be $299.890123$ or $299.235711$. In the real problem you are restricted to integers and want less than $300$ or, equivalently, less than or equal to $299$. So it is sensible to cut the normal approximation between $299$ and $300$, say at $299.5$. – Henry May 29 '11 at 23:45
  • The comment in the code explains that the indices of p[j,i] are offset from the score and from the number of throws. This is for two reasons: R starts counting at $1$ rather than $0$; and I wanted to sum the earlier six values before the latest roll without having to fuss about references which were not there. – Henry May 29 '11 at 23:49
  • Oh yeah, R is not 0 indexed. It somehow makes it more complicated starting at 1. I guess I should run this code and try it out. – styfle May 30 '11 at 04:09
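
The simulation suggested in an earlier comment can also serve as a rough sanity check on the continuity correction. The sketch below assumes an arbitrary seed and 100,000 simulated experiments (both choices are mine, not from the thread). With a tail probability this small the simulated estimate is noisy, so it only shows that the answers are of the right order.

```r
set.seed(1)      # arbitrary seed, only for reproducibility
n_sims <- 1e5    # number of simulated experiments (assumed; increase for less noise)

# Each row is one experiment of 100 rolls
rolls <- matrix(sample(1:6, n_sims * 100, replace = TRUE), nrow = n_sims)
totals <- rowSums(rolls)

p_sim <- mean(totals < 300)   # same event as totals <= 299

# Normal approximation, cut at 300 and at 299.5
p_plain <- pnorm(300, 350, sqrt(3500 / 12))
p_cc <- pnorm(299.5, 350, sqrt(3500 / 12))

print(c(simulated = p_sim, normal = p_plain, normal_cc = p_cc))
```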