2

Suppose an AI player in a game can either win or lose. I wish to estimate the win ratio of this player. My question is, how many samples (games) needed in order to get an error smaller than 1%?

A friend explained me that Hoeffding inequality is the right approach, however in a similar question Sample size needed to estimate probability of “success” in Bernoulli trial the answer didn't mentioned Hoeffding inequality. I also found this Sample Size Calculator which might be the right tool for this problem however I could not understand how to use it.

Hoeffding inequality: Assume I want to know in 95% certainty that the win proportion is X ±1% than,

$$ P(H(n) \leq k) = \sum_{i=0}^k {n\choose i} p^iq^{n-i} $$ For $ k=(p-\epsilon)n$, $$ P(H(n) \leq (p-\epsilon)n) \leq e^{-2\epsilon^2n}$$ $$ P(H(n) \geq (p+\epsilon)n) \leq e^{-2\epsilon^2n}$$ Thus, $$ (p-\epsilon)n \leq P(H(n) \leq (p+\epsilon)n) \geq 1-2e^{-2\epsilon^2n}$$ For $ \epsilon = 0.01$ and $ 95\% $ certainty $$ 95\% \geq 1-2e^{-0.0002n}$$ which gives $ n\geq 18,444 $

This means that in order to estimate the win proportion with error smaller than 1%, in 95% of the times, 18444 samples are needed.

Is that right? Is Hoeffding inequality the best approach here? is it tight? Can some other bound / inequality give this certainty with less samples? If I know that the win rate is not close to 50% but 60±5%, would that change the answer?

Cohensius
  • 469
  • 4
  • 17
  • 1
    One approach is to note that a 95% CI for $p$ is of the form $\hat p \pm 1.96\sqrt{\frac{\hat p(1 - \hat p)}{n}},$ where $\hat p = X/n$ and $X$ is the number of wins in $n$ games. You want the margin of error $M = 1.96\sqrt{\frac{\hat p(1 - \hat p)}{n}} = 0.01.$ If you have a rough idea of the value of $p,$ then plug that value as $\hat p$ into the formula for $M$ and solve for $n.$ If you have no idea, then use $\hat p = 1/2,$ because that gives the largest size $n$, This formula is based on a normal approx, but if you want $M = .01,$ then your $n$ will be large enough to use norm aprx. – BruceET Nov 10 '18 at 23:01
  • Let me see if I get it right, solving $ 1.96\sqrt{ \hat{p} (1-\hat{p}) /n } = 0.01$ gives $ n = 9,604$

    It means that in order to get 95% certainty that the win ratio is $\hat{p} \pm 1%$ its enough to sample ~10000 games, not ~18500, as suggested by Hoeffding inequality?

    – Cohensius Nov 10 '18 at 23:20
  • 1
    That sounds right. – BruceET Nov 10 '18 at 23:30

1 Answers1

2

Let's simulate the experiment a million times in R. Suppose $p = .4.$ If we look at $n = 10,000$ games at each iteration, then what results do we get for $\hat p?$

n = 10^4; p = .4
p.hat = rbinom(10^6, 10^4, .4)/n
quantile(p.hat, c(.025, .975))
  2.5%  97.5% 
0.3904 0.4096 
summary(p.hat)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.3742  0.3967  0.4000  0.4000  0.4033  0.4245 

In 95% of cases, $\hat p$ was between 0.39 and 0.41, as required. The worst results in a million iterations were a minimum of 0.374 and a maximum of 0.425. Looking at the quartiles, we see that half of the $\hat p$'s were considerably closer than required --- between 0.397 and 0.403.

BruceET
  • 56,185