Suppose I want to use Monte Carlo to estimate some probability $p$. A single MC simulation will run for $R$ iterations and estimate $p$ as the fraction of 'successes'.

Say I want to estimate $p$ to within an error of $E$ at 95% confidence. That is, I want to find $R_0$ such that if I run the MC simulation for $R_0$ iterations and obtain $p_0$, then I am $95\%$ confident that the true $p$ lies in $[p_0 - E, p_0 + E]$.

I found two possible formulas for this, one and two, but they are different (albeit similar). They also don't seem to take $R$ into account, which doesn't make intuitive sense to me.

For instance, the second link has the formula:

$$\bigg(\frac{z_{\alpha/2} \cdot \text{std}(p)}{E}\bigg)^2$$

I assume $\text{std}(p)$ would be computed by Monte Carlo sampling $p_1, \dots, p_n$ (with some fixed $R$ iterations for each $p_i$) and then finding the standard deviation of the $p_i$. But naturally this standard deviation would decrease as $R$ increases. So it seems to me that the formula should factor in $R$ somehow, which it doesn't.

Is my interpretation incorrect?

Is there a simple formula to determine the number of simulations required?

Hullo
  • Welcome to Cross Validated! Is this anything more than finding the sample size required to detect an effect of $E?$ // If you use R software, the pwr package has some nice functions for doing work like this. // Are you sure you want to discuss the margin of error in additive terms? While people do this all the time, if you observe a probability of $0.1$, a margin of error of $0.05$ is a lot bigger deal than if you observe a probability of $0.5$. – Dave Nov 12 '22 at 20:01
  • I'm using Python. Is there anything in scipy, numpy, or statsmodels? I'm not sure what "effect of $E$" means, though. And yes, the margin is additive, but $E = 0.001$ and $p \approx 0.55$, so it should be fine. It's a hypothetical/conceptual question; I don't intend to actually run the simulation for that long (I assume it would take quite a while for such small errors). – Hullo Nov 12 '22 at 20:13
  • $R$ is the number of iterations, which is the quantity you aim to calculate. – Dave Nov 12 '22 at 23:30
  • How would I calculate $\text{std}(p)$ then? I have no knowledge of the underlying distribution. – Hullo Nov 12 '22 at 23:32
  • You know it’s a Bernoulli distribution. The variance of a Bernoulli distribution is $p(1-p)$. – Dave Nov 12 '22 at 23:41
  • I know that if $X$ is Bernoulli with probability $q$, then $\text{Var}[X] = q(1-q)$. But I don't know the distribution that governs the value of $p$ itself. I'm not estimating $X$, I'm estimating $p$. Perhaps I'm misunderstanding something. – Hullo Nov 12 '22 at 23:43
  • $p$ is a fixed value, not a distribution, in the frequentist statistics that you are using when you calculate a confidence interval. – Dave Nov 12 '22 at 23:49
  • Maybe I'm mixing up terminology. I estimate a probability $p$ by making $n$ draws and calculating the fraction of successes, call it $p_n$. (I called this process of making $n$ draws an MC simulation; maybe that's wrong.) I need to know how large $n$ must be so that $|p - p_n| \leq E$ with $95\%$ confidence. I don't know what $p$ is. – Hullo Nov 12 '22 at 23:52
  • If you’re concerned about mixing up terminology, perhaps focus on the task you’re trying to accomplish. Once you do that, we can develop the correct statistical terminology and solution. // That said, I think you have the answer, so you can go calculate the way your reference says to and get your confidence interval. If you want to check how well it works, repeat the process $100$ times and see if about $95$ confidence intervals contain the true value. – Dave Nov 13 '22 at 00:01
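
Putting the comments above together, here is a minimal Python sketch of one way to read the formula. It assumes each iteration is an independent Bernoulli draw, so the standard deviation of a single draw is $\sqrt{p(1-p)}$ and the formula returns the number of iterations directly. The function names `required_draws` and `coverage` and the default guess `p_guess=0.5` (the worst case for the variance) are illustrative choices, not something taken from the linked references. The second function follows the suggestion to repeat the process many times and count how often the estimate lands within $E$ of the true value.

```python
import math

import numpy as np
from scipy.stats import norm


def required_draws(E, p_guess=0.5, confidence=0.95):
    """Normal-approximation number of draws so that |p_hat - p| <= E.

    Assumption: each draw is an independent Bernoulli(p) trial, so the
    per-draw standard deviation is sqrt(p * (1 - p)). p is unknown, so a
    guess is plugged in; p_guess=0.5 gives the largest (safest) answer.
    """
    z = norm.ppf(1 - (1 - confidence) / 2)      # z_{alpha/2}, about 1.96 at 95%
    sigma = math.sqrt(p_guess * (1 - p_guess))  # std of a single draw
    return math.ceil((z * sigma / E) ** 2)


def coverage(p_true, n, E, repeats=1000, seed=0):
    """Fraction of repeated n-draw simulations whose estimate is within E of p_true."""
    rng = np.random.default_rng(seed)
    p_hat = rng.binomial(n, p_true, size=repeats) / n  # one estimate per repeat
    return float(np.mean(np.abs(p_hat - p_true) <= E))


# Values mentioned in the comments: E = 0.001 and p roughly 0.55.
n_worst_case = required_draws(E=0.001)             # no knowledge of p
n_informed = required_draws(E=0.001, p_guess=0.55)  # using the rough guess
print(n_worst_case, n_informed)

# Empirical check: the printed fraction should be close to 0.95.
print(coverage(p_true=0.55, n=n_informed, E=0.001))
```

In this reading there is no separate outer loop over simulations: $n$ plays the role of $R$, and the standard deviation of the estimate $p_n$ is $\sqrt{p(1-p)/n}$, which is where the dependence on the number of iterations enters.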

0 Answers