The authors give the Monte Carlo standard error (MCSE) of the empirical standard error (EmpSE) in a formula that was posted here as an image (not reproduced).

My question is: how do I derive the MCSE of the EmpSE?


For bias, the MCSE of $\frac{1}{n}\sum \hat{\theta}_i-\theta$ is just $$ \sqrt{Var[\frac{1}{n}\sum \hat{\theta}_i-\theta]}=\sqrt{\frac{1}{n}Var[\hat{\theta}_1]} $$ The estimate of the variance is just $\frac{1}{n-1}\sum(\hat{\theta}_i-\bar{\theta})^2$ where $\bar{\theta}=\frac{1}{n}\sum \hat{\theta}_i$.
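For concreteness, the bias calculation above can be sketched in Python using only the standard library. The true value `theta = 2.0`, the spread `0.5`, the number of replicates, and the function name `bias_with_mcse` are illustrative assumptions, not taken from the paper.

```python
import math
import random
import statistics


def bias_with_mcse(estimates, theta):
    """Return (estimated bias, Monte Carlo SE of that bias estimate)."""
    n = len(estimates)
    bias = statistics.mean(estimates) - theta
    # statistics.variance uses the n - 1 denominator, as in the formula above.
    mcse = math.sqrt(statistics.variance(estimates) / n)
    return bias, mcse


rng = random.Random(1)
theta = 2.0  # hypothetical true parameter value
# Stand-in for the estimates produced by n = 1000 simulated datasets.
estimates = [rng.gauss(theta, 0.5) for _ in range(1000)]

bias, mcse = bias_with_mcse(estimates, theta)
print(f"bias = {bias:.4f}, MCSE = {mcse:.4f}")
```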

But how can I use this idea to get the MCSE of the EmpSE?

dipetkov
Hermi

1 Answer

Please include proper citations. Online libraries make this very easy; for example Wiley has a citation tool (next to the wrench icon) which generates the following:

Morris, TP, White, IR, Crowther, MJ. Using simulation studies to evaluate statistical methods. Statistics in Medicine. 2019; 38: 2074–2102. https://doi.org/10.1002/sim.8086

The Monte Carlo simulations described in the paper are implemented in the simsum user-written command in Stata and the rsimsum package in R.


Let's start with notation: We want to analyze the statistical properties of a proposed study (bias, power, etc.) so we simulate $n_{\text{sim}}$ replicates of a dataset and its analysis. The $i$th simulation produces an estimate $\hat{\theta}_i$ of a parameter $\theta$.

The Monte Carlo (MC) error formulas assume that the $\hat{\theta}_i$s are normally distributed [1], [2].

$$ \begin{aligned} \hat{\theta}_i \sim \operatorname{Normal}\left(\theta+\operatorname{bias}, \operatorname{Var}(\hat{\theta})\right) \end{aligned} $$

We'll skip estimating the bias because your question is about the precision of the estimator $\hat{\theta}$, i.e. the empirical standard error:

$$ \operatorname{EmpSE} = \sqrt{\operatorname{Var}(\hat{\theta})} $$

The estimate of the empirical SE is:

$$ \widehat{\operatorname{EmpSE}} = \sqrt{\widehat{\operatorname{Var}}(\hat{\theta})} $$

This looks like a tautology; however, we don't know $\operatorname{Var}(\hat{\theta})$ but we can calculate $\widehat{\operatorname{Var}}(\hat{\theta})$ from the simulations. It's the Estimate in the second row of Table 6.

The Monte Carlo error (squared) of the estimate of the empirical SE is:

$$ \operatorname{Var}\left(\widehat{\operatorname{EmpSE}}\right) = \operatorname{Var}\left(\sqrt{\widehat{\operatorname{Var}}(\hat{\theta})}\right) $$

That's right: We want to estimate the variance of a standard error estimator. The formula simplifies considerably because of the normality assumption.

Since the $\hat{\theta}_i$s are normal:

$$ \begin{aligned} \frac{(n_{\text{sim}}-1)\widehat{\operatorname{Var}}(\hat{\theta})}{\operatorname{Var}(\hat{\theta})} \sim \chi^2_{n_{\text{sim}}-1} \end{aligned} $$

This should look familiar (see the related question "Why is the sampling distribution of variance a chi-squared distribution?"): in a random sample $x_1,\ldots,x_n$ from a $\operatorname{Normal}(\mu,\sigma^2)$ distribution, $(n-1)s^2/\sigma^2 \sim \chi^2_{n-1}$ where $s^2$ is the sample variance.
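A quick simulation can make the scaled-variance result concrete. The sketch below (standard library only; the sample size, mean, standard deviation and number of repetitions are arbitrary illustrative choices) draws many normal samples and checks that $(n-1)s^2/\sigma^2$ has mean close to $n-1$ and variance close to $2(n-1)$, the first two moments of a $\chi^2_{n-1}$ distribution.

```python
import random
import statistics


def scaled_sample_variance(rng, n, mu, sigma):
    """Draw n Normal(mu, sigma^2) values and return (n - 1) * s^2 / sigma^2."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    return (n - 1) * statistics.variance(sample) / sigma**2


rng = random.Random(2022)
n = 10       # illustrative sample size
reps = 5000  # number of repeated samples

draws = [scaled_sample_variance(rng, n, mu=1.0, sigma=3.0) for _ in range(reps)]
mean_draws = statistics.mean(draws)
var_draws = statistics.variance(draws)

print(f"mean     = {mean_draws:.2f} (theory: {n - 1})")
print(f"variance = {var_draws:.2f} (theory: {2 * (n - 1)})")
```

Matching two moments does not prove the distribution is chi-squared, of course; it is only a sanity check on the properties used in the next step.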

We know that the $\chi^2_{n_{\text{sim}}-1}$ distribution has mean $(n_{\text{sim}}-1)$ and variance $2(n_{\text{sim}}-1)$. We can use these properties to show that:

$$ \operatorname{Var}\left(\widehat{\operatorname{EmpSE}}\right) \approx \frac{\operatorname{Var}(\hat{\theta})}{2(n_{\text{sim}}-1)} \approx \frac{\widehat{\operatorname{Var}}(\hat{\theta})}{2(n_{\text{sim}}-1)} $$

Take the square root to get the Monte Carlo error of $\widehat{\operatorname{EmpSE}}$, which is approximately $\widehat{\operatorname{EmpSE}} \left/ \sqrt{2(n_{\text{sim}}-1)} \right.$. It contains $\widehat{\operatorname{EmpSE}}$ itself!
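As a sanity check, the resulting formula, $\widehat{\operatorname{EmpSE}} \left/ \sqrt{2(n_{\text{sim}}-1)} \right.$, can be compared with a brute-force estimate: repeat the whole simulation study many times and look at how much $\widehat{\operatorname{EmpSE}}$ varies between repetitions. This is a minimal sketch, not the `simsum` or `rsimsum` implementation; the function name `empse_with_mcse`, the standard normal estimates, and the values of `n_sim` and `reps` are assumptions made for illustration.

```python
import math
import random
import statistics


def empse_with_mcse(estimates):
    """Return (EmpSE estimate, normal-theory Monte Carlo SE of that estimate)."""
    n_sim = len(estimates)
    emp_se = statistics.stdev(estimates)         # n_sim - 1 denominator
    mcse = emp_se / math.sqrt(2 * (n_sim - 1))   # formula derived above
    return emp_se, mcse


rng = random.Random(42)
n_sim = 50   # replicates per simulation study (illustrative)
reps = 2000  # number of repeated simulation studies (illustrative)

emp_ses = []
for _ in range(reps):
    # Stand-in for one study: n_sim normally distributed estimates.
    estimates = [rng.gauss(0.0, 1.0) for _ in range(n_sim)]
    emp_se, mcse = empse_with_mcse(estimates)
    emp_ses.append(emp_se)

brute_force = statistics.stdev(emp_ses)  # SD of EmpSE across repeated studies
print(f"formula (last study)  = {mcse:.4f}")
print(f"brute force ({reps} studies) = {brute_force:.4f}")
```

In practice you only run one study, so the formula is what you report; the repeated studies here exist only to check it.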

[1] Morris, TP, White, IR, Crowther, MJ. Using simulation studies to evaluate statistical methods. Statistics in Medicine. 2019; 38: 2074–2102. https://doi.org/10.1002/sim.8086
[2] White IR. Simsum: Analyses of Simulation Studies Including Monte Carlo Error. The Stata Journal. 2010;10(3):369-385. https://doi.org/10.1177/1536867X1001000305


Here is my attempt to show that $\operatorname{Var}\left(\widehat{\operatorname{EmpSE}}\right) \approx \operatorname{Var}(\hat{\theta}) \left/ [2(n_{\text{sim}}-1)] \right. $.

I start by simplifying the notation: Let's show that $\operatorname{Var}\{S\} \approx \sigma^2 \left/ [2(n-1)] \right. $ where $S$ is the sample standard deviation of $n$ iid $\operatorname{Normal}(\mu,\sigma^2)$ random variables.

  1. Multiply and divide by $\sigma^2/(n-1)$. $$ \begin{aligned} \operatorname{Var}\left\{S\right\} = \operatorname{Var}\left\{\left(\frac{(n-1)S^2}{\sigma^2}\right)^{1/2}\right\} \frac{\sigma^2}{n-1} \end{aligned} $$ We know that $(n-1)S^2/\sigma^2$ is $\chi^2_{n-1}$ with variance $2(n-1)$. What can we say about the variance of $\left((n-1)S^2/\sigma^2\right)^{1/2}$?

  2. Use the delta method to approximate the variance of $f(X) = X^{1/2}$, that is $\operatorname{Var}\left\{f(X)\right\} \approx \operatorname{Var}\left\{X\right\}\left(f'(\operatorname{E}X)\right)^2$ with $f'(x) = 1/(2x^{1/2})$. Not being strong at theory, I looked this up on the Wikipedia page about variance. $$ \begin{aligned} \operatorname{Var}\left\{X^{1/2}\right\} \approx \operatorname{Var}\left\{X\right\}\left(\frac{1}{2\left(\operatorname{E}X\right)^{1/2}}\right)^2 \end{aligned} $$

  3. Plug in and simplify. In our case $X = (n-1)S^2/\sigma^2$ with mean $(n-1)$ and variance $2(n-1)$. $$ \begin{aligned} \operatorname{Var}\left\{S\right\} \approx 2(n-1)\left(\frac{1}{2(n-1)^{1/2}}\right)^2\frac{\sigma^2}{n-1} = \frac{\sigma^2}{2(n-1)} \end{aligned} $$
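To see how good the approximation is, it can be compared with the exact variance of $S$ under normality. This uses a standard result that is not part of the derivation above: $\operatorname{E}S = c_4(n)\,\sigma$ with $c_4(n) = \sqrt{2/(n-1)}\;\Gamma(n/2)/\Gamma((n-1)/2)$, so that $\operatorname{Var}\{S\} = \sigma^2\left(1 - c_4(n)^2\right)$. The function names below are my own.

```python
import math


def c4(n):
    """E[S] / sigma for the sample SD of n iid normal observations."""
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def exact_var_s(n, sigma=1.0):
    """Exact Var{S} under normality: sigma^2 * (1 - c4(n)^2)."""
    return sigma**2 * (1.0 - c4(n) ** 2)


def delta_var_s(n, sigma=1.0):
    """Delta-method approximation: sigma^2 / (2 * (n - 1))."""
    return sigma**2 / (2 * (n - 1))


for n in (5, 10, 50, 1000):
    ratio = delta_var_s(n) / exact_var_s(n)
    print(f"n = {n:5d}  exact = {exact_var_s(n):.6f}  "
          f"delta = {delta_var_s(n):.6f}  ratio = {ratio:.4f}")
```

The ratio approaches 1 as $n$ grows, so for the usual $n_{\text{sim}}$ in the hundreds or thousands the approximation error is negligible.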

dipetkov
  • Thanks! But why is normality of the $\hat{\theta}_i$ an assumption? Can't we get asymptotic normality from the central limit theorem? – Hermi Aug 02 '22 at 01:37
  • It's a simplifying assumption. Reference [2] mentions another approach (which doesn't rely on the CLT). The CLT is not particularly relevant. It's easy to make $n_{\text{sim}}$ large but it's the sample size $n_{\text{obs}}$ which matters for the $\hat{\theta}_i$s. Asymptotics won't make the $\hat{\theta}_i$s Normal unless $n_{\text{obs}}$ is large. – dipetkov Aug 02 '22 at 06:13
  • Of course, this doesn't mean that $n_{\text{obs}}$ has to be large for the $\hat{\theta}_i$s to be reasonably Normal. It means that you can't compensate for a small planned $n_{\text{obs}}$ in the experiment with a large $n_{\text{sim}}$ in the MC simulation. – dipetkov Aug 02 '22 at 06:13
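The point about $n_{\text{obs}}$ versus $n_{\text{sim}}$ can be illustrated with a small sketch. The setup is entirely hypothetical: each $\hat{\theta}_i$ is the mean of $n_{\text{obs}}$ exponential observations, and the sample skewness of the $\hat{\theta}_i$s serves as a rough measure of non-normality. Increasing $n_{\text{sim}}$ only estimates the skewness more precisely; only increasing $n_{\text{obs}}$ reduces it.

```python
import random
import statistics


def sample_skewness(xs):
    """Moment-based sample skewness m3 / m2^(3/2)."""
    n = len(xs)
    m = statistics.fmean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2**1.5


def simulate_estimates(rng, n_sim, n_obs):
    """n_sim estimates, each the mean of n_obs Exponential(1) observations."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n_obs)])
        for _ in range(n_sim)
    ]


rng = random.Random(7)

# Small n_obs: the estimates stay skewed however large n_sim is.
for n_sim in (500, 5000):
    skew = sample_skewness(simulate_estimates(rng, n_sim, n_obs=5))
    print(f"n_obs =   5, n_sim = {n_sim:5d}: skewness = {skew:.2f}")

skew_small = sample_skewness(simulate_estimates(rng, 5000, n_obs=5))
skew_large = sample_skewness(simulate_estimates(rng, 5000, n_obs=100))
print(f"n_obs =   5, n_sim =  5000: skewness = {skew_small:.2f}")
print(f"n_obs = 100, n_sim =  5000: skewness = {skew_large:.2f}")
```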