
I am running an ANOVA and computing bootstrapped CIs for my effect sizes, using the Measures of Effect Size Toolbox for Matlab.

To my surprise, if I re-run the same line of code, i.e. compute the same thing on the same data, the lower and upper bounds of the CI are slightly different each time.

I know that bootstrapped CIs are built from permutation analyses based on random generations of distributions. Still, I do not remember reading anywhere about this odd property of CIs never being the same twice!

z8080
  • You can all but ensure that your bootstrap CIs agree exactly to an arbitrary precision (e.g., one significant figure, two significant figures, etc.) by using a larger number of bootstrap samples for the estimate. – Alexis Nov 17 '19 at 16:28
  • Thanks! I guess the answers that were put in attest to this also. Thanks again. – z8080 Nov 17 '19 at 16:54

2 Answers


A bootstrap resample consists of data points $(x_1^*, x_2^*, \ldots, x_n^*)$ which are sampled with replacement from the original data $(x_1, x_2, \ldots, x_n)$. Technically, a bootstrap procedure should consider all of the possible bootstrap samples. If this can be accomplished, then the bootstrap confidence interval will be the same for every run.
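To make the "all possible bootstrap samples" idea concrete, here is a minimal sketch for a tiny data set. The data values, the choice of the mean as the statistic, the percentile method, and the function name `exhaustive_bootstrap_ci` are all assumptions made for illustration; they are not taken from the toolbox in the question.

```python
import itertools
import statistics

def exhaustive_bootstrap_ci(data, alpha=0.05):
    """Percentile CI for the mean using every one of the n**n ordered resamples."""
    n = len(data)
    # itertools.product enumerates all ordered samples drawn with replacement.
    means = sorted(statistics.fmean(s) for s in itertools.product(data, repeat=n))
    lo_idx = int((alpha / 2) * len(means))
    hi_idx = int((1 - alpha / 2) * len(means)) - 1
    return means[lo_idx], means[hi_idx]

data = [2.1, 3.4, 1.9, 5.0, 4.2]  # made-up values; n = 5 gives 5**5 = 3125 resamples
first = exhaustive_bootstrap_ci(data)
second = exhaustive_bootstrap_ci(data)
print(first, second)  # the two results are identical: nothing here is random
```

Because no random numbers are involved, repeating the call always returns the same interval. This is only feasible for very small $n$.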

Unfortunately, for a data set of size $n$, there are $n^n$ possible bootstrap samples (e.g. 10 billion when $n=10$), which is prohibitively large. To get around this, we usually choose $M$ of the possible $n^n$ bootstrap samples at random. As $M$ gets large, the confidence interval generated this way will converge to the "true" bootstrap CI, as if we had used all $n^n$ possible resamples.
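The random-subset version described above might look roughly like the following. Again, the mean as the statistic, the percentile method, the made-up data, and the name `bootstrap_ci` are illustrative assumptions, not the toolbox's actual implementation.

```python
import random
import statistics

def bootstrap_ci(data, n_resamples, alpha=0.05, rng=None):
    """Percentile CI for the mean from n_resamples randomly drawn resamples."""
    if rng is None:
        rng = random.Random()  # unseeded: a fresh random state on every call
    n = len(data)
    # rng.choices samples with replacement, which is what a bootstrap resample is.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 2.5]  # made-up values
# Two runs on the same data will usually give slightly different bounds.
print(bootstrap_ci(data, n_resamples=1000))
print(bootstrap_ci(data, n_resamples=1000))
```

Each call draws a different random subset of the $n^n$ possible resamples, which is the behaviour described in the question.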

If you want to see more consistent results across different runs, you can set the seed (as @Dave suggests) or try increasing the number of resamples. The latter approach will lead to a more expensive procedure, but will be less sensitive to the random nature of the bootstrap in practice.
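The effect of increasing the number of resamples can be checked empirically. The sketch below repeats a bootstrap several times for a small and a large $M$ and compares how much the lower bound moves from run to run. The data, the helper names, and the specific values of $M$ are assumptions chosen for illustration.

```python
import random
import statistics

def lower_bound(data, n_resamples, rng, alpha=0.05):
    """Lower percentile-CI bound for the mean from n_resamples random resamples."""
    n = len(data)
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    return means[int((alpha / 2) * n_resamples)]

def run_to_run_spread(data, n_resamples, n_runs=20, seed=0):
    """Standard deviation of the lower bound across repeated bootstrap runs."""
    rng = random.Random(seed)  # seeded only so this demonstration is repeatable
    bounds = [lower_bound(data, n_resamples, rng) for _ in range(n_runs)]
    return statistics.stdev(bounds)

data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 2.5]  # made-up values
spread_small = run_to_run_spread(data, n_resamples=100)
spread_large = run_to_run_spread(data, n_resamples=4000)
print(spread_small, spread_large)  # the spread for M = 4000 should be clearly smaller
```

The larger $M$ costs more computation per run, but the bounds wobble less between runs, which is the trade-off mentioned above.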

knrumsey

Each time you repeat the process, you are taking a different set of samples. Those different samples will give slightly different results.

If you set a random seed, you will get the same results every time (which is why random seeds are useful). I don’t know the Matlab command for it, but in R it is set.seed(2019) and in Python it is random.seed(2019) after you import the random library.
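As a small illustration of the seeding idea in Python, the sketch below resets the seed before each of two bootstrap runs. The function `bootstrap_ci`, the data, and the use of the mean are made up for the example. For MATLAB, the built-in `rng` function (e.g. `rng(2019)`) seeds the global generator; whether the Measures of Effect Size Toolbox draws from that global generator is an assumption here and worth checking.

```python
import random
import statistics

def bootstrap_ci(data, n_resamples=1000, alpha=0.05):
    """Percentile CI for the mean, drawing from Python's global random generator."""
    n = len(data)
    means = sorted(
        statistics.fmean(random.choices(data, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 2.5]  # made-up values

random.seed(2019)  # fix the generator state before the first run
first = bootstrap_ci(data)
random.seed(2019)  # reset to the same state before the second run
second = bootstrap_ci(data)
print(first == second)  # True: same seed, same resamples, same interval
```

Note that `random.seed` only affects Python's standard `random` module; code that draws from a different generator needs that generator seeded instead.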

Dave