
I have this bootstrap:

library(ggplot2)

n <- 30
set.seed(1)
orig_mean <- 1
orig_sd <- 2
X <- rnorm(n, mean = orig_mean, sd = orig_sd)
set.seed(NULL)

# Bootstrapping
m_reps <- 20000
boot_means <- c()
boot_vars <- c()
for (i in 1:m_reps) {
  current_sample <- sample(X, n, replace = TRUE)
  boot_means <- append(boot_means, mean(current_sample))
  boot_vars <- append(boot_vars, var(current_sample))
}
boot_means <- as.data.frame(boot_means) # for plotting
boot_vars <- as.data.frame(boot_vars)

The central limit theorem tells us that those bootstrapped means $\bar{x}$ should be approximately normally distributed, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n} \right)$. We can see this by plotting the density of $N\left(1, \frac{2^2}{30} \right)$ on top of the bootstrapped means' histogram:

ggplot(boot_means, aes(x = boot_means)) +
  geom_histogram(
    aes(y = after_stat(density)), bins = 20, color = "black", fill = "white") +
  stat_function(fun = dnorm, args = list(
    mean = orig_mean, sd = orig_sd/sqrt(n)))

The bell curve does not precisely line up with the histogram, but it is close.
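One possible reason for the mismatch, offered as a sketch rather than a definitive diagnosis: the bootstrap resamples from the observed X, so the bootstrapped means center on mean(X) rather than on orig_mean, and their spread is roughly sqrt((n - 1) / n) * sd(X) / sqrt(n) rather than orig_sd / sqrt(n). The snippet below recreates the setup from the question and compares these quantities; it uses fewer replicates than the original (an assumption made only for speed), and the labels in the printed vectors are illustrative.

```r
# Recreate the sample from the question
n <- 30
set.seed(1)
X <- rnorm(n, mean = 1, sd = 2)

# Bootstrapped means (5000 replicates here instead of 20000, for speed)
m_reps <- 5000
boot_means <- replicate(m_reps, mean(sample(X, n, replace = TRUE)))

# Center: the bootstrap distribution tracks the sample mean, not orig_mean
print(c(boot_center = mean(boot_means), sample_mean = mean(X)))

# Spread: compare with the plug-in standard error from the observed sample
plug_in_se <- sqrt((n - 1) / n) * sd(X) / sqrt(n)
print(c(boot_sd = sd(boot_means), plug_in_se = plug_in_se))
```

If this is the cause, passing mean = mean(X) and sd = sd(X) / sqrt(n) to stat_function should give a curve that sits closer to the histogram.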

I would like to do something like this with the variance. The variance histogram is:

ggplot(boot_vars, aes(x = boot_vars)) +
  geom_histogram(
    bins = 20, color = "black", fill = "white")

This is where I'm stuck. Is that a chi-squared distribution? And how do I set up a "target" curve like I did with the bootstrapped means?

What I'm really looking for is to figure out what the distribution of the bootstrapped variances is. From there the coding should be fairly straightforward.
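For reference, a minimal sketch of a "target" curve under one assumption: if the data come from a normal population, then $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$, so the density of $S^2$ at $x$ is $\frac{n-1}{\sigma^2} f_{\chi^2_{n-1}}\left(\frac{(n-1)x}{\sigma^2}\right)$. The helper name dvar_normal is hypothetical, the replicate count is reduced for speed, and the plotting part only runs if ggplot2 is installed.

```r
# Recreate the sample from the question
n <- 30
orig_sd <- 2
set.seed(1)
X <- rnorm(n, mean = 1, sd = orig_sd)

# Bootstrapped variances (5000 replicates here instead of 20000, for speed)
m_reps <- 5000
boot_vars <- replicate(m_reps, var(sample(X, n, replace = TRUE)))

# Density of the sample variance S^2 under normal sampling (hypothetical helper).
# Since (n - 1) * S^2 / sigma^2 is chi-squared with n - 1 degrees of freedom,
# the change of variables contributes the factor k = (n - 1) / sigma^2.
dvar_normal <- function(x, n, sigma) {
  k <- (n - 1) / sigma^2
  k * dchisq(k * x, df = n - 1)
}

# Overlay the scaled chi-squared density on the histogram, on the density scale
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(boot_vars), aes(x = boot_vars)) +
    geom_histogram(
      aes(y = after_stat(density)), bins = 20, color = "black", fill = "white") +
    stat_function(fun = dvar_normal, args = list(n = n, sigma = orig_sd))
  print(p)
}
```

The curve describes variances of fresh samples from the normal population, not of resamples from X, so an exact match should not be expected: the bootstrapped variances average about (n - 1) / n * var(X) rather than orig_sd^2.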

  • See the last sentence of the answer to the linked question for the answer for the type of distribution. Non-statistical coding-type questions are off-topic here, but you should be able to adapt the code you wrote for the distribution of sample mean values to the chi-square distribution associated with sample variances. – EdM Nov 11 '22 at 18:55
  • This answer might also be helpful. – EdM Nov 11 '22 at 19:01
