
For example, can I take samples of size ten with replacement from a sample of size 100? I'm trying to teach my high school statistics students about bootstrapping, and I want to use M&M's candy. Taking multiple resamples of 100 from the original sample would take a very long time, so I'd like to have them draw resamples of size 10, with replacement, from the original sample of 100. Is that still bootstrapping?

1 Answer


Try it and see what happens.

set.seed(2023)
N <- 100   # size of the original sample
B <- 1000  # number of bootstrap resamples
n <- 10    # reduced resample size
x <- rnorm(N, 0, 1)
xbar100 <- xbar10 <- rep(NA, B)
for (i in 1:B){
  xbar100[i] <- mean(sample(x, N, replace = TRUE))
  xbar10[i]  <- mean(sample(x, n, replace = TRUE))
}
sd(xbar100) # 0.0981310715978676
sd(xbar10)  # 0.307451079371199

While this isn’t the best way to calculate bootstrap standard errors, it illustrates the problem: resampling at the original sample size gives a standard error close to the true value of $0.1$, while resampling only ten observations gives a standard error about three times bigger. The smaller resamples describe the variability of a mean of ten observations, not of the mean of the 100 you actually have.
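The factor of three is what the usual formula for the standard error of a mean predicts. As a brief sketch, write $\hat\sigma$ for the standard deviation of the 100 observed values and $m$ for the resample size (both symbols are notation introduced here for illustration; they do not appear in the code above). Resampling with replacement draws independent values from the observed data, so

$$
\operatorname{SE}_{\text{boot}}\left(\bar{x}^{*}_{m}\right) = \frac{\hat\sigma}{\sqrt{m}},
\qquad
\frac{\operatorname{SE}_{\text{boot}}\left(\bar{x}^{*}_{10}\right)}{\operatorname{SE}_{\text{boot}}\left(\bar{x}^{*}_{100}\right)} = \sqrt{\frac{100}{10}} \approx 3.16 .
$$

The simulated ratio, $0.307 / 0.098 \approx 3.13$, is close to this.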

Perhaps an even clearer demonstration of why you shouldn’t change the resample size comes from setting n <- 1000000 in the code, which shrinks the bootstrap standard error to almost zero even though the data still contain only 100 observations.
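To see how the bootstrap standard error follows the resample size rather than the amount of data, the experiment can be repeated over several values of n. The following is only a rough sketch: `boot_se` is a hypothetical helper written for this illustration, and the largest resample size is 1000 rather than 1000000 so that it runs quickly.

```r
# Sketch: bootstrap SE of the mean for several resample sizes,
# compared with sd(x) / sqrt(n). Names here are illustrative only.
set.seed(2023)
N <- 100   # size of the original sample
B <- 1000  # number of bootstrap resamples
x <- rnorm(N, 0, 1)

# Hypothetical helper: SD of B resampled means, each from n draws of x
boot_se <- function(x, n, B) {
  sd(replicate(B, mean(sample(x, n, replace = TRUE))))
}

sizes <- c(10, 100, 1000)
results <- data.frame(
  n       = sizes,
  boot_se = sapply(sizes, function(n) boot_se(x, n, B)),
  formula = sd(x) / sqrt(sizes)  # what the resample size alone predicts
)
print(results)
```

The `boot_se` column should fall as n grows and stay close to the `formula` column, even though the same 100 observations sit behind every row. Only the row with n equal to N says anything about the uncertainty in the mean of the original sample.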

Dave