For example, can I take samples of size ten, with replacement, from a sample of size 100? I'm trying to teach my high school statistics students about bootstrapping, and I want to use M&M's candy. Taking multiple resamples of size 100 from the original sample would take a very long time, so I wanted to see whether I can have them draw resamples of size 10, with replacement, from the original sample of 100. Is that still bootstrapping?
No. The main aim of the bootstrap is to estimate uncertainty, which is heavily affected by the sample size. Thus, it is important to keep the sample size fixed. – Michael M Dec 10 '23 at 14:26
Does this answer your question? Can we use bootstrap samples that are smaller than original sample? – J-J-J Dec 10 '23 at 15:29
1 Answer
Try it and see what happens.
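set.seed(2023)       # for reproducibility
N <- 100             # original sample size
B <- 1000            # number of bootstrap resamples
n <- 10              # reduced resample size
x <- rnorm(N, 0, 1)  # the "original sample"
xbar100 <- xbar10 <- rep(NA, B)
for (i in 1:B) {
  xbar100[i] <- mean(sample(x, N, replace = TRUE))  # resample at the original size
  xbar10[i] <- mean(sample(x, n, replace = TRUE))   # resample only ten values
}
sd(xbar100) # 0.0981310715978676
sd(xbar10) # 0.307451079371199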
While this isn’t the best way to calculate bootstrap standard errors, it shows the problem. Bootstrapping with the original sample size gives a standard error close to the true value of $0.1$, while resampling only ten observations gives a standard error about three times larger. The smaller resamples describe the uncertainty of a mean of ten observations, not of the 100 you actually have.
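To see roughly where the factor of three comes from, here is a short sketch using the usual plug-in approximation. The symbols are my own notation: $\hat\sigma$ is the standard deviation of the observed data and $n$ is the resample size. A mean of $n$ values drawn with replacement from the observed data has standard deviation

$$\operatorname{SE}_n \approx \frac{\hat\sigma}{\sqrt{n}}, \qquad \text{so} \qquad \frac{\operatorname{SE}_{10}}{\operatorname{SE}_{100}} \approx \sqrt{\frac{100}{10}} = \sqrt{10} \approx 3.16 .$$

The simulated ratio, $0.3075 / 0.0981 \approx 3.13$, agrees with this up to simulation noise.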
Perhaps a more striking demonstration of why you shouldn’t change the sample size is to set n <- 1000000 in the code, which shrinks the standard error to almost zero even though you still have only 100 observations.
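If you want to see the whole pattern at once, something like the following could work. It is only a sketch: `boot_se` is a helper name I made up, and the resample sizes are kept modest (10, 100, 1000) so it runs quickly rather than going all the way to a million. It compares the simulated standard error at each resample size with the plug-in value $\hat\sigma/\sqrt{n}$.

```r
# Sketch: bootstrap standard error of the mean for several resample sizes.
# boot_se is a hypothetical helper, not part of the code above.
set.seed(2023)
N <- 100
B <- 1000
x <- rnorm(N, 0, 1)

boot_se <- function(x, n, B) {
  # Draw B resamples of size n with replacement; return the SD of their means
  sd(replicate(B, mean(sample(x, n, replace = TRUE))))
}

sizes <- c(10, 100, 1000)
se_boot <- sapply(sizes, function(n) boot_se(x, n, B))

# Plug-in value: sd() divides by N - 1, so rescale to the divide-by-N SD
# that resampling from the observed data actually has
se_theory <- sd(x) * sqrt((N - 1) / N) / sqrt(sizes)

print(data.frame(n = sizes, se_boot = se_boot, se_theory = se_theory))
```

The standard error tracks the resample size you choose, not the amount of data you collected, which is why only the row with n equal to N says anything about the uncertainty in your actual sample.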
Dave