I have been reading about bootstrapping and sampling distributions, and I find it odd that people use these techniques to describe uncertainty.

As I understand it, bootstrapping approximates the sampling distribution, which shows the uncertainty in a statistic measured on a sample, by resampling that sample with replacement.

So if you bootstrap your sample, what you're really getting is a statistic of your statistic, with an uncertainty bound.
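To make that concrete, here is a rough sketch of the procedure as I understand it (Python with NumPy; the exponential data and the choice of the median as the statistic are arbitrary, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=100)  # the one sample we actually observe

n_boot = 5000
boot_stats = np.empty(n_boot)
for b in range(n_boot):
    # resample the original sample with replacement, at the original size
    resample = rng.choice(sample, size=sample.size, replace=True)
    boot_stats[b] = np.median(resample)

print("statistic on the original sample:", np.median(sample))
print("bootstrap standard error:        ", boot_stats.std(ddof=1))
print("95% percentile interval:        ", np.percentile(boot_stats, [2.5, 97.5]))
```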

But surely we only care about the original statistic because it's an estimate of the population parameter. We want to know the original statistic's uncertainty bound, because it tells us something about the population. So, why do we care about the uncertainty bound of a statistic of the original statistic?

How does that tell us something about the population parameter?

There is a lay-person answer to this question already here:

Explaining to laypeople why bootstrapping works

My question is an extension of the above. I would like to know at a deeper mathematical level how this works, preferably with empirical evidence.
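To show the kind of empirical evidence I mean: in a simulation the population is known, so the true sampling distribution of a statistic can be generated directly and compared with the bootstrap distribution built from one single sample. A sketch along those lines (again with an arbitrary exponential population and the median, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
scale, n, reps = 2.0, 100, 5000

# True sampling distribution: many fresh samples drawn from the known population
true_stats = np.array([np.median(rng.exponential(scale, n)) for _ in range(reps)])

# Bootstrap distribution: resample a single observed sample with replacement
sample = rng.exponential(scale, n)
boot_stats = np.array([np.median(rng.choice(sample, n, replace=True))
                       for _ in range(reps)])

print("true standard error of the statistic:", true_stats.std(ddof=1))
print("bootstrap estimate of that SE:       ", boot_stats.std(ddof=1))
```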

Connor
  • Bootstrapping is a way of getting a sampling distribution of your statistic, particularly when there is no analytic solution to this. – Peter Flom Dec 06 '23 at 13:20
  • Does this answer your Q: https://stats.stackexchange.com/questions/26088/explaining-to-laypeople-why-bootstrapping-works – kjetil b halvorsen Dec 06 '23 at 13:24
  • @PeterFlom indeed, but why bother? That's my question. Apologies if it's unclear, I can restate later today if so. – Connor Dec 06 '23 at 13:59
  • @kjetilbhalvorsen partially. I want a deeper answer than that given in that question and answer though. I've edited my question to reflect this. Let me know if it's acceptably different. If not, I'll rethink and restate. – Connor Dec 06 '23 at 13:59
  • 1
    Why bother? Well, because it's one of the key things to find out. You want a parameter estimate, and you want some sort of sense of how good your estimate is. Sometimes, there's no analytic solution. – Peter Flom Dec 06 '23 at 14:19
  • @PeterFlom So, you're estimating the uncertainty in your statistic, in the hope this gives you a realistic bound on your population parameter? I can understand that idea, but I'd still like to see it explained theoretically or proven empirically. I can't find any resources though, do you know of any? – Connor Dec 06 '23 at 14:44
  • 2
    Any book on bootstrapping with Efron among the authors will give you the mathematical background. A good keyword for searching is "plug-in estimator." – whuber Dec 06 '23 at 14:54
  • 1
    Thank you! @whuber – Connor Dec 06 '23 at 15:21
  • 1
    One of the comments I liked when I first learned about bootstrap was, “When you can’t go back to the true distribution, a representative empirical distribution is the next-best option.” – Dave Dec 07 '23 at 00:21
  • "So if you bootstrap your sample, what you're really getting is a statistic of your statistic, with an uncertainty bound... So, why do we care about the uncertainty bound of a statistic of the original statistic?" No -- you're getting an estimated uncertainty bound on the original statistic. You're not getting an uncertainty bound on some 2nd statistic. Put another way: there is 1. the true pop parameter, and 2. the true standard error from the sampling distribution for your statistic across many samples. 1. Your original-sample stat estimates the parameter, and 2. bootstrap estimates the SE. – civilstat Dec 07 '23 at 01:06

0 Answers