Questions tagged [bootstrap]

The bootstrap is a resampling method to estimate the sampling distribution of a statistic.

The bootstrap is a technique for estimating the sampling distribution of a statistic. It works by resampling from a dataset (typically with replacement), recomputing the statistic on each resampled dataset, and comparing those estimates to the (known) value computed on the original dataset. There are many variants of bootstrapping used in specialized analyses.
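As a rough illustration of the resample, re-estimate, and compare steps described above, here is a minimal sketch of the ordinary nonparametric bootstrap for a sample mean. The function name `bootstrap_distribution`, the simulated exponential data, the seeds, and the choice of 2000 replicates are all assumptions made for the example, not part of the tag description.

```python
import numpy as np


def bootstrap_distribution(data, statistic, n_replicates=2000, seed=0):
    """Approximate the sampling distribution of `statistic` by resampling.

    Hypothetical helper for illustration: draws `n_replicates` resamples of
    the same size as `data`, with replacement, and recomputes the statistic
    on each one.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    estimates = np.empty(n_replicates)
    for b in range(n_replicates):
        # Indices drawn with replacement, so some observations repeat
        # and others are left out of this resample.
        idx = rng.integers(0, n, size=n)
        estimates[b] = statistic(data[idx])
    return estimates


# Simulated data (an assumption; any 1-D sample would do).
sample = np.random.default_rng(42).exponential(scale=2.0, size=50)

boot_means = bootstrap_distribution(sample, np.mean)
observed = sample.mean()

# Compare the resampled estimates with the value from the original data.
std_error = boot_means.std(ddof=1)
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

print(f"observed mean: {observed:.3f}")
print(f"bootstrap standard error: {std_error:.3f}")
print(f"95% percentile interval: ({ci_low:.3f}, {ci_high:.3f})")
```

The percentile interval used here is only one of several bootstrap intervals. Fixing `seed` makes the output reproducible; with a different seed the standard error and interval will shift slightly because of Monte Carlo error.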

For an extensive review of the bootstrap see:

  • Horowitz, J.L. (2001) "The Bootstrap", in: J.J. Heckman & E.E. Leamer (eds.), Handbook of Econometrics, Edition 1, Vol. 5, Chapter 52, pp. 3159-3228
  • Efron, B. and Tibshirani, R.J. (1994) "An Introduction to the Bootstrap", Chapman & Hall/CRC Monographs on Statistics & Applied Probability
1968 questions
145
votes
5 answers

What is the .632+ rule in bootstrapping?

Here @gung makes reference to the .632+ rule. A quick Google search doesn't yield an easy to understand answer as to what this rule means and for what purpose it is used. Would someone please elucidate the .632+ rule?
russellpierce
  • 18,599
69
votes
6 answers

Why on average does each bootstrap sample contain roughly two thirds of observations?

I have run across the assertion that each bootstrap sample (or bagged tree) will contain on average approximately $2/3$ of the observations. I understand that the chance of not being selected in any of $n$ draws from $n$ samples with replacement is…
xyzzy
  • 983
10
votes
1 answer

What is iterative bootstrap? How is it used?

I have recently stumbled across a mention of "double/triple bootstrap" or "iterative bootstrap". As I understand, each bootstrap sample is bootstrapped again. What is the point? How is it used?
Max
  • 281
10
votes
2 answers

Can I use bootstrapping? Why or why not?

I am currently working on biomass estimates using satellite imagery. I'll quickly define the background of my question, and then explain the statistical question I am working on. Background Problem I am trying to estimate biomass over an area in…
Thomas C.
  • 101
9
votes
2 answers

Adaptively selecting the number of bootstrap replicates

As with most Monte Carlo methods, the rule for bootstrapping is that the larger the number of replicates, the lower the Monte Carlo error. But there are diminishing returns, so it doesn't make sense to run as many replicates as you possibly…
Kodiologist
  • 20,116
9
votes
1 answer

Subsample bootstrapping

I have been working on the uncertainty associated with a quantity calculated from a Monte Carlo project. Normally I would use the bootstrap method by resampling with replacement, but for a couple of technical reasons that is not particularly easy here.…
Bowler
  • 1,191
9
votes
1 answer

Bootstrapping with a small number of observations

Let's say I've collected a small number (N) of observations for a hypothesis that I'd like to test. I could use the bootstrap method to produce a sampling distribution for the mean result of N observations, but I'm concerned that this model could…
G__
  • 193
8
votes
1 answer

Bootstrap method: downsides

Can you tell me when the bootstrap method doesn't work? I know that outliers could be a problem, but is there any particular distribution for which it doesn't work?
Odina
  • 81
7
votes
2 answers

Why do bootstrap standard errors and 95% confidence intervals change each time I rerun the analysis?

I was using the Respondent-Driven Sampling Analysis Tool (RDSAT) to get bootstrap confidence intervals. But each time I reran the analysis, I noticed that the bootstrap standard errors and confidence intervals changed a little bit. Is this normal…
Sophiex
  • 71
7
votes
3 answers

Why does non-parametric bootstrap not return the same sample over and over again?

Why does non-parametric bootstrap not return the same sample over and over again? My notes say: Assume data $X_1,...,X_n$. Sample data with replacement to produce $X_1^{(p)},...,X_n^{(p)}$. Now since both are length $n$, then how does this not…
mavavilj
  • 4,109
7
votes
1 answer

Bootstrapped regression with total data or bootstrap with matched data?

I'm investigating the effect of a continuous variable A on a measurement variable M stratified by another factor variable C in an observational dataset. Due to heteroscedasticity I decided to use a bootstrapped regression analysis. However looking…
Misha
  • 1,323
7
votes
1 answer

"Smoothness" of a statistic for bootstrapping?

I was wondering if anyone could explain what is meant by saying a statistic is not 'smooth'. For example, in 2.6.2 p. 41 of Davison and Hinkley, they talk about statistics that "depend on the sample in an unsmooth or unstable way such that…
Bee
  • 103
7
votes
1 answer

Are there problems with arbitrary application of bootstrap?

Suppose I have a statistic (say a price index) and I want to obtain standard errors for it. I have heard that blind application of the bootstrap may not be good practice. If true 1- What could go wrong if I just apply nonparametric bootstrap and…
user41838
  • 531
6
votes
2 answers

Why bootstrapping?

I understood that bootstrapping is a technique used to estimate statistics of a population. In bootstrapping we take many samples of chosen size, estimate statistics and obtain the mean of these statistics. This mean is representative of the whole…
vamsi
  • 211
5
votes
1 answer

Through an example: what is parametric and non-parametric bootstrap?

I cannot believe how abstractly some sources explain this, practically not explaining it at all. So what's parametric and non-parametric bootstrap and how are they different?
mavavilj
  • 4,109