9

Let's say I've collected a small number (N) of observations for a hypothesis that I'd like to test. I could use the bootstrap method to produce a sampling distribution for the mean of N observations, but I'm concerned that this approach could break down when N gets very small, introducing error into the sampling distribution itself.

So my question is: how can I determine the minimum N that I need for reasonable results? Or, more quantitatively, how is N tied to the sampling error as N -> 0?

Update: I am coming to understand that the minimum value for N will vary based on the nature of the underlying data. So, in this case what meta-observations can I make to help me determine this? I don't know the true underlying distribution, or else I wouldn't need to bootstrap.

G__
  • 1
    I've seen an interesting comment in Prof. Wasserman's lecture notes at http://www.stat.cmu.edu/~larry/=stat705/Lecture13.pdf. The notation next to equation (21) on p. 6 suggests that the error you're concerned with falls off as 1/sqrt(n). Unfortunately, I don't know anything about the constant coefficient. – max Mar 26 '12 at 07:18

1 Answer

7

There is not a straightforward answer to this, as it will always depend on both the true distribution of your data (imagine the degenerate case where the only value allowed is 1: then a bootstrap from a sample of size 1 will be as good as anything!) and the statistic you are going to calculate: some statistics will have more trouble recovering from a small sample size than others (imagine a resampling of an extreme outlier).

So: you're going to have to be more specific than what you've given us thus far.
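One way to see this dependence for yourself is a small simulation: pick a few candidate distributions you consider plausible for your data, draw many samples of size N from each, and check how often a 95% percentile-bootstrap interval for the mean actually contains the true mean. The sketch below is only an illustration under assumptions of mine: the two candidate distributions (standard normal and lognormal), the sample sizes, and the helper names `bootstrap_ci_mean` and `coverage` are all hypothetical choices, not something taken from the question.

```python
import numpy as np


def bootstrap_ci_mean(sample, n_boot=1000, alpha=0.05, rng=None):
    """Percentile-bootstrap confidence interval for the mean of `sample`."""
    rng = np.random.default_rng() if rng is None else rng
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    # Resample with replacement: each row is one bootstrap replicate.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = sample[idx].mean(axis=1)
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


def coverage(draw, true_mean, n, n_trials=500, n_boot=1000, seed=0):
    """Fraction of trials in which the bootstrap CI contains `true_mean`.

    `draw(rng, n)` must return n observations from the assumed distribution.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        lo, hi = bootstrap_ci_mean(draw(rng, n), n_boot=n_boot, rng=rng)
        hits += int(lo <= true_mean <= hi)
    return hits / n_trials


# Assumed candidate distributions, chosen only for illustration.
def draw_normal(rng, n):
    return rng.normal(0.0, 1.0, size=n)        # true mean 0


def draw_lognormal(rng, n):
    return rng.lognormal(0.0, 1.0, size=n)     # true mean exp(0.5)


if __name__ == "__main__":
    for n in (3, 5, 10, 30, 100):
        c_norm = coverage(draw_normal, 0.0, n)
        c_logn = coverage(draw_lognormal, float(np.exp(0.5)), n)
        print(f"N={n:4d}  normal: {c_norm:.3f}  lognormal: {c_logn:.3f}")
```

The nominal coverage here is 0.95, so a value well below that at a given N suggests the bootstrap interval is unreliable for that combination of distribution and statistic. Since you don't know the true distribution, this is a sensitivity check over plausible candidates rather than a way to compute a single minimum N.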

Nick Sabbe
  • 1
    Can one make inferences about the true distribution based on the observations, perhaps using the variance of the observations? The extreme outlier case is tough, but if you've seen one then that carries a lot of information. If we revise the question to specify N > 2, then already the second observation tells us something if N1=N2 vs N1 != N2 (and what the difference between them is). – G__ Aug 16 '11 at 13:57
  • Bootstrapping of extremes does not work, period. – kjetil b halvorsen Feb 16 '18 at 15:18