
I just learned about bootstrapping as a method for dealing with small samples (n<30), which is a major issue with my bioarchaeological data. Here is my code and output for bootstrapping a sample proportion (n=3) to get a 95% CI. Did I do it right?

> library(boot)
> CrSA <- c(0, 1, 1)
> CrSAmean <- function(x, d) { return(mean(x[d])) }
> boot(data=CrSA, statistic=CrSAmean, R=500)

ORDINARY NONPARAMETRIC BOOTSTRAP

Call: boot(data = CrSA, statistic = CrSAmean, R = 500)

Bootstrap Statistics :
     original      bias    std. error
t1* 0.6666667 -0.01533333   0.2807337

> boot.mean<-boot(data=CrSA,statistic=CrSAmean,R=500)
> boot.ci(boot.out=boot.mean,type="norm")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 500 bootstrap replicates

CALL : boot.ci(boot.out = boot.mean, type = "norm")

Intervals :
Level      Normal
95%   ( 0.1284,  1.2116 )

Calculations and Intervals on Original Scale

  • The justification for the bootstrap is asymptotic. In small samples, it can exhibit problematic behavior and is therefore not a "remedy" for small samples. But it may be fine in your specific case. For more on this topic, see here or here. – COOLSerdash Dec 02 '20 at 13:37

1 Answer


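If I understand correctly, you want to estimate the proportion of "successes" from $n=3$ trials. With such a small sample size, I'm not sure how the bootstrap or any non-parametric method can be useful... What if, for example, your vector CrSA is [0, 0, 0] or [1, 1, 1]? Every resample would then be identical to the original data, so the bootstrap would report a standard error of zero and an interval of zero width.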
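To make the concern concrete, here is a minimal sketch (the helper names `exact_boot` and `boot_se` are mine, not part of any package). For 0/1 data, the number of ones in a resample is Binomial($n$, observed proportion), so the whole bootstrap distribution of the mean can be written down without simulation:

```r
# Exact bootstrap distribution of the mean for a vector of 0s and 1s.
# Assumes x contains only 0/1 values; exact_boot and boot_se are
# illustrative helper names, not functions from the boot package.
exact_boot <- function(x) {
  n <- length(x)
  k <- 0:n                                  # possible counts of ones in a resample
  data.frame(mean = k / n,
             prob = dbinom(k, size = n, prob = mean(x)))
}

# Standard deviation of that distribution (the ideal bootstrap standard error)
boot_se <- function(d) {
  m <- sum(d$mean * d$prob)
  sqrt(sum(d$prob * (d$mean - m)^2))
}

d <- exact_boot(c(0, 1, 1))
d                                # only four possible means: 0, 1/3, 2/3, 1
boot_se(d)                       # sqrt(2/27), about 0.272
boot_se(exact_boot(c(1, 1, 1)))  # 0: the resamples never vary
```

With only four attainable values, a normal approximation to this distribution is rough at best, which is probably why the interval in the question reaches past 1.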

In my opinion, it would be better to ask which range of proportions is compatible with the observations, assuming your vector of 1s and 0s comes from a binomial distribution. For this you could use the binom.test function in R:

# 0 successes out of 3 trials:
binom.test(x= 0, n= 3)$conf.int
[1] 0.0000000 0.7075982
attr(,"conf.level")
[1] 0.95

1 success; 3 trials:

binom.test(x= 1, n= 3)$conf.int
[1] 0.008403759 0.905700676
attr(,"conf.level")
[1] 0.95

But obviously, with $n=3$ you get a very broad range.
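For reference, the interval that binom.test reports is the Clopper-Pearson ("exact") interval, which can be written in terms of beta quantiles. The sketch below uses only base R; `clopper_pearson` is a name I made up for illustration. It also covers the data in the question, which has 2 successes out of 3:

```r
# Clopper-Pearson interval from beta quantiles.
# clopper_pearson is an illustrative helper, not a base R function.
clopper_pearson <- function(x, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  # The bounds are fixed at 0 and 1 when x is at the edge of its range
  lower <- if (x == 0) 0 else qbeta(alpha / 2, x, n - x + 1)
  upper <- if (x == n) 1 else qbeta(1 - alpha / 2, x + 1, n - x)
  c(lower = lower, upper = upper)
}

clopper_pearson(1, 3)               # matches the 1-of-3 interval above
clopper_pearson(2, 3)               # the question's data: 2 successes in 3
binom.test(x = 2, n = 3)$conf.int   # should agree with the line above
```

By symmetry, the 2-of-3 interval mirrors the 1-of-3 one (roughly 0.094 to 0.992), and unlike the normal bootstrap interval it stays inside [0, 1].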

dariober
  • Question for dariober: I thought I read somewhere that the binomial distribution was only valid for n>= 5; doesn't using binom.test violate this 'requirement'? – stevebyers2000 Dec 02 '20 at 14:39
  • @stevebyers2000 the binomial distribution is defined ("valid") for any integer $n$ (number of trials) $\geq 0$ so $n=3$ is ok. Note also that I'm using binom.test only as a shortcut to get the CI of the proportion, I'm not testing any hypothesis. – dariober Dec 02 '20 at 14:50
  • Thank you, dariober. And thank you, COOLSerdash. I know now how to proceed. – stevebyers2000 Dec 02 '20 at 14:56
  • @stevebyers2000 we want a parametric approach specifically because the sample size is so small. These are called the "Clopper-Pearson" or "Exact" binomial confidence intervals. – AdamO Aug 01 '23 at 17:05