How to use boot() and boot.ci() to get a 95% CI for small samples

Question

I just learned about bootstrapping as a method for dealing with small samples (n<30), which is a major issue with my bioarchaeological data. Here is my code and output for bootstrapping a sample proportion (n=3) to get a 95% CI. Did I do it right?

> library(boot)
> CrSA<-c(0,1,1)
> CrSAmean<- function(x, d) {return(mean(x[d])) }
> boot(data=CrSA, statistic=CrSAmean, R=500)
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = CrSA, statistic = CrSAmean, R = 500)
Bootstrap Statistics :
     original      bias    std. error
t1* 0.6666667 -0.01533333   0.2807337
> boot.mean<-boot(data=CrSA,statistic=CrSAmean,R=500)
> boot.ci(boot.out=boot.mean,type="norm")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 500 bootstrap replicates
CALL : 
boot.ci(boot.out = boot.mean, type = "norm")
Intervals : 
Level      Normal

95%   ( 0.1284,  1.2116 )

Calculations and Intervals on Original Scale

The justification for the bootstrap is asymptotic. In small samples, it can exhibit problematic behavior and is therefore not a "remedy" for small samples. But it may be fine in your specific case. For more on this topic, see here or here. — COOLSerdash, Dec 02 '20 at 13:37
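
To make the small-sample concern concrete: with n=3 the bootstrap distribution of the mean can be written down exactly, because the number of 1s in each resample is binomial with size 3 and probability 2/3. The following is only a sketch in base R; the variable names are mine and are not from the question.

```r
# Exact bootstrap distribution of the mean for the question's data
CrSA <- c(0, 1, 1)
n <- length(CrSA)
p_hat <- mean(CrSA)                      # observed proportion, 2/3

# Each resample draws n values with replacement, so the number of 1s
# in a resample is Binomial(n, p_hat); the resample mean is count / n.
boot_means <- (0:n) / n
boot_probs <- dbinom(0:n, size = n, prob = p_hat)
data.frame(mean = boot_means, prob = boot_probs)

# Exact bootstrap standard error (what the R = 500 replicates approximate)
se_exact <- sqrt(sum(boot_probs * (boot_means - p_hat)^2))
se_exact

# Normal-approximation interval built from that standard error
# (no bias correction, so it differs slightly from boot.ci's "norm" interval)
normal_ci <- p_hat + c(-1, 1) * qnorm(0.975) * se_exact
normal_ci
```

The resampled mean can only take the four values 0, 1/3, 2/3 and 1, and the upper limit of the normal interval lands above 1, which is not a possible value for a proportion.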

score 3 · Answer 1 · answered Dec 02 '20 at 14:20

If I understand correctly, you want to estimate the proportion of "successes" from $n=3$ trials. With such a small sample size, I'm not sure how bootstrap or any non-parametric method can be useful... What if, for example, your vector CrSA is [0, 0, 0] or [1, 1, 1]?
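
A quick way to see the problem with an all-zero (or all-one) sample is to resample it by hand. This is a minimal sketch in base R rather than a call to boot(); the object names and the seed are illustrative assumptions.

```r
# Bootstrap of an all-zero sample: every resample is identical
set.seed(1)                      # seed only for reproducibility
all_zero <- c(0, 0, 0)

# Resample with replacement and recompute the mean, 500 times
boot_means <- replicate(500, mean(sample(all_zero, replace = TRUE)))

unique(boot_means)   # a single value: 0
sd(boot_means)       # 0, so the resamples carry no information about uncertainty
```

With zero spread among the resampled means, any interval built from them has zero width, even though three failures in three trials clearly do not prove the true proportion is 0.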

In my opinion, it would be better to ask which range of proportions is compatible with the observations, assuming your vector of 1s and 0s comes from a binomial distribution. For this you could use the binom.test function in R:

# 0 successes out of 3 trials:
binom.test(x= 0, n= 3)$conf.int
[1] 0.0000000 0.7075982
attr(,"conf.level")
[1] 0.95
# 1 success out of 3 trials:
binom.test(x= 1, n= 3)$conf.int
[1] 0.008403759 0.905700676
attr(,"conf.level")
[1] 0.95

But obviously, with $n=3$ you get a very broad range.
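
For the data in the question itself (2 successes out of 3), the same call can be fed directly from the vector. This is a sketch that assumes CrSA is coded as 0/1 exactly as posted.

```r
# The question's data: 2 successes out of 3 trials
CrSA <- c(0, 1, 1)

# sum() counts the 1s, length() gives the number of trials
ci <- binom.test(x = sum(CrSA), n = length(CrSA))$conf.int
ci
```

By symmetry with the 1-out-of-3 case above, the limits should be one minus those limits in reverse order, roughly 0.094 to 0.992. Unlike the normal bootstrap interval in the question, this interval stays inside [0, 1].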

dariober

Question for dariober: I thought I read somewhere that the binomial distribution was only valid for n >= 5; doesn't using binom.test violate this 'requirement'? – stevebyers2000 Dec 02 '20 at 14:39

@stevebyers2000 the binomial distribution is defined ("valid") for any integer $n$ (number of trials) $\geq 0$ so $n=3$ is ok. Note also that I'm using binom.test only as a shortcut to get the CI of the proportion, I'm not testing any hypothesis. – dariober Dec 02 '20 at 14:50
Thank you, dariober. And thank you, COOLSerdash. I know now how to proceed. – stevebyers2000 Dec 02 '20 at 14:56

@stevebyers2000 we want a parametric approach specifically because the sample size is so small. These are called the "Clopper-Pearson" or "Exact" binomial confidence intervals. – AdamO Aug 01 '23 at 17:05
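
For reference, the Clopper-Pearson limits can also be computed from beta quantiles, which shows what binom.test is returning. The helper clopper_pearson below is a hypothetical name written for illustration; it is not part of base R or of the answers above.

```r
# Clopper-Pearson ("exact") interval from beta quantiles
# x = number of successes, n = number of trials
clopper_pearson <- function(x, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  # The lower limit is 0 when there are no successes,
  # the upper limit is 1 when there are no failures
  lower <- if (x == 0) 0 else qbeta(alpha / 2, x, n - x + 1)
  upper <- if (x == n) 1 else qbeta(1 - alpha / 2, x + 1, n - x)
  c(lower = lower, upper = upper)
}

clopper_pearson(0, 3)
clopper_pearson(1, 3)

# Should agree with binom.test up to numerical tolerance
all.equal(unname(clopper_pearson(1, 3)),
          as.numeric(binom.test(x = 1, n = 3)$conf.int))
```

The two calls should reproduce the intervals shown in the answer for 0 and 1 successes out of 3.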
