I'm studying confidence intervals, and I'm curious about how one might generate a confidence interval for the confidence interval, if that even makes sense.

For example, let's say I draw simple random samples of n=100 from some population, calculate sample means and standard deviations, and construct 95% confidence intervals. I repeat this procedure 100 times. I know I expect about 95 of these intervals to capture the population mean, and about 5 of them not to. However, can I construct a confidence interval around this expectation? If I were to repeat this entire "100 samples of size 100" procedure over and over again, what can I say about the distribution of how often the intervals capture the mean?

Essentially, could I construct a confidence interval for the confidence interval? Would that even make any sense?

Thanks!

  • In that situation, wouldn't you take the average of all confidence intervals to improve your estimate? As a theoretical answer to your question, you can always create a confidence interval as long as you can (1) make an assumption about the underlying distribution, (2) have a mean, (3) have a variance, (4) have a confidence level. – Jean-Paul Feb 07 '16 at 22:25
  • and it's turtles all the way down – rep_ho Feb 07 '16 at 23:50
  • I think what you essentially want to do is form a confidence interval for the coverage probability. The coverage probability is the probability that any one confidence interval contains the population mean. – Mark L. Stone Feb 08 '16 at 01:44
  • (continued) Presuming you have M i.i.d. replications of forming confidence intervals based on a sample size of n = 100, you can form the confidence interval for the coverage probability from the binomial fraction of intervals that contain the population mean. – Mark L. Stone Feb 08 '16 at 01:50
  • Mark, that makes a ton of sense, and I think it's similar to Glen's answer below. Right? – Guy Davidson Feb 09 '16 at 20:23

1 Answer

what can I say about the distribution of how often the intervals capture the mean?

Treating whether each interval contains the parameter as an independent Bernoulli trial with some coverage probability $p$, the number of "coverages" should be $\text{Binomial}(n,p)$, where $n$ here is the number of intervals constructed (not the size of each sample).

The potential problem is whether the $p$ one actually has is really the $p$ one was hoping for (whether due to the extent of the failure of assumptions or because of approximations involved in obtaining the intervals).
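As a rough illustration of the binomial model above, here is a minimal sketch that assumes the coverage probability really is $p = 0.95$ and that the 100 intervals are independent. The function names (`binom_pmf`, `binom_cdf`, `central_range`) are made up for this example. It finds a range of coverage counts that leaves at most 2.5% probability in each tail.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0 for k < 0."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def central_range(n, p, level=0.95):
    """Range [k_lo, k_hi] with at most (1 - level)/2 probability in each tail."""
    alpha = (1 - level) / 2
    k_lo = 0
    # Move up while the lower tail P(X <= k_lo) still fits inside alpha.
    while binom_cdf(k_lo, n, p) <= alpha:
        k_lo += 1
    k_hi = n
    # Move down while the upper tail P(X >= k_hi) still fits inside alpha.
    while 1 - binom_cdf(k_hi - 1, n, p) <= alpha:
        k_hi -= 1
    return k_lo, k_hi

if __name__ == "__main__":
    n_intervals, p = 100, 0.95  # assumed: nominal coverage holds exactly
    lo, hi = central_range(n_intervals, p)
    prob = binom_cdf(hi, n_intervals, p) - binom_cdf(lo - 1, n_intervals, p)
    print(f"counts {lo}..{hi} hold probability {prob:.4f}")
```

Because the count is discrete, the range generally holds more than 95% probability rather than exactly 95%.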

Glen_b
  • That makes sense. Would I be justified in using something like this? https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval (looks like I can't line break in comments, which is a little awkward). – Guy Davidson Feb 09 '16 at 20:23
  • Which, if I wanted a 95% CI, would be 0.95 +/- 1.96 * sqrt(0.95 * 0.05 / 100) => 0.95 +/- 0.0427, which essentially means about 90.7% to 99.3% of the time. Interesting! – Guy Davidson Feb 09 '16 at 20:33
  • @Guy I wouldn't use the normal approximation for that; with 100 intervals, even if all the assumptions held, the expected number outside is a small count (binomial but well approximated by a Poisson(5) ... so pretty discrete)... – Glen_b Feb 09 '16 at 23:19
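
To make the contrast in these comments concrete, the sketch below computes the normal-approximation (Wald) interval from the comment above next to an exact binomial interval. The exact method shown is Clopper-Pearson, which is an assumption on my part: it is one common alternative, not something the comments name. The function names are hypothetical, and the observed count of 95 covering intervals out of 100 is just the example figure from the thread.

```python
from math import comb, sqrt

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0 for k < 0."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def wald_interval(x, n, z=1.96):
    """Normal-approximation interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

def clopper_pearson(x, n, alpha=0.05):
    """Exact (Clopper-Pearson) interval, found by bisection on the binomial CDF."""
    def solve(g, target):
        # g must be increasing in p on [0, 1]; find p with g(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if g(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: P(X <= x | p) = alpha / 2, i.e. P(X > x | p) = 1 - alpha / 2.
    upper = 1.0 if x == n else solve(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper

if __name__ == "__main__":
    x, n = 95, 100  # example: 95 of 100 intervals covered the mean
    print("Wald:           ", wald_interval(x, n))
    print("Clopper-Pearson:", clopper_pearson(x, n))
```

The Wald interval is symmetric around 0.95 by construction, while the exact interval is not, which reflects the skewness of the binomial when $p$ is close to 1.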