6

If I have $k$ successes in $n$ Bernoulli trials, does the parameter $p$ of the binomial distribution follow a well-known distribution? There are several methods for calculating confidence intervals for $p$; I'm interested in the distribution for the exact method.

Penz
  • 235

2 Answers

14

From a Bayesian point of view, the distribution of $p$ given $k$ observed successes in $n$ trials is the Beta distribution; in detail, $p\sim Beta(\alpha,\beta)$ with $\alpha=k+1$ and $\beta=n-k+1$. It represents the unnormalized density $prob(p|data)$, i.e. the unnormalized probability that the unknown parameter is $p$ given the data (successes and trials) you have seen so far.

Edit: Let $n$ be arbitrary but fixed. Then the posterior density can be derived via Bayes' theorem: $prob(p|k)=\frac{prob(k|p)\cdot prob(p)}{prob(k)}\propto prob(k|p)\propto p^k(1-p)^{n-k}$. A uniform prior $prob(p)$ is assumed here, and the normalizing constant $prob(k)$ is skipped since it does not depend on $p$; hence "unnormalized". The distribution of $p$ given $k$ and a fixed $n$ (i.e. $prob(p|k,n)$) is the Beta distribution as specified above.
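
As a sketch of the normalization step (still assuming the uniform prior), the skipped constant follows from the Beta integral:

$$\int_0^1 p^k(1-p)^{n-k}\,dp = B(k+1,\,n-k+1) = \frac{k!\,(n-k)!}{(n+1)!},$$

so the normalized posterior density is

$$prob(p|k,n) = \frac{(n+1)!}{k!\,(n-k)!}\,p^k(1-p)^{n-k}, \qquad 0\le p\le 1.$$

For instance, $k=3$, $n=3$ gives the density $4p^3$, which integrates to 1 over $[0,1]$.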

For example, the R package binom uses the Beta distribution for calculating confidence intervals. See the functions binom.confint and binom.bayes.
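
To make the interval calculation concrete, here is a minimal Python sketch using only the standard library. It is not how the binom package is implemented; the function names are hypothetical, and it assumes the uniform prior and integer $k$ and $n$. It relies on the identity that, for integer parameters, the $Beta(k+1, n-k+1)$ CDF at $x$ equals $P(\text{Binomial}(n+1, x)\ge k+1)$, and inverts that CDF by bisection to get an equal-tailed interval.

```python
from math import comb


def beta_cdf_int(x, k, n):
    """CDF of Beta(k+1, n-k+1) at x, for integer k and n (uniform prior assumed).

    Uses I_x(a, b) = P(Binomial(a+b-1, x) >= a) for integer a and b.
    """
    m = n + 1
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(k + 1, m + 1))


def beta_quantile_int(q, k, n, tol=1e-12):
    """Invert the CDF by bisection; the CDF is increasing in x on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, k, n) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def equal_tailed_interval(k, n, level=0.95):
    """Equal-tailed posterior interval for p (hypothetical helper name)."""
    tail = (1 - level) / 2
    return beta_quantile_int(tail, k, n), beta_quantile_int(1 - tail, k, n)
```

With $k=3$, $n=3$ the posterior is $Beta(4,1)$ with CDF $p^4$, so `equal_tailed_interval(3, 3)` should be close to $(0.025^{1/4}, 0.975^{1/4})\approx(0.398, 0.994)$: a wide range of $p$ remains plausible even though every trial succeeded.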

steffen
  • 10,367
  • 1
    Good answer. I'm wondering what is "unnormalized" about the Beta distribution? Perhaps you're thinking of $p^\alpha(1-p)^\beta dp$ instead? – whuber Jul 19 '11 at 13:57
  • That's it! Is there a way to normalize it? – Penz Jul 20 '11 at 00:40
  • @whuber I updated my answer, I was thinking of the "derivation". – steffen Jul 20 '11 at 07:11
  • @Penz In general there is no need for manual normalization, e.g. for determination of confidence intervals you can use the already available inverse-cdf-functions like "qbeta" in R. – steffen Jul 20 '11 at 07:11
4

The sample proportion $\hat{p}=k/n$ has a scaled Binomial distribution. That is, $k\sim\text{Binomial}(n,p)$, and $\hat{p}$ is $k$ scaled by the sample size $n$. I don't think it has any other name.
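
Writing out the mass function may help: $P(\hat{p}=j/n)=\binom{n}{j}p^j(1-p)^{n-j}$ for $j=0,\dots,n$. The short Python sketch below tabulates it; the function name is hypothetical, and it assumes $n\ge 1$ and a known true $p$.

```python
from fractions import Fraction
from math import comb


def sample_proportion_pmf(n, p):
    """Return {j/n: P(p_hat = j/n)} where the count j ~ Binomial(n, p)."""
    return {
        Fraction(j, n): comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(n + 1)
    }


# The support is the grid 0, 1/n, ..., 1 rather than the integers 0..n.
pmf = sample_proportion_pmf(3, 0.5)
for value, prob in pmf.items():
    print(value, prob)
```

Note that this is the distribution of the estimate $\hat{p}$ for a fixed $p$, not a distribution over $p$ itself.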

Rob Hyndman
  • 56,782
  • My brain melts when I think about this. If k=3, n=3, then p=1. But this seems highly unlikely, given that Pr(p|k,n) is almost certainly non-zero for all p. Many values of p may have generated k and n. – user48956 Sep 30 '16 at 23:13
  • If you write out the density/mass function of the so-called scaled binomial, much insight can be gained. – Chamberlain Mbah Nov 12 '17 at 15:05