
In capture-recapture sampling, we aim to estimate a population size (e.g. of organisms) by capturing a sample of size $ n_1 $, marking them, releasing them, then taking a second sample of size $ n_2 $ (assuming the marked individuals have mixed back into the population) and counting how many of them are marked. If $ m $ out of the $ n_2 $ are marked, we estimate the population size by simply equating proportions, $ \frac{n_1}{N} = \frac{m}{n_2} $, so $ N = \frac{n_1 n_2}{m} $. This technique is good because the estimate of $ N $ is independent of any population growth that may occur between samples.
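As a concrete illustration of the proportion argument, here is a minimal Python sketch of the point estimate (often called the Lincoln-Petersen estimator). The function name and the sample numbers are made up for illustration only.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Point estimate of N from n1/N = m/n2, i.e. N = n1 * n2 / m."""
    if m <= 0:
        raise ValueError("need at least one marked recapture (m > 0)")
    if m > min(n1, n2):
        raise ValueError("m cannot exceed n1 or n2")
    return n1 * n2 / m


# Hypothetical numbers: mark 50, re-sample 40, find 10 of those marked.
estimate = lincoln_petersen(50, 40, 10)
print(estimate)  # 200.0
```

Note that the estimate is undefined when $ m = 0 $, which is why the sketch rejects that case.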

I'm interested in looking at exactly how likely it is that the true population is $ N $, or within some confidence interval. I want to estimate the distribution of $ N $, viewed as a random variable.

I have learned that if we observe $ \alpha - 1 $ 'successes' and $ \beta - 1 $ 'failures' in a binomial model, then the probability $ p $ of observing a success has a Beta distribution: $ p \sim \text{Beta}(\alpha, \beta) $, which is what you get if you apply Bayes' rule to the binomial likelihood with a uniform prior on $ p $, and that the number of successes in $ n $ further trials then has a Beta-Binomial distribution: $ m \sim \text{BetaBin}(n, \alpha, \beta) $. If this understanding is wrong please correct me.
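To make the two distributions concrete, here is a small standard-library sketch of the Beta mean and variance and the Beta-Binomial pmf. It assumes a uniform $ \text{Beta}(1, 1) $ prior on $ p $; the helper names and the example counts are hypothetical.

```python
from math import comb, exp, lgamma, log


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_mean_var(a: float, b: float) -> tuple[float, float]:
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


def betabinom_pmf(k: int, n: int, a: float, b: float) -> float:
    """P(k successes in n trials) when p ~ Beta(a, b)."""
    if k < 0 or k > n:
        return 0.0
    return exp(log(comb(n, k)) + log_beta(k + a, n - k + b) - log_beta(a, b))


# Hypothetical data: 10 successes and 30 failures, uniform prior on p,
# so alpha = 10 + 1 and beta = 30 + 1.
alpha, beta = 11, 31
print(beta_mean_var(alpha, beta))

# The Beta-Binomial pmf over k = 0..n should sum to 1.
print(sum(betabinom_pmf(k, 20, alpha, beta) for k in range(21)))
```

With $ \alpha = \beta = 1 $ (no data at all) the Beta-Binomial reduces to a uniform distribution on $ 0, \dots, n $, which is a quick sanity check on the formula.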

My question is, is it possible to estimate the expectation and variance of the full population by doing this:

  • We know the values of $ n_1 $, $ n_2 $ and $ m $.

  • The prevalence of marked organisms is $ p \sim \text{Beta}(m + 1, n_2 - m + 1) $

  • The conditional distribution of the initial number of marked organisms is $ n | N \sim \text{BetaBin}(N, m + 1, n_2 - m + 1) $.

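  • By Bayes' rule, $$ \begin{aligned} P(N = N | n = n_1) &= \dfrac{P(N = N)}{P(n = n_1)} \times P(n = n_1 | N = N) \\ &= \dfrac{P(N = N)}{\sum_{i=0}^{\infty} P(n = n_1 | N = i) P(N=i)} \times P(n = n_1 | N = N) \\ &= \dfrac{P(n = n_1 | N = N)}{\sum_{i=0}^{\infty} P(n = n_1 | N = i) } \end{aligned} $$ where the last step takes the prior $ P(N = i) $ to be the same for every $ i $, so that it cancels.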

  • We now have the distribution for the population and can compute the mean and variance.
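The steps above can be sketched numerically as follows. This takes the proposed Beta-Binomial model at face value and does not settle whether it is the right model. It assumes a flat prior on $ N $ and truncates the infinite sum at a hypothetical cutoff `n_max`; the function names and the example numbers are illustrative only.

```python
from math import exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_betabinom_pmf(k: int, n: int, a: float, b: float) -> float:
    """log P(k | n) under BetaBin(n, a, b), valid for 0 <= k <= n."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def posterior_over_N(n1: int, n2: int, m: int, n_max: int) -> dict[int, float]:
    """P(N | n = n1) under a flat prior on N, truncated to n1 <= N <= n_max."""
    a, b = m + 1, n2 - m + 1
    support = range(n1, n_max + 1)  # the likelihood is zero for N < n1
    log_w = [log_betabinom_pmf(n1, N, a, b) for N in support]
    top = max(log_w)  # subtract the maximum before exponentiating
    w = [exp(lw - top) for lw in log_w]
    total = sum(w)
    return {N: wi / total for N, wi in zip(support, w)}


def mean_and_variance(post: dict[int, float]) -> tuple[float, float]:
    """Mean and variance of a discrete distribution given as {value: prob}."""
    mean = sum(N * p for N, p in post.items())
    var = sum((N - mean) ** 2 * p for N, p in post.items())
    return mean, var


# Hypothetical numbers: n1 = 50, n2 = 40, m = 10, sum truncated at 5000.
post = posterior_over_N(50, 40, 10, n_max=5000)
print(mean_and_variance(post))
```

The truncation matters: for large $ N $ the likelihood term decays roughly like $ N^{-(m+1)} $, so the normalising sum needs $ m \ge 1 $ to converge, and the mean and variance need larger $ m $ still. For small $ m $ the result would depend heavily on `n_max`.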

Is this even close to a valid method? I am not competent or confident with the maths here and am interested to see how it is done. Thank you.

Nick_2440
  • Your samples from the population are without replacement, so a beta-binomial doesn't seem appropriate. See this answer for a possible approach. Instead of "words", you have organisms, but otherwise the setting seems to be the same. You will have: $s_i=n_i$, $k_1=n_1$, $k_2=n_2-m$, $n=N$. – jblood94 Mar 13 '23 at 18:53
  • I wasn't aware that the beta-binomial model required replacement - so it's not like the binomial in that respect? Thanks for the link though, I'll check it out. – Nick_2440 Mar 14 '23 at 02:52
  • Each time an organism is taken from the population for sampling without replacement, it changes the probability of sampling a marked one for subsequent samples. The effect depends on the size of $n$ and $m$--if they are large, the effect may be small enough to ignore. – jblood94 Mar 14 '23 at 10:20
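To illustrate the without-replacement point raised in the comments, here is a small sketch comparing the hypergeometric pmf (sampling without replacement) with the binomial pmf (sampling with replacement) for the number of marked organisms in the second sample. This is not necessarily the approach in the linked answer; the function names and population sizes are made up for illustration.

```python
from math import comb


def hypergeom_pmf(m: int, N: int, n1: int, n2: int) -> float:
    """P(m marked) in n2 draws without replacement from N, of which n1 are marked."""
    if m < max(0, n2 - (N - n1)) or m > min(n1, n2):
        return 0.0
    return comb(n1, m) * comb(N - n1, n2 - m) / comb(N, n2)


def binom_pmf(m: int, n2: int, p: float) -> float:
    """P(m marked) in n2 draws with replacement, each marked with probability p."""
    return comb(n2, m) * p**m * (1 - p) ** (n2 - m)


def max_gap(N: int, n1: int, n2: int) -> float:
    """Largest pointwise difference between the two pmfs over m = 0..n2."""
    p = n1 / N
    return max(
        abs(hypergeom_pmf(m, N, n1, n2) - binom_pmf(m, n2, p))
        for m in range(n2 + 1)
    )


# Same marked fraction (one half) and same second-sample size,
# but a small population versus a large one.
print(max_gap(20, 10, 8))
print(max_gap(20000, 10000, 8))
```

In this sketch the gap between the two models shrinks as the population grows relative to the second sample, which is the sense in which the effect may become small enough to ignore.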
