Estimate `n` given a sample from a binomial distribution with known `p`

Asked Oct 15 '22 at 18:18

Active Oct 15 '22 at 18:18


I have a large list of $n$ distinct integers drawn uniformly at random from $[0, M)$, with $n \ll M$. $M$ is known, but $n$ isn't. It's too expensive to read all $n$ numbers, but I can read the $k$ smallest numbers in the list for some $k \ll n$. I want to estimate $n$ from this sample.

Let's call those $k$ smallest numbers $x_1, \ldots, x_k$, with $x_1$ the smallest and $x_k$ the largest. Since the numbers are distinct, exactly $k$ of the $n$ numbers are less than or equal to $x_k$.

My first approach was to estimate $n \approx k \frac{M}{x_k}$. This (kind of) works, but in simulations with $n = 1000000$, the margin of error was larger than I'd like.
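To make the size of that error concrete, here is a minimal simulation sketch of the estimator $k \frac{M}{x_k}$. The function names, the seed, and the parameter values are illustrative assumptions, not the setup I actually used. Under the usual approximation for uniform order statistics, the relative spread of this estimator is on the order of $1/\sqrt{k}$ and does not shrink as $n$ grows, so only a larger $k$ narrows it.

```python
import heapq
import random
import statistics


def k_smallest(n, M, k, rng):
    """Draw n distinct integers uniformly from [0, M) and return the k smallest, sorted."""
    values = rng.sample(range(M), n)  # sampling without replacement gives distinct values
    return heapq.nsmallest(k, values)


def estimate_n(x_k, k, M):
    """The estimator from the question: n is roughly k * M / x_k (requires x_k > 0)."""
    return k * M / x_k


def simulate(n, M, k, trials, seed=0):
    """Repeat the experiment and collect one estimate of n per trial."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        xs = k_smallest(n, M, k, rng)
        estimates.append(estimate_n(xs[-1], k, M))  # xs[-1] is x_k
    return estimates


if __name__ == "__main__":
    # Hypothetical parameters, smaller than n = 1000000 to keep the run short.
    n, M, k = 100_000, 2**40, 100
    est = simulate(n, M, k, trials=50)
    rel_sd = statistics.stdev(est) / n
    print(f"mean estimate: {statistics.mean(est):.0f}, relative sd: {rel_sd:.3f}")
```

Rerunning with a few different values of `k` shows how much of the error comes from the sample size alone.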

I then tried to model the number of values at or below $x_k$ as a binomial count, with success probability $p = \frac{x_k}{M}$ and an unknown number of trials $n$. My question is, how can I estimate $n$ knowing $k$ and $p$? Or is there a better way to look at this problem?
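As a sketch of what the binomial view gives, assume $p$ can be treated as fixed (a simplification, since $p = \frac{x_k}{M}$ is itself computed from the data) and compare the likelihood at consecutive values of $n$:

$$
L(n) = \binom{n}{k} p^k (1-p)^{n-k},
\qquad
\frac{L(n)}{L(n-1)} = \frac{n\,(1-p)}{n-k}.
$$

The ratio is at least $1$ exactly when $n(1-p) \ge n-k$, that is, when $n \le \frac{k}{p}$. So the likelihood rises up to $n = \lfloor k/p \rfloor$ and falls afterwards:

$$
\hat{n} = \left\lfloor \frac{k}{p} \right\rfloor = \left\lfloor \frac{kM}{x_k} \right\rfloor .
$$

Under that assumption, the maximum-likelihood estimate is the same as $k \frac{M}{x_k}$ up to rounding, so the binomial framing on its own would not be expected to reduce the error.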

asked Oct 15 '22 at 18:18

jfhr

e.g. `optimize(function(x) -1*(lchoose(x,10) + 10*log(0.1) + (x-10)*log(0.9)), c(0, 10000))` – Ben Bolker Oct 15 '22 at 21:06
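The R one-liner in the comment minimises the negative of the binomial log-likelihood in $n$ for $k = 10$ and $p = 0.1$. A rough Python counterpart, using only the standard library and a plain scan over integer $n$ instead of a continuous optimiser, might look like this. The function names and the search bound `n_max` are assumptions made for illustration.

```python
import math


def binom_loglik(n, k, p):
    """Binomial log-likelihood of n trials, given k successes with probability p."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log1p(-p)


def mle_n(k, p, n_max):
    """Integer n in [k, n_max] with the highest log-likelihood (simple linear scan)."""
    return max(range(k, n_max + 1), key=lambda n: binom_loglik(n, k, p))


if __name__ == "__main__":
    # Same values as in the comment: k = 10 successes, p = 0.1.
    print(mle_n(10, 0.1, 10_000))
```

When $k/p$ is an integer, as it is here, the likelihoods at $k/p - 1$ and $k/p$ are equal in exact arithmetic, so floating-point rounding decides which of the two the scan returns.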
