I am working with an application where I have a grid of cells, and I calculate "concentrations" in each cell by randomly placing a number of "particles" across the grid, and counting the number of particles in each cell. I am then interested in finding the number of cells with concentration less than some threshold concentration.
In the simplest case, I have $N_p$ particles and $N_c$ cells, and each particle is placed independently, with equal probability of landing in any cell. The particle counts in the cells then follow a multinomial distribution with $N_p$ trials and $N_c$ possible outcomes, each with probability $1/N_c$. Using Python, I can for example create a realisation like this:
    import numpy as np
    C = np.random.multinomial(Np, pvals=np.ones(Nc)/Nc)
where C is then an array with Nc elements that sum to Np. In my application, each element of C represents the concentration in a cell.
I am then interested in the number of cells with concentration less than some threshold concentration $C_{lim}$, as a function of $C_{lim}$. It is of course easy to find the result numerically, and I have also found that I can approximate the result quite well with the cumulative distribution function of a Gaussian (scaled by $N_c$ to give a number of cells) with mean $N_p/N_c$ and variance $N_p p (1-p)$, where $p = 1/N_c$ is the probability of a particle landing in a given cell. This makes sense to me, as the concentration in each cell is a random variable with mean $N_p p = N_p/N_c$ and variance $N_p p (1-p)$, and I guess the Central Limit Theorem might be relevant somehow. An example is shown in the figure below.
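For concreteness, here is a minimal sketch of the comparison described above. The values of `Np` and `Nc`, the random seed, the thresholds, and the helper `normal_cdf` are illustrative assumptions, not taken from my actual application. The half-unit shift in the threshold is a continuity correction for integer counts, which is an optional choice and not essential to the approximation.

```python
import math
import numpy as np


def normal_cdf(x, mean, var):
    """CDF of a Gaussian with the given mean and variance (hypothetical helper)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


rng = np.random.default_rng(0)   # fixed seed, only for reproducibility
Np, Nc = 1_000_000, 10_000       # illustrative values (assumption)
p = 1.0 / Nc                     # equal probability for every cell

# One realisation of the particle counts per cell
C = rng.multinomial(Np, np.full(Nc, p))

mean = Np * p
var = Np * p * (1.0 - p)

for C_lim in (80, 90, 100, 110, 120):
    n_sim = int(np.sum(C < C_lim))                      # counted directly
    # C < C_lim for integer counts means C <= C_lim - 1, hence the -0.5
    n_gauss = Nc * normal_cdf(C_lim - 0.5, mean, var)   # Gaussian estimate
    print(C_lim, n_sim, round(n_gauss, 1))
```

The two printed columns should be close whenever $N_p/N_c$ is reasonably large. For small mean counts, the Gaussian is expected to be a poorer match to the discrete, skewed count distribution.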
Now we finally get to my question: Can I find a similar analytical approximation for the number of cells with concentration $C < C_{lim}$ in the case where the particles are not placed with the same probability in each cell? As an example, I have created the figure below, where the particle counts are still drawn from a multinomial distribution, but with a higher probability of particles being placed in the cells near the center. In this example, I used a Gaussian PDF to calculate the probability of each cell based on its distance from the center, and then normalised the probabilities to sum to 1. I am also interested in the more general case where I don't have a nice analytical expression for the probabilities, but where they can, for example, be evaluated numerically from a simulation.
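To make the non-uniform setup concrete, here is a minimal sketch under stated assumptions: the grid size `n_side`, the width `sigma`, `Np`, the seed, and the helper `expected_cells_below` are all hypothetical choices for illustration. The sketch also tries one possible extension of the Gaussian approximation, which may or may not be the best approach: each cell count is marginally binomial with parameters $N_p$ and $p_i$, so by linearity of expectation the expected number of cells below the threshold is the sum over cells of $P(C_i < C_{lim})$, and each term is approximated here by a Gaussian CDF with mean $N_p p_i$ and variance $N_p p_i (1 - p_i)$.

```python
import math
import numpy as np

# Element-wise error function without extra dependencies
erf = np.vectorize(math.erf)


def expected_cells_below(C_lim, Np, pvals):
    """Approximate expected number of cells with count < C_lim.

    Sums a per-cell Gaussian CDF (hypothetical helper; assumes 0 < p_i < 1).
    """
    mean = Np * pvals
    std = np.sqrt(Np * pvals * (1.0 - pvals))
    # -0.5 is a continuity correction for integer counts (optional choice)
    z = (C_lim - 0.5 - mean) / (std * math.sqrt(2.0))
    return float(np.sum(0.5 * (1.0 + erf(z))))


rng = np.random.default_rng(0)   # fixed seed, only for reproducibility
n_side = 50                      # cells per side of a square grid (assumption)
Np = 1_000_000                   # number of particles (assumption)
sigma = 10.0                     # width of the Gaussian, in cells (assumption)

# Cell probabilities from a Gaussian PDF of the distance to the grid center
x = np.arange(n_side) - (n_side - 1) / 2.0
X, Y = np.meshgrid(x, x)
weights = np.exp(-(X**2 + Y**2) / (2.0 * sigma**2))
pvals = (weights / weights.sum()).ravel()   # normalised to sum to 1

# One realisation of the particle counts per cell
C = rng.multinomial(Np, pvals)

for C_lim in (10, 100, 500, 1000, 2000):
    n_sim = int(np.sum(C < C_lim))
    n_approx = expected_cells_below(C_lim, Np, pvals)
    print(C_lim, n_sim, round(n_approx, 1))
```

Since the function only needs the array `pvals`, the same sketch applies when the probabilities come from a numerical simulation instead of an analytical expression. The per-cell Gaussian is likely to be least accurate for cells where $N_p p_i$ is small.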
Any answers, hints, or suggestions of relevant literature are most welcome!
Edit: I found an answer that works, which I posted below, but I would still be happy if anyone has suggestions for relevant literature (books or papers), or other more rigorously presented solutions.




