
We distribute $n$ balls at random into $k$ urns, with equal probability, and record the maximum number of balls falling into any of the urns.

What is the distribution of this maximum?


The question "What is the probability that, throwing $m$ balls at random into $n$ urns, at least one urn contains $c$ elements?" points us to Raab & Steger (1999), which gives some rather complicated asymptotic results. I have briefly skimmed the papers that cite this paper, but it doesn't look like there is much helpful material there.

Even non-asymptotic bounds would be nice to have.
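One cheap non-asymptotic bound (my addition, not from Raab & Steger) comes from the union bound: each urn's count is $\text{Binomial}(n, 1/k)$, so $P(\max \geq c) \leq \min\!\big(1,\, k\, P(\text{Bin}(n, 1/k) \geq c)\big)$. A sketch in R (`p_max_ge_upper` is a name I made up):

```r
# Union-bound upper tail for the maximum urn count:
# each urn count is Binomial(n, 1/k), so
#   P(max >= m) <= min(1, k * P(Bin(n, 1/k) >= m))
p_max_ge_upper <- function(m, n, k) {
  pmin(1, k * pbinom(m - 1, n, 1/k, lower.tail = FALSE))
}

p_max_ge_upper(12, n = 50, k = 10)   # upper bound on P(max >= 12)
```

The bound saturates at 1 near the bulk of the distribution, but tightens quickly in the upper tail, which is where such bounds are usually needed.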


Of course, by the pigeonhole principle, the maximum can't be lower than $\lceil n/k \rceil$, and it can't be higher than $n$. A simulation is easy; here is some R code:

n_balls <- 50
n_urns <- 10
n_sims <- 1e5

Maximum <- replicate(n_sims, max(table(sample(x=1:n_urns, size=n_balls, replace=TRUE))))
hist(Maximum, breaks=seq(min(Maximum)-0.5, max(Maximum)+0.5), freq=FALSE)

[histogram of the simulated maxima]
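For moderate $n$ and $k$, the simulated histogram can also be compared against the exact distribution (my addition, not part of the original question), using the standard Poissonization trick: the urn counts are distributed like $k$ i.i.d. Poisson($n/k$) variables conditioned on summing to $n$, so $P(\max \leq m)$ is a $k$-fold convolution of a truncated Poisson pmf, evaluated at $n$ and divided by $P(\text{Pois}(n) = n)$. A sketch (`p_max_le` is a name I made up):

```r
# Exact CDF of the maximum urn count via Poissonization:
#   (N_1,...,N_k) ~ multinomial(n, each prob 1/k)
#     =d  (X_1,...,X_k) iid Poisson(n/k), conditioned on sum = n,
# so P(max <= m) = P(all X_i <= m and sum X_i = n) / P(Poisson(n) = n).
p_max_le <- function(m, n, k) {
  f <- dpois(0:m, n/k)        # Poisson pmf, truncated (not renormalized) at m
  g <- f
  for (i in seq_len(k - 1)) g <- convolve(g, rev(f), type = "open")  # k-fold convolution
  coef <- if (length(g) >= n + 1) g[n + 1] else 0  # P(all <= m and sum = n)
  coef / dpois(n, n)
}

cdf <- sapply(1:50, function(m) p_max_le(m, 50, 10))
pmf <- diff(c(0, cdf))        # exact pmf of the maximum, to overlay on the histogram
```

The `convolve(g, rev(f), type = "open")` idiom is R's standard polynomial multiplication; for $n = 50$, $k = 10$ the whole computation is instant.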

No, this is not homework (and judging from some searching on the internet, it may well be that the formulas are so unwieldy it would not make for a good homework question). It came up in the context of a related probability question on a mailing list.

Stephan Kolassa

1 Answer


I have an attempt at an estimate here, but I can't make the computations work well, because they involve a deconvolution step that is not very stable:

  • When I decrease the number of urns, the deconvolution with the FFT starts to become problematic.
  • When I increase the number of urns, the solution is not as good as I would have expected. Especially for the Gaussian distribution I would expect a nearly perfect fit. I may have made some small mistakes with the binning and rounding, and possibly a small shift, but I suspect there is some bigger problem.

The principle behind the estimate is

  1. The adaptation of a naïve approach that assumes all the urns to be independent. In that case the distribution of the maximum is the power of the CDF of a single case. $$P(\max(X_i)\leq x) = P(\text{all $X_i \leq x$}) = F(x)^{n_{urns}}$$

  2. The case with the urns is the maximum of a multinomial distribution, which has variances $npq$ and covariances $-np^2$ (with $p = 1/n_{urns}$ and $q = 1-p$). We can relate this to a multivariate normal distribution.

  3. In the case of a multivariate distribution with negative correlation, we can get a multivariate distribution with zero correlation by adding the same normally distributed variable to every component. So $\max(y) \sim \max(x) + z$, where $x$ is our target variable and $\max(y)$ is what we can compute with our naive method; the addition of $z$ is effectively a convolution with a normally distributed variable.

    The idea is then to find the distribution of $\max(x)$ by deconvolving $\max(y)$.

    Note: this mimics the situation of the question "CDF of maximum of $n$ correlated normal random variables", where the correlation is positive and the solution calls for a convolution instead of a deconvolution.

  4. We apply the same deconvolution as with the multivariate normal distribution, but now to the multinomial distribution.
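Steps 2 and 3 can be checked numerically (my addition; it mirrors the answer's later use of `MASS::mvrnorm`): adding a single shared $z \sim N(0, np^2)$ to all components of the negatively correlated Gaussian zeroes the correlations, and $\max(y) = \max(x) + z$ holds exactly draw by draw.

```r
# Check of steps 2-3: adding one shared z ~ N(0, n*p^2) to every component
# of the negatively correlated Gaussian makes the components uncorrelated,
# and max(y) = max(x) + z holds row by row.
set.seed(2)
n_balls <- 50; n_urns <- 10; p <- 1/n_urns
mu    <- rep(n_balls*p, n_urns)
sigma <- matrix(-n_balls*p^2, n_urns, n_urns)   # covariance -n*p^2 off-diagonal
diag(sigma) <- n_balls*p*(1 - p)                # variance n*p*q on the diagonal
x <- MASS::mvrnorm(1e5, mu, sigma)              # target: correlated components
z <- rnorm(1e5, 0, sqrt(n_balls*p^2))           # shared decorrelating variable
y <- x + z                                      # z is recycled down each column
cov(y)[1, 2]                                    # approximately 0
```

The off-diagonal covariance of `y` is $-np^2 + np^2 = 0$ in expectation, which is why the naive independent-urns CDF applies to $\max(y)$.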

[comparison plots: maximum of the multinomial and maximum of the multivariate Gaussian, each with the naive and deconvolved estimates]

set.seed(1)
n_balls <- 50
n_urns <- 10
n_sims <- 1e4

layout(matrix(1:2),2)

Maximum = replicate(n_sims,max(table(sample(x=1:n_urns,size=n_balls,replace=TRUE))))

h1=hist(Maximum,breaks=seq(min(Maximum)-1.5,max(Maximum)+1.5),freq=FALSE, main = "maximum of multinomial")

mean and covariance matrix of multinomial

mu = rep(n_balls/n_urns, n_urns)
p = 1/n_urns
q = 1 - p
xVAR = n_balls*p*q
xCOV = -1*n_balls*p*p
sigma = matrix(rep(xCOV, n_urns^2), n_urns)
diag(sigma) = xVAR

variance of convolved component

varconvolve = n_balls*p/n_urns   # = n_balls*p^2, cancels the covariance -n_balls*p^2

computation of naive estimate

d = 1
k1 = seq(0, mu[1]*4, d)
Fk1 = pbinom(k1, n_balls, 1/n_urns)^(n_urns)
fk1 = diff(c(0, Fk1))/d

function to deconvolve with Gaussian

deconvolve = function(dt, y, var, SN = 10^3) {
  L = length(y)
  t = seq(0, L-1)*dt
  x = dnorm(t, 0, var^0.5)/dt*2
  y = c(y, rep(0, L))                     # padding with zeros
  x = c(x, 0, rev(x[-1]))                 # symmetric, circularly wrapped kernel
  x = x/sum(x)
  fftx = fft(x)
  g = Conj(fftx)/(Mod(fftx)^2 + 1/SN)     # Wiener-regularized inverse filter
  z = (Re(fft(fft(y)*g, inverse = TRUE))/(2*L))[1:L]
  return(z[1:L])
}
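The Wiener-regularized FFT division inside `deconvolve` can be sanity-checked with a self-contained round trip (my addition): blur a known density with a Gaussian kernel by circular convolution, then invert the blur with the same regularized filter. In this noise-free setting the original should be recovered almost exactly.

```r
# Round trip: circular Gaussian blur, then Wiener-regularized deconvolution.
L    <- 128
f    <- dnorm(0:(L - 1), mean = 40, sd = 4)      # "true" density on a grid
t    <- 0:(L - 1)
ker  <- dnorm(pmin(t, L - t), 0, 2)              # Gaussian kernel, wrapped circularly
ker  <- ker / sum(ker)                           # normalize to sum 1
K    <- fft(ker)
y    <- Re(fft(fft(f) * K, inverse = TRUE)) / L  # blurred signal
SN   <- 1e12                                     # large signal-to-noise: noise-free case
G    <- Conj(K) / (Mod(K)^2 + 1/SN)              # Wiener-regularized inverse filter
fhat <- Re(fft(fft(y) * G, inverse = TRUE)) / L  # deconvolved estimate
max(abs(fhat - f))                               # tiny reconstruction error
```

With noisy input, `SN` must be lowered, and the regularization then suppresses (rather than amplifies) the high frequencies where the kernel's spectrum is tiny; that suppression is the instability the answer mentions.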

y = deconvolve(d,fk1,varconvolve,SN=10^18)

lines(k1, fk1, lty = 2)
lines(k1, deconvolve(d, fk1, varconvolve*0.9, SN=10^18))

case of Gaussian

Maximum = apply(MASS::mvrnorm(n_sims,mu,sigma),1,max)

h2=hist(Maximum,breaks=seq(min(Maximum)-1.5,max(Maximum)+1.5,0.25),freq=FALSE,main = "maximum of multivariate Gaussian with negative correlation")

computation of naive estimate

d = 0.25
k2 = seq(-mu[1]*4, mu[1]*4, d)
Fk2 = pnorm(k2, mu[1], xVAR^0.5)^(n_urns)
fk2 = diff(c(0, Fk2))/d
z = deconvolve(d, fk2, varconvolve, SN=10^9)

lines(k2, fk2, lty = 2)
lines(k2, z)

legend(10,0.4,c("naive estimate","deconvolved estimate"), lty = c(2,1))

  • An alternative is to take the naive approach and, instead of performing a deconvolution, change the variance by scaling the computed CDF. This does improve the result, but not as much as the deconvolution, and the scaling doesn't have the asymmetric effect that the deconvolution has. – Sextus Empiricus Feb 21 '23 at 17:37