Say I have a list of 10000 sorted counts $\ge 0$ that sum to 20000, and I want to test whether they came from placing 20000 balls randomly, independently and uniformly into 10000 cells. A straight chi-squared test would be reasonable if I assumed independence, but a list of 10000 counts all equal to two would be blessed by chi-squared, and I'd be somewhat skeptical of its independence (and of its randomness as well). I could compare the counts with a $\mathrm{Binomial}(20000,1/10000) \approx \mathrm{Poisson}(2)$ distribution, but the counts are no longer independent, so a chi-squared test is probably no longer reasonable even if I compute the covariance matrix for the counts.
Surely this has been dealt with before. (I'm interested in the case where the probabilities are non-uniform, but I want to understand the simple case first.)
Added: Calculation of variances.
(All the cleverness in what follows is due to Robert Israel and André Nicolas.)
If we draw $n$ samples from a $\mathrm{Poisson}(\lambda)$ distribution, then the expected number of times we draw a count of $r$ is $$E(X_r) = n \frac{\lambda^r}{r!}e^{-\lambda}.$$ As the process of drawing a count of $r$ can be considered as $n$ independent Bernoulli trials, the variance would be $$\mathrm{Var}(X_r) = n \frac{\lambda^r}{r!}e^{-\lambda}\left(1-\frac{\lambda^r}{r!}e^{-\lambda}\right).$$
Now, if we place $\lambda n$ balls in $n$ cells we will (as $n\to\infty$) get the same expected numbers of counts. The variance is somewhat more problematic. Let $Y_{r,i}$ be an indicator random variable for the event that cell $i$ receives exactly $r$ balls, so that $X_r = \sum_i Y_{r,i}$ and $\mathrm{Var}(X_r) = E(X_r^2)-E(X_r)^2$. Split $X_r^2$ into a sum of the terms $Y_{r,i}^2=Y_{r,i}$ (as $Y_{r,i}$ is either $0$ or $1$) and a sum of the cross terms $Y_{r,i}Y_{r,j}$ over $i\ne j$. For $i\ne j$ we have $$E(Y_{r,i}Y_{r,j}) = \binom{\lambda n}{2r}\binom{2r}{r}\left(\frac{1}{2}\right)^{2r}\left(\frac{2}{n}\right)^{2r}\left(1-\frac{2}{n}\right)^{\lambda n - 2r}$$ (choose the $2r$ balls that land in cells $i$ and $j$, then the $r$ of them that land in cell $i$). Writing $\pi_r = E(Y_{r,i})$, so that $E(X_r) = n\pi_r$, this gives $$ \mathrm{Var}(X_r) = n \pi_r + n(n-1)E(Y_{r,i}Y_{r,j}) - (n\pi_r)^2.$$ Taking the limit as $n\to\infty$ we get $$\mathrm{Var}(X_r)=n\left(\frac{\lambda^r}{r!}e^{-\lambda}-\frac{r^2\lambda^{2r-1}-(2r-1)\lambda^{2r}+\lambda^{2r+1}}{(r!)^2}e^{-2\lambda}\right)$$ (taking the limit is fiddly; see Robert Israel's post for the simplest case).
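As a quick numerical sanity check (a sketch, not part of the derivation; the choices $r = 1$, $\lambda = 2$, $n = 10000$ are arbitrary), we can simulate the balls-in-cells process and compare the empirical variance of $X_r$ with both the limiting formula and the naive binomial variance:
r = 1; lambda = 2; n = 10000
pr = lambda^r * exp(-lambda) / factorial(r)
# number of cells receiving exactly r of the lambda*n balls, replicated
X = replicate(2000, sum(tabulate(sample(1:n, lambda*n, replace=TRUE),
                                 nbins=n) == r))
var(X)  # simulated variance
n * (pr - (r^2*lambda^(2*r-1) - (2*r-1)*lambda^(2*r) + lambda^(2*r+1)) *
          exp(-2*lambda) / factorial(r)^2)  # limiting formula
n * pr * (1 - pr)  # naive binomial variance, for contrast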
This can be extended to compute the covariances (I'm working on it in my copious free time) and from there we can compute a better chi-squared statistic. I am still not convinced that we cannot do better. I am also surprised that this is not a well-known problem.
Update: This is the final update. (I have now gotten the algebra to cover the handwaving in the initial formulation.)
The covariance of the counts is $$\Sigma = n(D - p'p - q'q)$$
where $p = (p_0,p_1,\dots)$ and $q = (q_0,q_1,\dots)$ are row vectors (so $p'p$ and $q'q$ are outer products), with
$$p_i = \frac{\lambda^i}{i!}e^{-\lambda},$$ and
$$q_i = \frac{i-\lambda}{\sqrt{\lambda}}\frac{\lambda^i}{i!}e^{-\lambda},$$
and $D$ is the diagonal matrix made of the elements of $p$.
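As a sanity check (a sketch, with $\lambda = 2$, $n = 10000$, and the counts truncated to $0,\dots,7$), we can compare this $\Sigma$ with the empirical covariance of simulated count vectors:
lambda = 2; n = 10000
p = sapply(0:7, function(i){lambda^i*exp(-lambda)/factorial(i)})
q = p * (0:7 - lambda)/sqrt(lambda)
Sigma = n * (diag(p) - p %o% p - q %o% q)
# each replication: drop lambda*n balls into n cells, tally counts 0..7
sim = t(replicate(5000, tabulate(tabulate(sample(1:n, lambda*n,
    replace=TRUE), nbins=n) + 1, nbins=8)))
round(cov(sim) - Sigma)  # entries should be small relative to diag(Sigma)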
Note that the covariances are far from 0, making the assumption of
independence underlying the use of a straight $\chi^2$ test somewhat suspect.
This result is kind of cute, as we can use the Sherman-Morrison-Woodbury formula to find the inverse (of a truncation of the infinite matrix: the full matrix is singular, since the constraints on the total number of cells and the total number of balls each contribute a null direction).
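Concretely (a sketch, reusing p, q, and Sigma from the check above): with $U = (p'\ q')$ we have $\Sigma = n(D - UU')$, and the Woodbury identity $(D-UU')^{-1} = D^{-1} + D^{-1}U(I - U'D^{-1}U)^{-1}U'D^{-1}$ reduces the inversion to a $2\times2$ solve:
U = cbind(p, q)  # the rank-2 update
Dinv = diag(1/p)  # D is diagonal, so its inverse is immediate
SigmaInv = (Dinv + Dinv %*% U %*% solve(diag(2) - t(U) %*% Dinv %*% U) %*%
    t(U) %*% Dinv) / n
max(abs(SigmaInv %*% Sigma - diag(8)))  # should be near machine precision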
We can use the fact that if an $n$-dimensional random variable $X$ is distributed as $N(\mu,\Sigma)$ then $(X-\mu)'\Sigma^{-1}(X-\mu)$ is a $\chi^2_n$ random variable. As the number of cells with each count is essentially a binomial random variable, we can pretend that, at least for the small counts, these numbers are normally distributed. This suggests using $(x-\mu)'\Sigma^{-1}(x-\mu)$ as a test statistic, where $x$ is the vector of numbers of small counts. Ignoring the larger counts has the nice side effect of making the problem full-dimensional (the truncated covariance matrix is invertible). If we do this in practice, we see that if the number of bins we use is small enough, the result is nearly $\chi^2_n$ distributed.
# Poisson(2) probabilities p_i and correction vector q_i for counts 0..7
p = sapply(0:7,function(i){2^i*exp(-2)/factorial(i)})
q = sapply(0:7,function(i){2^i*exp(-2)*(i-2)/(factorial(i)*sqrt(2))})
# Sigma/n = D - p'p - q'q; the statistic needs the inverse of Sigma
poiscov = diag(p) - (p %o% p + q %o% q)
poiscovinv = solve(10000 * poiscov)
# drop 20000 balls into 10000 cells, then tally how many cells got 0..7 balls
gendat = function(){tabulate(tabulate(sample(1:10000,20000,replace=TRUE),
nbins=10000) + 1,nbins=8)}
dat = t(replicate(10000, gendat()))
mn = 10000 * p  # expected number of cells with each count
ch = apply(dat,1,function(v){(v-mn) %*% poiscovinv %*% (v-mn)})
The mean and variance look about right (a $\chi^2_8$ distribution has mean $8$ and variance $16$):
> mean(ch)
[1] 8.092341
> var(ch)
[1] 17.10295
It passes the eyeball goodness-of-fit test:
> library(MASS)
> truehist(ch)
> x = seq(0,40,length=100)
> lines(x,dchisq(x,8))
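For a more quantitative check than the eyeball test, one could compare the simulated statistics against the $\chi^2_8$ CDF directly (a sketch; with 10000 replicates, even mild error in the normal approximation may show up as a rejection):
> ks.test(ch, "pchisq", df = 8)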

It turns out, unsurprisingly, that the distribution looks increasingly unlike a $\chi^2_n$ distribution as $n$ increases: the tail becomes much heavier, and I am not certain what to do about that. I have tried adding all of the counts in the tail to the final bin, but that does not seem to help.
I cannot say that this test is definitely more powerful than the straight chi-squared test, as I can find points that are accepted by one and not the other, but it does seem to take more of the structure of the problem into account. Intuitively, this method takes the shape of the contours of the probability distribution into account. Unsurprisingly, it places a fair amount of weight on the constraint on the sum of the bins, and it actually weights the higher-numbered bins less than the smaller ones (the opposite of the chi-squared test).
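(A rough way to see the weighting, reusing poiscovinv and p from the code above: compare the diagonal of the inverse covariance with the $1/E_i$ weights of the plain chi-squared statistic.)
> rbind(quadratic = diag(poiscovinv), chisq = 1/(10000*p))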

