3

I have a multinominal random number with probabilities $p_1, p_2, \ldots p_n$ for the $n$ outcomes.

If I generate $M$ of these random numbers and count the occurrences of the categories, what is the probability that the category with the largest population probability is also the most frequent in my sample of size $M$?

Gere
  • 2,061

1 Answers1

2

Unfortunately there is no simple expression for this probability --- it is just the sum of the mass function over the set of vectors obeying the requirement. Suppose we have:

$$\mathbf{m} \sim \text{Mu}(M, \mathbf{p}).$$

Letting $k = \arg \max_i p_i$ be the index for the maximum probability (and assuming this is unique) we have:

$$\mathbb{P}(\max m_i = m_k) = \sum_{\mathbf{m} \in \mathcal{M}_k} \text{Mu}(\mathbf{m}|M, \mathbf{p}) \quad \quad \quad \mathcal{M}_k \equiv \{ \mathbf{m} | \max m_i = m_k \}.$$

It might be possible to express this probability recursively, but this is cumbersome unless $n$ is small. A similar function for the binomial distribution ($M=2$) is examined in a reasonable amount of detail in O'Neill (2012) and O'Neill (2015), leading to an iterative formula for a probability function quite similar to this one. Those papers might be of some assistance in showing how to proceed recursively.

Ben
  • 124,856