
Please help me find the Fisher information matrix for the MLE of a multinomial ... It is said that one should take into account the constraints on $n$ and $\mathbf{p}$ in the multinomial distribution. In other words, when $\mathbf{p}$ is $k$-dimensional, the real number of parameters is $k-1$.

dunk

1 Answer


The Fisher information matrix is the variance of the score function, so you start by finding the latter. If you have an observed count vector $\mathbf{X} \sim \text{Mu}(n, \mathbf{p})$ with probability vector $\mathbf{p} =(p_1,...,p_k)$ then you get the log-likelihood function:

$$\ell_\mathbf{x}(\mathbf{p}) = \text{const} + \sum_{i=1}^k x_i \log(p_i),$$

which gives you the score function:

$$s_\mathbf{x}(\mathbf{p}) \equiv \nabla \ell_\mathbf{x}(\mathbf{p}) = \bigg( \frac{x_1}{p_1},...,\frac{x_k}{p_k} \bigg) = \text{diag}(1/\mathbf{p}) \mathbf{x}.$$
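
As a quick sanity check (a minimal sketch with made-up values for $\mathbf{x}$ and $\mathbf{p}$, assuming NumPy/SciPy; not part of the derivation itself), you can compare this closed form against a finite-difference gradient of the log-likelihood:

```python
import numpy as np
from scipy.optimize import approx_fprime

# Made-up example: k = 3 categories, one observed count vector with n = 20.
p = np.array([0.2, 0.3, 0.5])
x = np.array([4.0, 7.0, 9.0])

# Log-likelihood up to the additive constant.
def log_lik(q):
    return np.sum(x * np.log(q))

# Closed-form score from above: diag(1/p) x.
score_closed = x / p

# A forward-difference gradient should agree to several decimal places.
score_numeric = approx_fprime(p, log_lik, 1e-8)
print(np.allclose(score_closed, score_numeric, atol=1e-3))  # True
```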

Consequently, the Fisher information matrix is:

$$\begin{align} \mathcal{I}(\mathbf{p}) \equiv \mathbb{V}(s_\mathbf{X}(\mathbf{p})) &= \mathbb{V}(\text{diag}(1/\mathbf{p}) \mathbf{X}) \\[12pt] &= \text{diag}(1/\mathbf{p}) \mathbb{V}(\mathbf{X}) \text{diag}(1/\mathbf{p}) \\[12pt] &= n \text{diag}(1/\mathbf{p}) [\text{diag}(\mathbf{p}) - \mathbf{p}\mathbf{p}^\text{T}] \text{diag}(1/\mathbf{p}) \\[12pt] &= n [\text{diag}(1/\mathbf{p}) - (\text{diag}(1/\mathbf{p}) \mathbf{p}) (\text{diag}(1/\mathbf{p}) \mathbf{p})^\text{T}] \\[12pt] &= n [\text{diag}(1/\mathbf{p}) - \mathbf{1} \mathbf{1}^\text{T}] \\[12pt] &= n \begin{bmatrix} \frac{1-p_1}{p_1} & -1 & -1 & \cdots & -1 \\ -1 & \frac{1-p_2}{p_2} & -1 & \cdots & -1 \\ -1 & -1 & \frac{1-p_3}{p_3} & \cdots & -1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -1 & -1 & -1 & \cdots & \frac{1-p_k}{p_k} \\ \end{bmatrix}. \end{align}$$
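
If you want to see this formula in action, here is another sketch with made-up values for $n$ and $\mathbf{p}$ (again assuming NumPy): it confirms the formula by simulation, and also confirms that this full-vector matrix is singular, since $\mathcal{I}(\mathbf{p}) \mathbf{p} = n(\mathbf{1} - \mathbf{1}) = \mathbf{0}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, np.array([0.2, 0.3, 0.5])  # made-up example values
k = len(p)

# Closed-form Fisher information from the derivation above.
I = n * (np.diag(1.0 / p) - np.ones((k, k)))

# Monte Carlo variance of the score diag(1/p) X over simulated data.
X = rng.multinomial(n, p, size=200_000)
I_mc = np.cov(X / p, rowvar=False)
print(np.allclose(I, I_mc, rtol=0.05))   # True, up to simulation error

# The full-vector matrix is singular: it annihilates p, so its rank is k - 1.
print(np.allclose(I @ p, np.zeros(k)))   # True
print(np.linalg.matrix_rank(I))          # 2, i.e. k - 1
```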

The above treatment is for the full parameter vector with $k$ elements, but you can reduce this to the corresponding parameter vector with $k-1$ elements by taking $p_k = 1-\sum_{i=1}^{k-1} p_i$. In the latter case you would rewrite the score function using this equivalence and then you would get a $(k-1) \times (k-1)$ invertible variance matrix. I will leave this as a useful exercise for you.
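
If you want to check your answer to that exercise numerically, the following sketch (made-up values, assuming NumPy) builds the reduced score via the chain rule and confirms that the inverse of its variance matrix is the variance of the MLE $\hat{\mathbf{p}} = \mathbf{X}/n$, which also resolves the invertibility issue discussed in the comments:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, np.array([0.2, 0.3, 0.5])  # made-up example values

# Chain rule with p_k = 1 - (p_1 + ... + p_{k-1}): the reduced score has
# components x_i/p_i - x_k/p_k for i = 1, ..., k-1.
X = rng.multinomial(n, p, size=500_000)
scores = X[:, :-1] / p[:-1] - X[:, [-1]] / p[-1]

# Its variance is the (k-1) x (k-1) Fisher information, now invertible.
I_red = np.cov(scores, rowvar=False)

# Its inverse matches the variance of the MLE p_hat = X/n, which is
# (diag(p) - p p^T)/n restricted to the first k-1 coordinates.
V = (np.diag(p[:-1]) - np.outer(p[:-1], p[:-1])) / n
print(np.allclose(np.linalg.inv(I_red), V, rtol=0.1))  # True, up to simulation error
```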

Ben
  • Thank you! Very helpful! – dunk Jun 14 '22 at 06:29
  • @Ben Hi, I directly computed the covariance matrix $\Sigma$ of the estimator $\hat{\mathbf{p}}$ and got the result in Thm 14.6. I believe the Fisher information matrix you gave is correct. However, I find that $\mathcal{I}$ does not equal the inverse of $\Sigma$. Do you know why? – Mingzhou Liu Jun 30 '22 at 08:56
  • @Ben: can this matrix be used to derive the formulas for the variance of each $p_i$? – stats_noob Dec 14 '22 at 06:30
  • @Ben -- Please note that the matrix should be of order $k - 1$. Your information matrix cannot give the correct asymptotic variance of the MLE (it is not invertible). – Zhanxiong Feb 12 '23 at 11:03
  • It is not invertible because here I am using the parameter vector $\mathbf{p} = (p_1,...,p_k)$ of order $k$. If you want to use the reduced parameter vector then you would make appropriate changes to the last element. – Ben Feb 12 '23 at 11:18
  • I cannot comment on Ben's answer so I have to put my response here. Actually, I am named Ben too, so please call me Ben2 in this post. $s_\mathbf{x}(\mathbf{p})$ is not correct. Note that $p_1 + \cdots + p_k = 1$; from another perspective, $p_k$ is a function of the others. So when you take the derivative with respect to $p_1$, the result is not $x_1/p_1$. The evidence that the final matrix is wrong comes from Mingzhou Liu's comment above. The final Fisher information matrix should be a $(k-1) \times (k-1)$ matrix, and then the covariance matrix can be obtained by taking its inverse. Meanwhile, the covariance between $p_k$ and the others can be viewed as the covariance be – JJ. Feb 12 '23 at 09:05
  • This is a classical example; most references I am aware of present $I(\mathbf{p})$ as an invertible matrix of order $k - 1$. Besides, the OP does mention that the real number of parameters is $k - 1$, so I think a standard answer should be of order $k - 1$. – Zhanxiong Feb 12 '23 at 14:31
  • Fair enough --- I have added a final paragraph to note the general method by which you would do this, but in this case I think I will leave that as an exercise for OP. – Ben Feb 13 '23 at 07:16