
Learning about MLE involves optimising the log-likelihood, which gives us the "best" value of $\theta$ for the observed sample.

What, then, is the purpose of showing that the same estimator maximises the *expectation* of the log-likelihood? Should we not take it as given that, since it already maximises the log-likelihood itself, adding an expectation (mean) does not add anything?


Thank you!

1 Answer


The maximum likelihood estimator $$\hat\theta(z_1,\ldots,z_n)$$ is the solution of the maximisation program $$\arg\max_\theta\sum_{i=1}^n \log \{f_Z(z_i;\theta)\}\tag{1}$$ It is therefore a random variable, since it depends on one realisation of the sample $(Z_1,\ldots,Z_n)$. The justification for using the maximum likelihood estimator is that the true value $\theta_0$ of the parameter (i.e., the one value behind the generation of $(z_1,\ldots,z_n)$) is the solution of $$\theta_0 = \arg\max_\theta \mathbb E_{\theta_0}[\log \{f_Z(Z;\theta)\}]\tag{2}$$ and, since $$\frac{1}{n}\sum_{i=1}^n\log \{f_Z(z_i;\theta)\} \approx \mathbb E_{\theta_0}[\log \{f_Z(Z;\theta)\}]$$ thanks to the Law of Large Numbers, the solutions to (1) and (2) should be close: $$\hat\theta(z_1,\ldots,z_n)\approx\theta_0$$ (which can be shown rigorously, of course).
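Here is a minimal numerical sketch of this argument (not part of the original answer), assuming, purely for illustration, a $\mathcal N(\theta, 1)$ model for $Z$ so that $f_Z(z;\theta)$ is the normal density. It compares the average log-likelihood in (1) with a Monte Carlo approximation of the expected log-likelihood in (2) over a grid of $\theta$ values; both argmaxes land near $\theta_0$.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
theta0 = 2.0                       # true parameter behind the data
n = 5_000                          # sample size
z = rng.normal(theta0, 1.0, n)     # one realisation of (Z_1, ..., Z_n)

grid = np.linspace(0.0, 4.0, 401)  # candidate values of theta

# (1) average log-likelihood for the observed sample, as a function of theta
avg_loglik = np.array([norm.logpdf(z, mu, 1.0).mean() for mu in grid])

# (2) expected log-likelihood under theta0, approximated by a much larger
#     Monte Carlo sample standing in for E_{theta0}[log f_Z(Z; theta)]
z_big = rng.normal(theta0, 1.0, 500_000)
exp_loglik = np.array([norm.logpdf(z_big, mu, 1.0).mean() for mu in grid])

print("argmax of (1):", grid[avg_loglik.argmax()])  # close to theta0
print("argmax of (2):", grid[exp_loglik.argmax()])  # essentially theta0
```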

Xi'an
  • Thank you so much! Could I ask: is the significance of the second part just to show that the estimator is consistent? And is it supposed to be "intuitive" to get equation (2) from equation (1)? – jojorabbit Nov 20 '21 at 20:28
  • The intuition is the law of large numbers. As $n$ grows, the average stabilises around its expectation. – Xi'an Nov 20 '21 at 21:44