Questions tagged [maximum-likelihood]

A method of estimating the parameters of a statistical model by choosing the parameter value that maximizes the probability of observing the given sample.

Under certain regularity conditions (e.g. the support of the density does not depend on the unknown parameter), maximum-likelihood estimators are consistent, asymptotically efficient (they attain the Cramér-Rao lower bound asymptotically), and asymptotically normal with covariance matrix given by the inverse of the Fisher information matrix.

Because ML is a parametric method based on a specified distribution family, it relies on the assumed distributional model of the data being correct. In many cases no closed-form solution exists, so numerical methods (e.g. Newton-Raphson iteration) are required.
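As a minimal sketch of the numerical approach mentioned above (the gamma model, the simulated data, and all variable names are assumptions for illustration, not part of the tag description), the following fits a two-parameter model by minimizing the negative log-likelihood:

```python
import numpy as np
from scipy import optimize, stats

# Hypothetical example data: 500 draws from a Gamma(shape=2, scale=3) distribution.
rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=3.0, size=500)

def neg_log_lik(params):
    """Negative log-likelihood of a gamma model; its minimizer is the MLE."""
    shape, scale = params
    if shape <= 0 or scale <= 0:      # stay inside the parameter space
        return np.inf
    return -np.sum(stats.gamma.logpdf(x, a=shape, scale=scale))

# The gamma shape parameter has no closed-form MLE, so a numerical search is used
# (Nelder-Mead here; Newton-type iterations as mentioned above work as well).
result = optimize.minimize(neg_log_lik, x0=[1.0, 1.0], method="Nelder-Mead")
print("approximate MLE (shape, scale):", result.x)
```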

3367 questions
37
votes
7 answers

Likelihood - Why multiply?

I am studying maximum likelihood estimation and I read that the likelihood function is the product of the probabilities of each variable. Why is it the product? Why not the sum? I have been trying to search on Google but I can't find any…
RuiQi
  • 645
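For what it's worth, the standard reason is independence: the joint probability of observing all of the data points together factors into a product, and only after taking logarithms does it become a sum (a sketch, not quoted from the answers):

$$ L(\theta)=f(x_1,\dots,x_n\mid\theta)=\prod_{i=1}^{n} f(x_i\mid\theta), \qquad \log L(\theta)=\sum_{i=1}^{n}\log f(x_i\mid\theta). $$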
21
votes
2 answers

What is meant by the standard error of a maximum likelihood estimate?

I'm a mathematician self-studying statistics and struggling especially with the language. In the book I'm using, there is the following problem: A random variable $X$ is given as $\text{Pareto}(\alpha,60)$-distributed with $\alpha>0$. (Of course,…
Stefan
  • 313
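For orientation, a textbook-style sketch (not the book's solution, and assuming the scale parameter 60 is known): the large-sample standard error of an MLE is usually read off the Fisher information, which for the Pareto$(\alpha,60)$ model gives

$$ \hat\alpha=\frac{n}{\sum_{i=1}^{n}\log(x_i/60)}, \qquad I(\alpha)=\frac{n}{\alpha^2}, \qquad \operatorname{se}(\hat\alpha)\approx\frac{1}{\sqrt{I(\hat\alpha)}}=\frac{\hat\alpha}{\sqrt{n}}. $$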
19
votes
3 answers

Does MLE require i.i.d. data? Or just independent parameters?

Estimating parameters using maximum likelihood estimation (MLE) involves evaluating the likelihood function, which maps the probability of the sample (X) occurring to values (x) on the parameter space (θ) given a distribution family (P(X=x|θ) over…
Felix
  • 669
  • 2
  • 6
  • 10
18
votes
3 answers

When does maximum likelihood work and when doesn't it?

I'm confused about the maximum likelihood method as compared to e.g. computing the arithmetic mean. When and why does maximum likelihood produce "better" estimates than e.g. arithmetic mean? How is this verifiable?
mavavilj
  • 4,109
14
votes
3 answers

$N(\theta,\theta)$: MLE for a Normal where mean=variance

For an $n$-sample following a Normal$(\mu=\theta,\sigma^2=\theta)$ distribution, how do we find the MLE? I can find the roots of the score function $$ \theta=\frac{-1\pm\sqrt{1+4\frac{s}{n}}}{2},\qquad s=\sum x_i^2, $$ but I don't see which…
user21186
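For reference, a sketch of the standard derivation (not taken from the answers): setting the score to zero yields a quadratic in $\theta$, and positivity of the variance picks out one root,

$$ \ell(\theta)=-\frac{n}{2}\log(2\pi\theta)-\frac{1}{2\theta}\sum_{i=1}^{n}(x_i-\theta)^2, \qquad \ell'(\theta)=0 \;\Longrightarrow\; \theta^2+\theta-\frac{s}{n}=0 \;\Longrightarrow\; \hat\theta=\frac{-1+\sqrt{1+4s/n}}{2}, $$

with $s=\sum x_i^2$; the negative root is excluded because $\theta>0$.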
9
votes
2 answers

Why do we always put log() before the joint pdf when we use MLE (maximum likelihood estimation)?

Maybe this question is simple, but I really need some help. When we use maximum likelihood estimation (MLE) to estimate the parameters, why do we always put log() before the joint density? To use a sum in place of a product? But why? The…
user17670
  • 327
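The short version of the usual answer (a sketch, not quoted from the responses): the logarithm is strictly increasing, so it leaves the maximizer unchanged while turning a product of densities into a sum, which is easier to differentiate and numerically more stable:

$$ \arg\max_{\theta}\prod_{i=1}^{n} f(x_i\mid\theta) \;=\; \arg\max_{\theta}\sum_{i=1}^{n}\log f(x_i\mid\theta). $$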
9
votes
2 answers

Maximum likelihood equivalent to maximum a posteriori estimation

When is maximum a posteriori (MAP) estimation equivalent to maximum-likelihood (ML) estimation?
Jonas
  • 157
  • 2
  • 9
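A compact way to state the usual condition (a sketch): MAP maximizes the posterior, which is proportional to the likelihood times the prior, so the two estimates coincide whenever the prior is flat over the parameter space:

$$ \hat\theta_{\text{MAP}}=\arg\max_{\theta}\,p(x\mid\theta)\,p(\theta) \;=\; \arg\max_{\theta}\,p(x\mid\theta)=\hat\theta_{\text{ML}} \qquad \text{when } p(\theta)\ \text{is constant}. $$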
8
votes
1 answer

Why does the log likelihood need to go to minus infinity when the parameter approaches the boundary of the parameter space?

In a recent lecture I was told that, in order for the maximum likelihood estimate to be valid, the log likelihood needs to go to minus infinity as the parameter goes to the boundary of the parameter space. But I don't understand why this is…
mrz
  • 91
  • 1
  • 5
8
votes
1 answer

In maximum likelihood the expected score is zero at the true parameter values. Is it also true for other values?

The usual proof of the expected score being zero goes something like this: $f(z;\theta)$ is the density function for data $z$ and parameter vector $\theta$, so $\int f(z;\theta)\,dz=1$ for any $\theta$. This implies that, under…
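For context, the argument alluded to in the excerpt differentiates the normalization identity under the integral sign (legitimate under the usual regularity conditions):

$$ \int f(z;\theta)\,dz=1 \;\Longrightarrow\; \int \frac{\partial f(z;\theta)}{\partial\theta}\,dz=0 \;\Longrightarrow\; \mathbb{E}_{\theta}\!\left[\frac{\partial\log f(Z;\theta)}{\partial\theta}\right] = \int \frac{\partial\log f(z;\theta)}{\partial\theta}\,f(z;\theta)\,dz = 0. $$

The key point is that the expectation is taken at the same $\theta$ that appears in the score; evaluated at a parameter value other than the one generating the data, the expected score is in general nonzero.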
7
votes
1 answer

Invariance property of maximum likelihood estimator?

Here is an excerpt from one of the stats books I have been reading: But as a counter example, let's suppose we have five possible values for $\theta$ and $\theta_5$ is the ML estimate, with the likelihood 0.4, and we have a function $f(\theta)$…
qed
  • 2,808
7
votes
2 answers

Quasi maximum likelihood estimation versus pseudo MLE

If I'm not wrong, both "quasi" and "pseudo" denote the same thing, namely optimization under incorrect distributional assumptions. Moreover, I think the terms are not restricted to the assumption of normality. Is there an experienced reader who…
Joz
  • 1,072
6
votes
1 answer

Purpose of expectation of loglikelihood

Learning about MLE involves the optimisation of log-likelihood, which allows us to get the "best" value of theta for observing a certain sample. What then is the purpose of showing the same estimator maximises the expectation of log-likelihood? Do…
6
votes
4 answers

Why does maximum likelihood estimation maximize probability density instead of probability?

I am trying to understand maximum likelihood estimation, but it looks like I am missing something rather elementary. Suppose we have an iid random sample $X_1, X_2,..., X_n$ for which the probability density function of each $X_i$ is $f(x_i;…
Curious
  • 61
5
votes
1 answer

Determining MLE of a distribution with sufficient statistics

Please pardon me for asking a simple question, but I find this field rather hard. Anyway, here it goes: we are trying to find the MLE for a distribution. Now for a given distribution, its MLE is a number that is difficult to estimate. So what we do…
user1343318
  • 1,341
5
votes
2 answers

ML estimation of parameters that do not completely specify the model

I was wondering how ML is defined when the parameter does not completely specify the model. More concretely, suppose $X_1, X_2, \cdots, X_n$ are drawn iid such that $P(X_1=i)=\theta_i$, $ 1 \leq i \leq k$. I want to find the ML estimate of $\phi=…
Devil
  • 689