
MLE (maximum likelihood estimation) is easy to define mathematically for discrete or for continuous variables. There is a technical difficulty, however, when a variable is neither discrete nor continuous but a mixture of the two. More precisely:

Assume you have a model with a parameter $\theta$ and an observable variable $X$ whose distribution depends on $\theta$. You have $x_1, x_2, \dots, x_n$, an independent sample of real-life observations of $X$.

If $X$ is a discrete variable, the maximum likelihood estimator is:

$$\hat\theta=\text{argmax}_\theta \left(\displaystyle\prod_{i=1}^n P_\theta(X=x_i)\right)$$

If $X$ has a density $p_\theta$, then you just use the density instead:

$$\hat\theta=\text{argmax}_\theta \left(\displaystyle\prod_{i=1}^np_\theta(x_i)\right)$$
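To make the two definitions above concrete, here is a minimal Python sketch. The Poisson and exponential models, the data values, the grid and the function names are my own illustrative assumptions, and the crude grid search only stands in for a proper optimizer.

```python
import math

def log_lik_poisson(lam, xs):
    # Discrete case: sum of log P_lam(X = x_i) for a Poisson(lam) model.
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in xs)

def log_lik_exponential(rate, xs):
    # Continuous case: sum of log p_rate(x_i), with p_rate(x) = rate * exp(-rate * x).
    return sum(math.log(rate) - rate * x for x in xs)

def argmax_on_grid(log_lik, xs, grid):
    # Crude argmax: return the grid value with the largest log-likelihood.
    return max(grid, key=lambda theta: log_lik(theta, xs))

grid = [k / 1000 for k in range(1, 10001)]  # candidate values 0.001, ..., 10.0

counts = [2, 0, 3, 1, 4]          # made-up discrete sample
durations = [0.5, 1.5, 2.0, 4.0]  # made-up continuous sample

lam_hat = argmax_on_grid(log_lik_poisson, counts, grid)
rate_hat = argmax_on_grid(log_lik_exponential, durations, grid)
print(lam_hat, rate_hat)
```

Maximizing the sum of logs is the same as maximizing the product, since the logarithm is increasing. For these two models the maximizers are known in closed form (the sample mean for the Poisson parameter, the inverse of the sample mean for the exponential rate), which gives a simple check on the grid search.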

Sometimes, $X$ is essentially continuous but has one or several atoms: its distribution is a mixture of a continuous distribution and a discrete distribution. A good example: ML estimate of exponential distribution (with censored data).
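For concreteness, here is a minimal Python sketch of that censored exponential example, assuming every observation is right-censored at a known time $c$, so that the observed variable $\min(X, c)$ has a density on $[0, c)$ and an atom at $c$. It uses the recipe usually given for this case: the density for uncensored points and the probability $P_\theta(X \ge c)$ for censored ones. The data, the value of `c`, the grid and the function name are my own illustrative choices.

```python
import math

def log_lik_censored_exponential(rate, ts, c):
    # Mixed case: uncensored points (t < c) contribute the log-density,
    # censored points (recorded as t == c) contribute log P(X >= c) = -rate * c.
    total = 0.0
    for t in ts:
        if t < c:
            total += math.log(rate) - rate * t
        else:
            total += -rate * c
    return total

c = 3.0                               # hypothetical censoring time
observed = [0.5, 1.5, 3.0, 2.0, 3.0]  # made-up sample; 3.0 means "censored at c"

grid = [k / 1000 for k in range(1, 5001)]  # candidate rates 0.001, ..., 5.0
rate_hat = max(grid, key=lambda r: log_lik_censored_exponential(r, observed, c))

# Closed form for this model: number of uncensored points / sum of recorded times.
closed_form = sum(t < c for t in observed) / sum(observed)
print(rate_hat, closed_form)
```

On this made-up sample the grid search agrees with the closed form $\hat\theta = d / \sum_i t_i$, where $d$ is the number of uncensored observations. What I am missing is a clean general definition that justifies mixing the two kinds of factors in one product.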

I want to find a way to define MLE mathematically when there are atoms, for students or people with some mathematical background but not much in statistics. Ideally, the definition should be:

  • not too theoretical or abstract
  • rather general
  • not needlessly confusing

I am struggling. Any ideas?

  • My question focuses on the mathematical definition of MLE for variables that are neither discrete nor continuous. I think the question you are referring to is more about an intuitive general explanation of MLE. I've rephrased a bit. – Benoit Sanchez Jul 15 '17 at 21:11
  • But the definition is the same no matter whether your data is continuous or discrete! A probability mass function is a special kind of density function, so there is no problem with mixed data. – Tim Jul 15 '17 at 21:16
  • Not a special case (https://en.wikipedia.org/wiki/Probability_mass_function), unless you are talking about the Radon-Nikodym derivative with respect to the counting measure. I especially want to avoid these theoretical things. Anyway, somebody could explain this idea of yours as an answer. This would not make my question a duplicate. – Benoit Sanchez Jul 15 '17 at 21:23
  • But if you want to discuss mixed data types then you can't run away from those theoretical considerations! This thread gives a nice introduction to probability densities. You basically need to introduce probability densities and discuss discrete data as a special case of them. If probability density is "probability per foot", then with discrete data you have obvious units you calculate it for. – Tim Jul 15 '17 at 21:40
  • Again, your opinion and thoughts are welcome as an answer. My only point is: my question is not a duplicate. – Benoit Sanchez Jul 15 '17 at 22:12
  • A typical case where this occurs is with censoring. For some examples see: https://stats.stackexchange.com/questions/87065/weighted-normal-errors-regression-with-censoring/276929#276929 https://stats.stackexchange.com/questions/133347/ml-estimate-of-exponential-distribution-with-censored-data/133360#133360 and many others – kjetil b halvorsen Mar 28 '18 at 11:02

0 Answers