
We know that quadratic loss can be derived as maximum likelihood estimation under a Gaussian distribution, and cross-entropy loss as maximum likelihood estimation under a Bernoulli distribution.
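To make the first of those concrete: under a fixed-variance Gaussian model $y \mid x \sim \mathcal{N}(f(x), \sigma^2)$, the negative log-likelihood of one observation is $$ -\log p(y \mid x) = \frac{(y - f(x))^2}{2\sigma^2} + \text{const}, $$ so maximizing the likelihood is exactly minimizing squared error.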

Now my question is: do other frequently used loss functions also have such an interpretation? For example, what are the probabilistic models underlying hinge loss, exponential loss, L1 loss (mean absolute error), etc.? Can these be interpreted as maximum likelihood estimation under some likelihood?

A proof that every loss function corresponds to some kind of maximum likelihood estimation would be appreciated, as would a counterexample: a loss function together with a proof that it cannot correspond to maximum likelihood estimation under any likelihood. If that counterexample uses a fairly common loss function (e.g., a regularization penalty), that is even better.

George

1 Answer


A loss function $L: X \to \mathbb{R}^+$ can be motivated as an MLE provided that $\int_X \exp(-bL(x))\,dx$ converges for some $b > 0$. That will often be the case, but one can construct counterexamples.

Proof: Consider the problem of finding a function $f$ that minimizes the loss $\sum_i L(y_i - f(x_i))$. This is equivalent to maximizing the conditional log-likelihood $\sum_i \mathscr{L}(y_i - f(x_i))$ provided that $\mathscr{L}$ is a decreasing affine transformation of $L$ and that $\mathscr{L}$ is the log of a pdf. In other words, we require that there exist $a \in \mathbb{R}$ and $b \in \mathbb{R}^+$ such that $\exp(a - bL(x))$ is a pdf. A function is a pdf if it is non-negative and integrates to 1. Non-negativity is guaranteed by the $\exp$ function, so we just require that for some $a, b$: $$ \int_X \exp(a - bL(x))\, dx = 1. $$ The constant factors out as $e^a \int_X \exp(-bL(x))\,dx$, and we are free to choose $a$ to normalize the integral, so the requirement becomes simply that $$ \int_X \exp(-bL(x))\, dx < \infty. \tag{1} $$
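As a sanity check, here is a small numerical sketch of requirement (1) for two of the losses the question mentions. This is Python with SciPy; the `normalizer` helper and the choice $b = 1$ are mine, purely for illustration.

```python
# Illustrative check of requirement (1) with b = 1: numerically integrate
# exp(-b * L(x)) over the real line for quadratic and L1 loss.
import numpy as np
from scipy.integrate import quad

def normalizer(loss, b=1.0):
    """Approximate the integral of exp(-b * loss(x)) over all of R."""
    value, _ = quad(lambda x: np.exp(-b * loss(x)), -np.inf, np.inf)
    return value

print(normalizer(lambda x: x ** 2))  # ~1.7725 = sqrt(pi): the Gaussian normalizer
print(normalizer(lambda x: abs(x)))  # ~2.0: the Laplace normalizer
```

Both integrals are finite, recovering the familiar Gaussian (quadratic loss) and Laplace (L1 loss) likelihoods. By the same check done by hand, the margin-based hinge loss $L(m) = \max(0, 1 - m)$ on $X = \mathbb{R}$ fails (1): $\exp(-b\max(0, 1 - m)) = 1$ for every $m \geq 1$, so the integral diverges for all $b > 0$.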

Requirement (1) is failed by the pathological loss function $L(x) = \log(\log(1+|x|))$ on the domain $X = \mathbb{R}$: $$ \int_X \exp(-bL(x))\, dx = \int_{-\infty}^{\infty} \exp\bigl(-b \log(\log(1+|x|))\bigr)\, dx = \int_{-\infty}^{\infty} \frac{dx}{(\log(1+|x|))^{b}}, $$ which diverges for any positive $b$: for large $|x|$ we have $(\log(1+|x|))^{b} \le |x|$, so the integrand eventually dominates $1/|x|$, whose integral diverges. I'm not sufficiently familiar with different loss functions to know whether there's a 'common' loss function that fails requirement (1), but it shouldn't be too hard to check each one.
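For what it's worth, a rough numerical sketch agrees (Python again; $b = 1$, the truncation points, and the grid size are arbitrary choices of mine). It tracks the truncated integral $\int_1^T (\log(1+x))^{-b}\, dx$ for growing $T$; restricting to $[1, T]$ sidesteps the behavior near the origin, and the tail alone already diverges.

```python
# Numerical evidence that requirement (1) fails for L(x) = log(log(1 + |x|)):
# exp(-b * L(x)) simplifies to (log(1 + |x|))**(-b), and the truncated
# integrals over [1, T] keep growing as T increases instead of converging.
import numpy as np
from scipy.integrate import trapezoid

b = 1.0  # any b > 0 shows the same qualitative behavior

def integrand(x):
    return np.log1p(x) ** (-b)

for T in [1e2, 1e4, 1e6, 1e8]:
    x = np.logspace(0.0, np.log10(T), 200_001)  # log-spaced grid on [1, T]
    partial = trapezoid(integrand(x), x)
    print(f"T = {T:.0e}: integral over [1, T] ~ {partial:,.0f}")

# The partial integrals grow roughly like T / log(T) and never settle,
# consistent with the comparison argument above.
```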

Wilbur
  • "A loss function $L: X\rightarrow\mathbb R^+$ can be motivated as an MLE provided that $\int_X \exp(-bL(x))dx$ converges for some $b>0$." Why? Do you have a proof or a reference? – Dave May 27 '23 at 18:29
  • The second part of the answer is a proof of that statement -- will edit to clarify – Wilbur May 28 '23 at 02:34
  • I need to go through this in more detail than I can right now in order to accept this answer, but you took the time to attempt to answer and edited in response to my follow-up. +50 – Dave May 30 '23 at 19:07