I have a probabilistic machine learning model that models the data distribution with a multivariate t distribution. This distribution can have a density greater than 1, so when I compute the log-likelihood of the data, I sometimes get positive values (meaning the likelihood was greater than 1).

What do researchers typically do in this scenario? Just report the positive log likelihood? Or do they truncate or try to change it in some other way so LL is in (-infinity, 0]?

Addison
  • Welcome to Cross Validated! I see no problem with reporting the log-likelihood as the logarithm of the likelihood. What problem do you see? – Dave Jul 07 '22 at 18:48
  • It's just that I've never seen papers report positive LL values, so I wasn't sure if it'd be seen as a mistake, since it's a fairly subtle difference between probability and density (i.e. people may think 'Oh, shouldn't LL be negative, since probabilities have to be between 0 and 1?') – Addison Jul 07 '22 at 19:06
  • Likelihoods are used so often for comparison that people feel free to omit multiplicative factors that are unnecessary to compute. Thus, the sign of a log likelihood is almost always meaningless. Please see our posts on likelihoods. One consequence is that if you choose to report a likelihood (or its log), you must describe precisely how it is computed. Otherwise, it will be meaningless for all your readers! – whuber Jul 07 '22 at 19:27
  • The typical range of log likelihood values comes from the typical convenience of scaling the likelihood function to have a maximum likelihood of 1, and thus a maximum possible log likelihood of 0. That scaling is acceptable (and convenient) because likelihoods are almost always used as the ratio of two likelihoods that lie on the same likelihood function. The absolute value of likelihood is almost never of any consequence. – Michael Lew Jul 07 '22 at 21:14
  • @Michael I like that idea of scaling, but I have never seen anyone do it. I do not challenge that it occurs, but I wonder how "typical" it is. Perhaps you are thinking of analyses and plots of likelihood ratios? – whuber Jul 08 '22 at 16:21
  • @whuber Richard Royall uses it throughout his book Statistical Evidence: A Likelihood Paradigm, as does AWF Edwards in his book Likelihood, and Yudi Pawitan in the book In All Likelihood: Statistical Modelling and Inference Using Likelihood. RA Fisher used a percent scale for likelihood functions, which amounts to much the same thing, in his book Statistical Methods and Scientific Inference. – Michael Lew Jul 10 '22 at 08:30

1 Answer

There is no need to do anything special! Your confusion might be the same as the one behind the question "Can a probability distribution value exceeding 1 be OK?", so read that. There is also additional useful information in the comments ...
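
To see concretely how a positive log-likelihood arises, here is a minimal sketch using only the Python standard library. It assumes a multivariate t distribution with a diagonal scale matrix; the helper `mvt_logpdf_diag`, the data points, and the parameter values are all made up for illustration and are not taken from the question. With a small scale the density near the mode is well above 1, so the summed log density is positive, while the same data under a wide scale gives a negative value.

```python
import math

def mvt_logpdf_diag(x, loc, scale, df):
    """Log density of a multivariate t with diagonal scale matrix diag(scale_i^2).

    Hypothetical helper for illustration only; a full scale matrix would need
    a proper determinant and matrix inverse instead of the per-axis terms.
    """
    p = len(x)
    # Squared Mahalanobis distance under the diagonal scale matrix.
    delta = sum(((xi - mi) / si) ** 2 for xi, mi, si in zip(x, loc, scale))
    # log-determinant of diag(scale_i^2).
    log_det = sum(2.0 * math.log(si) for si in scale)
    return (
        math.lgamma((df + p) / 2.0)
        - math.lgamma(df / 2.0)
        - 0.5 * p * math.log(df * math.pi)
        - 0.5 * log_det
        - 0.5 * (df + p) * math.log1p(delta / df)
    )

# Made-up 2-D observations clustered tightly around the origin.
data = [(0.05, -0.02), (-0.10, 0.08), (0.00, 0.03)]
loc = (0.0, 0.0)
df = 5.0

# Same data, two candidate scale settings.
ll_tight = sum(mvt_logpdf_diag(x, loc, (0.1, 0.1), df) for x in data)
ll_wide = sum(mvt_logpdf_diag(x, loc, (1.0, 1.0), df) for x in data)

print("log-likelihood, scale 0.1:", ll_tight)  # positive: density exceeds 1
print("log-likelihood, scale 1.0:", ll_wide)   # negative
```

Both numbers are legitimate log-likelihoods. What matters for comparing the two settings is their difference, which is why the sign by itself carries no meaning.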