
I would like a clear understanding of what exactly the likelihood function in Bayes' theorem is, and why it isn't considered a probability, as well as the distinction between the likelihood for discrete and continuous distributions.

Bayes' Theorem:

$$ P(m|D) = \frac{P(D|m)\,P(m)}{P(D)} $$

where $m$ denotes the parameters of the model and $D$ the observed data set.

My understanding of the likelihood is that it is simply a function of $m$ where the data set is already given, i.e.:

$$L(m|D) = P(D|m)$$

I understand why the likelihood function is not a PDF, since integrating it does not necessarily equal one.

I don't understand why it is not considered a probability. After all, we still get a probability once we choose parameters for the model. Is the term only meant to mark that we are treating the function as varying in $m$ (with the data set known)? In other words, is it the fact that we don't know the parameters of the model that makes $P(D|m)$ a likelihood, since it is now a function of those parameters, so that plugging in particular parameter values gives us a quantified probability for $P(D|m)$?

Likelihood in continuous vs. discrete distributions

It seems to me that likelihood is used differently for discrete and continuous distributions.

For example, for the Bernoulli distribution (a discrete distribution) the likelihood function is given by:

$$ L(p | x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} $$

The intuition makes sense, since we are multiplying the probabilities of the observations given the unknown parameter $p$.
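To make this concrete, here is a minimal numeric sketch (the flips in `x` are made up for illustration): for any fixed $p$, $L(p|x)$ is a genuine probability of observing that exact sequence, yet integrating $L$ over $p$ gives a value far from one.

```python
# Minimal sketch: the Bernoulli likelihood evaluated at a fixed p is a
# probability of the observed sequence, but L is not a density in p.
import numpy as np

x = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])  # hypothetical coin flips (7 heads)

def bernoulli_likelihood(p):
    # product over observations of p^x_i * (1-p)^(1-x_i)
    return np.prod(p ** x * (1 - p) ** (1 - x))

for p in (0.3, 0.5, 0.7):
    print(f"L({p} | x) = {bernoulli_likelihood(p):.6g}")  # a probability for each fixed p

# Integrate L over p in [0, 1] via a Riemann sum on a uniform grid:
# the result is B(8, 4) ≈ 0.00076, nowhere near 1.
grid = np.linspace(0, 1, 10_001)
vals = np.array([bernoulli_likelihood(p) for p in grid])
print("integral of L over p ≈", vals.mean())
```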

The confusion arises when we consider the likelihood of a continuous Gaussian distribution, which is:

$$ L(\mu,\sigma^2 | x_1,\ldots,x_n) = (2\pi\sigma^2)^{-\frac{n}{2}}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^n(x_i-\mu)^2\right)$$

Basically, this plugs each observation into the Gaussian PDF and multiplies the resulting values together. However,

$$ P(x_i | \mu,\sigma^2) = 0 $$

for a continuous distribution, so why aren't we integrating in the likelihood function?
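To see numerically that this product of density values cannot be a probability, here is a minimal sketch with made-up observations: for a small enough $\sigma^2$ the likelihood exceeds one, which no probability can do.

```python
# Minimal sketch: the Gaussian likelihood is a product of density values,
# and density values (unlike probabilities) can exceed 1.
import numpy as np

x = np.array([0.1, -0.05, 0.02, 0.08, -0.03])  # hypothetical observations

def gaussian_likelihood(mu, sigma2):
    n = len(x)
    return (2 * np.pi * sigma2) ** (-n / 2) * np.exp(
        -np.sum((x - mu) ** 2) / (2 * sigma2)
    )

# With these tightly clustered points and sigma^2 = 0.01 the likelihood
# is about 368 -- far above 1, so it cannot be a probability.
print(gaussian_likelihood(mu=0.0, sigma2=0.01))
```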

My presumption is that the lack of integration doesn't matter in practice, since we usually use the likelihood only to find the maximum of $P(m|D)$. But we are still left with $P(D|m)$, which should be a quantifiable probability.
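As a sketch of that use (reusing the hypothetical flips from above), maximizing the Bernoulli likelihood over a grid recovers the sample mean; note that any constant factor in $L$ would leave this maximizer unchanged.

```python
# Minimal sketch: the maximizer of the Bernoulli likelihood is the
# sample mean; constant factors in L do not move the argmax.
import numpy as np

x = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])
grid = np.linspace(0.001, 0.999, 999)
L = np.array([np.prod(p ** x * (1 - p) ** (1 - x)) for p in grid])
print("argmax of L:", grid[np.argmax(L)], "  sample mean:", x.mean())  # both ≈ 0.7
```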

Edit

This is not a duplicate of What is the difference between "likelihood" and "probability"?.

The accepted answer to that question raises precisely my question. It states: "In the continuous case the situation is similar with one important difference. We can no longer talk about the probability that we observed O given θ because in the continuous case P(O|θ)=0."

Why do we no longer talk about probability in the continuous case, yet we do in the discrete case?

  • You say "I understand why the likelihood function is not a PDF, since integrating it does not necessarily equal one" and "I don't understand why it is not considered a probability". If $L(m \mid D)$ were a probability for $m$ then you would have $\sum\limits_m L(m \mid D) =1$, but you usually do not. If $L(m \mid D)$ were a probability density for $m$ then you would have $\int\limits_m L(m \mid D) =1$, but you usually do not. – Henry Nov 07 '22 at 01:51
  • In fact your measure of likelihood is in any case only proportional to what you are interested in: turning your Bernoulli evidence into Binomial evidence would make it ${n\choose \sum x_i} p^{\sum x_i}(1-p)^{n- \sum x_i}$, which is ${n\choose \sum x_i}$ times your product, even though the information about $p$ from the observations has not changed. This would make no sense if you were talking about a probability. – Henry Nov 07 '22 at 01:59
  • In the continuous case, $P(D|m)$ as a function of $D$ is a density, not a probability, and since it depends on the dominating measure, it is defined only up to a function of $D$. – Xi'an Nov 07 '22 at 07:32

0 Answers