
Sometimes I have seen likelihood written as $L(\mu,\sigma |y)$ and sometimes as $L(y|\mu,\sigma)$.

I have been told that in the first case it means there is a pre-assumed model depicting the probability density of $\mu$ and $\sigma$, parameterized by $y$. In the second case it means the pre-assumed model depicts the probability density of $y$, parameterized by $\mu$ and $\sigma$.

I have seen the $|$ symbol used before in Bayes' theorem and read it as "given that".

I am from a programming background (C#) and tend to think of parameters as inputs to a function.

How should I be thinking about the term "parametrized by" in statistics?

Kirsten

2 Answers


“Parametrized by” means that the function $f$ of $x$ has additional parameters $\Theta$, so we can write it as $f(x;\Theta)$. In such a case, we evaluate the function over $x$ for a fixed value of $\Theta$.
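To connect this with the question's programming intuition, here is a minimal Python sketch (the normal density and the name `normal_pdf` are my own illustration, not part of the answer): the same two-argument function can be read either as a function of $x$ with $\Theta=(\mu,\sigma)$ held fixed, or as a function of the parameters with $x$ held fixed.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution: f(x; mu, sigma)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Read as a function of x, parametrized by a fixed (mu, sigma) = (0, 1):
density_at_points = [normal_pdf(x, mu=0.0, sigma=1.0) for x in (-1.0, 0.0, 1.0)]

# Read the other way round: the observation x = 0.5 is held fixed and mu varies.
values_over_mu = [normal_pdf(0.5, mu=m, sigma=1.0) for m in (-1.0, 0.0, 1.0)]

print(density_at_points)
print(values_over_mu)
```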

Likelihood has two meanings: the traditional one and the Bayesian one. Traditionally, likelihood is written as

$$ L(\theta|x) \propto P(x|\theta) $$

The vertical bar $|$ on the right-hand side denotes conditional probability; on the left-hand side it is a slight abuse of notation. People write $L(\theta|x)$ to show that the data $x$ are kept fixed while the function is evaluated at different parameters $\theta$. In a Bayesian setting you usually would not see the left-hand-side notation; the right-hand side alone is called the likelihood (here you might have seen $L$ used instead of $P$). See the question "Wikipedia entry on likelihood seems ambiguous" for more discussion.
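A small sketch of the left-hand-side reading, using a coin-flip (Bernoulli) model assumed purely for illustration: the observed counts are held fixed while the same expression is evaluated at different values of the parameter $p$, and multiplying it by any constant that does not involve $p$ does not change how candidate values compare.

```python
def likelihood(p, heads, tails):
    """L(p | data) for independent coin flips, up to a constant not involving p."""
    return p ** heads * (1.0 - p) ** tails

heads, tails = 7, 3              # the data: held fixed
for p in (0.3, 0.5, 0.7):        # the parameter: varied
    print(p, likelihood(p, heads, tails))
```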

Tim
  • I suspect many Bayesians are happy with $L(\theta \mid x)$ though some would say $L(\theta \mid x) \propto P(x \mid \theta)$. A key point of such a likelihood is that it is function of $\theta$ but does not need to sum or integrate to $1$ over $\theta$. – Henry Oct 10 '22 at 19:22
  • Hm. That is right, Tim. I have never encountered likelihood written in that form. – User1865345 Oct 10 '22 at 19:23
  • @Henry the multiplicative constant. That sums it up. – User1865345 Oct 10 '22 at 19:23
  • @User1865345 If I flip a biased coin with parameter $p$ of heads and see $H,T,H,T,H,H$ then the likelihood for a particular value of $p$ is the same as if I record $4$ heads and $2$ tails in total. It does not matter whether I write this as $p^4(1-p)^2$ or $15 p^4(1-p)^2$ – Henry Oct 10 '22 at 19:35
  • Exactly @Henry. I agreed with your initial as well this comment. Nothing of any disagreement here. I just mentioned the multiplicative constant which doesn't matter. – User1865345 Oct 10 '22 at 19:37

What is a parameter? What is a parametric model?

Definition $1.$ Let $(\Omega,\mathfrak F)$ be a measurable space. The set of probability measures $\{ \mathbb P_\theta:~{\boldsymbol \theta\in\Theta}\}$ indexed by a parameter $\boldsymbol\theta$ is a parametric family if and only if $\Theta\subseteq\mathbb R^n$ for some $n\in\mathbb Z^{>0}$ and each probability measure $\mathbb P_\theta$ is known once $\boldsymbol\theta$ is known. Here $\Theta$ is the parameter space.

Remark $1.$ A parametric model assumes that the distribution of the population belongs to a parametric family.

Example $1.$ The parametric family of $n$-dimensional normal distributions ($n\in\mathbb Z^{>0}$), indexed by $(\boldsymbol\mu, \mathbf\Sigma)$, is

$$\{\mathcal N_n(\boldsymbol \mu,\mathbf \Sigma): \boldsymbol\mu\in\mathbb R^n,\mathbf \Sigma\in\mathcal M_n\},\tag 1$$

where $\mathcal M_n$ is the collection of $n\times n$ symmetric positive definite matrices.

So, basically, the probability measure is labeled or indexed by the parameter(s), and the primary objective of inference is to learn about the parameter $\boldsymbol\theta$ in order to pin down the probability measure. (If the family is dominated by a $\sigma$-finite measure, each member is identified by its density, so inferring the parameter amounts to learning the associated pdf.)

$\bullet$ $\mathcal L(\boldsymbol\theta|\mathbf y)$ denotes the likelihood of the parameter $\boldsymbol\theta$ given that $\mathbf Y= \mathbf y$ has been realised or observed.
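As a concrete instance of this notation (a worked example under an i.i.d. normal model, which is an assumption for illustration rather than part of the definition above), for observations $\mathbf y=(y_1,\dots,y_m)$ from $\mathcal N(\mu,\sigma^2)$,

$$\mathcal L(\mu,\sigma\mid\mathbf y)=\prod_{i=1}^{m}\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(y_i-\mu)^2}{2\sigma^2}\right),$$

read as a function of $(\mu,\sigma)$ with $\mathbf y$ held fixed.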


References and further reading:

$\rm [I]$ Mathematical Statistics, Jun Shao, Springer Science$+$Business Media, $2003, $ section $2.1.2, $ p. $94.$

$\rm [II]$ Testing Statistical Hypotheses, E. L. Lehmann, Joseph P. Romano, Springer Science$+$Business Media, $2005, $ section $1.1, $ p. $3.$

$\rm [III]$ Statistical Inference, George Casella, Roger L. Berger, Wadsworth, $2002,$ section $6.3, $ p. $290.$

User1865345