2

Let $\theta = (\theta_1, \theta_2, \dots)$ be a vector of parameters, and let $L(\theta) = L(\theta|y)$ denote the likelihood function with respect to observed data $y$.

Following the notation from here, suppose $\delta = \theta_j$ is the parameter of interest, and let $\xi$ denote the other parameters, i.e. $\xi = \theta \setminus \delta$. Then, the profile likelihood $L_p(\delta)$ takes the following expression:

$L_p(\delta) = \max_\xi L(\delta, \xi|y))$

That is, for each value of $\delta$, this is the maximum of the likelihood function $L(\theta)$ when $\theta_j$ is fixed at $\delta$.

In most examples I have seen, e.g. Figure 1 from here, reproduced below, the profile likelihood is depicted as a concave function, with a global maximum.

enter image description here

Does this hold in general, even when $\delta$ is a vector of parameters?

Alex
  • 4,382
  • 4
  • 34
  • 57

1 Answers1

0

As a profile likelihood is a very general concept, there can be no reason why this should be true in general. Maybe something can be proved within limited model families?

For a non-convex example on this site, see the answer at https://stats.stackexchange.com/a/510726/11887