In discussing Gaussian mixture models (GMMs), the blog post https://normaldeviate.wordpress.com/2012/08/04/mixture-models-the-twilight-zone-of-statistics/ highlights the following issue:
Multimodality of the Likelihood. In fact, the likelihood has many modes. Finding the global (but not infinite) mode is a nightmare. The EM [expectation-maximization] algorithm only finds local modes. In a sense, the MLE is not really a well-defined estimator because we can’t really find it.
Is the multimodality of the likelihood "fixed" (does the likelihood become uni-modal) when class labels $Y$ are partially observed?
By "class labels $Y$", I mean $Y \in \left\{0, 1, 2, \ldots, k-1\right\}$ indicating the component from which the observation was drawn, where the number of components $k$ is known ahead of time. I'm assuming a data generating process in which $Y$ is missing completely at random -- for example, we might observe $Y$ for a random 5% of our data, with $Y$ missing for the remaining 95%, and missingness occurring independently of $X$.
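To make the assumed data-generating process concrete, here is a minimal sketch in plain Python. It uses a one-dimensional mixture for brevity, and the function name `simulate_gmm_mcar`, the component parameters, and the sample size are all illustrative choices of mine, not part of any particular dataset.

```python
import random

def simulate_gmm_mcar(n, means, sds, weights, p_missing, seed=0):
    """Draw (x, y) from a 1-D Gaussian mixture, then hide each label
    independently with probability p_missing (MCAR)."""
    rng = random.Random(seed)
    k = len(means)
    xs, ys, ys_obs = [], [], []
    for _ in range(n):
        y = rng.choices(range(k), weights=weights)[0]  # latent component
        x = rng.gauss(means[y], sds[y])                # observation
        # The missingness coin flip looks at neither x nor y.
        observed = rng.random() >= p_missing
        xs.append(x)
        ys.append(y)
        ys_obs.append(y if observed else None)
    return xs, ys, ys_obs

# Hypothetical example: k = 3 components, labels missing for about 95% of points.
xs, ys, ys_obs = simulate_gmm_mcar(
    n=2000,
    means=[-2.0, 0.0, 3.0],
    sds=[1.0, 0.5, 1.0],
    weights=[0.3, 0.3, 0.4],
    p_missing=0.95,
)
n_labeled = sum(y is not None for y in ys_obs)
print("labeled points:", n_labeled, "of", len(xs))
```

Here `ys` holds the full labels and `ys_obs` holds the partially observed ones, with `None` marking a missing label.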
Here is an image illustrating what I have in mind:
The labels $Y$ for all points are shown below:
If we ran the expectation-maximization (EM) algorithm on the dataset in which $Y$ is partially observed, would we find that the likelihood is unimodal? If so, how many values of $Y$ need to be observed for that to be true? Would observing at least one $Y$ from each of the $k$ classes/components be sufficient? Would it be enough to observe at least one $Y$ for only $k-1$ of the components?
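One way to probe this empirically is to run EM from several random starting points and compare the final log-likelihoods. The sketch below is a minimal, assumption-laden version: it is one-dimensional, it clamps the responsibilities of labeled points to their observed component, and it floors each standard deviation at `min_sd` to avoid the degenerate infinite-likelihood modes. The function names, the simulated data, and the floor value are my own illustrative choices.

```python
import math
import random

def log_normal_pdf(x, mu, sd):
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mu) / sd) ** 2

def logsumexp(vals):
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))

def log_likelihood(xs, ys_obs, pis, mus, sds):
    """Observed-data log-likelihood: labeled points contribute one term,
    unlabeled points contribute the log of the mixture density."""
    total = 0.0
    for x, y in zip(xs, ys_obs):
        terms = [math.log(pis[j]) + log_normal_pdf(x, mus[j], sds[j])
                 for j in range(len(pis))]
        total += terms[y] if y is not None else logsumexp(terms)
    return total

def em_partial_labels(xs, ys_obs, k, rng, n_iter=100, min_sd=1e-3):
    mus = rng.sample(xs, k)  # random data points as initial means
    sds = [1.0] * k
    pis = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: one-hot responsibilities where the label is observed,
        # posterior probabilities where it is missing.
        resp = []
        for x, y in zip(xs, ys_obs):
            if y is not None:
                resp.append([1.0 if j == y else 0.0 for j in range(k)])
            else:
                logs = [math.log(pis[j]) + log_normal_pdf(x, mus[j], sds[j])
                        for j in range(k)]
                norm = logsumexp(logs)
                resp.append([math.exp(v - norm) for v in logs])
        # M-step: responsibility-weighted proportions, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
            pis[j] = nj / len(xs)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sds[j] = max(math.sqrt(var), min_sd)
    return pis, mus, sds, log_likelihood(xs, ys_obs, pis, mus, sds)

# Hypothetical data: k = 3 components, about 5% of labels observed.
rng = random.Random(0)
true_mus, true_sds = [-2.0, 0.0, 3.0], [1.0, 0.5, 1.0]
xs, ys_obs = [], []
for _ in range(300):
    y = rng.randrange(3)
    xs.append(rng.gauss(true_mus[y], true_sds[y]))
    ys_obs.append(y if rng.random() < 0.05 else None)

# Five random restarts; compare where each one ends up.
final_lls = [em_partial_labels(xs, ys_obs, 3, rng)[3] for _ in range(5)]
print(sorted(round(ll, 2) for ll in final_lls))
```

If the restarts end at clearly different log-likelihood values, the likelihood has more than one local mode for that dataset. If they all agree, that is consistent with unimodality but does not prove it, since a handful of restarts can easily miss other modes.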
Edit: when $Y$ is fully observed (as in the second graph, the one immediately above), we could estimate the mean and variance-covariance matrix for each of the $k$ components separately, and in this case the likelihood has a single maximum and a relatively simple closed-form solution (see Maximum Likelihood Estimators - Multivariate Gaussian). So my question is essentially about what happens to the likelihood, and in particular when it becomes multimodal, as the probability $p$ of $Y$ being missing/unobserved increases from $p=0$ to $p \in \left(0, 1\right)$ to $p=1$.
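To state the object in question explicitly (the notation here is my own): let $L$ be the set of indices with $Y$ observed and $U$ the set with $Y$ missing, and let $\theta = \left\{\pi_j, \mu_j, \Sigma_j\right\}_{j=0}^{k-1}$ collect the mixing proportions, means, and variance-covariance matrices. Because $Y$ is missing completely at random, the missingness mechanism can be ignored, and the observed-data log-likelihood is

$$\ell(\theta) = \sum_{i \in L} \log\left[\pi_{y_i}\, \mathcal{N}\left(x_i \mid \mu_{y_i}, \Sigma_{y_i}\right)\right] + \sum_{i \in U} \log\left[\sum_{j=0}^{k-1} \pi_j\, \mathcal{N}\left(x_i \mid \mu_j, \Sigma_j\right)\right].$$

At $p=0$ the set $U$ is empty and only the first sum remains. With $n_j$ denoting the number of points labeled $j$ out of $n$ in total, it is maximized in closed form by

$$\hat{\pi}_j = \frac{n_j}{n}, \qquad \hat{\mu}_j = \frac{1}{n_j} \sum_{i:\, y_i = j} x_i, \qquad \hat{\Sigma}_j = \frac{1}{n_j} \sum_{i:\, y_i = j} \left(x_i - \hat{\mu}_j\right)\left(x_i - \hat{\mu}_j\right)^\top,$$

provided each component has enough labeled points for $\hat{\Sigma}_j$ to be nonsingular. At $p=1$ the set $L$ is empty and only the second sum remains, which is the usual unlabeled mixture likelihood. The question concerns the intermediate case, where both sums are present.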

