You are correct: latent variables are variables that appear in your model but are not directly observed; that is essentially the dictionary definition of the term. They show up in many places, but let me explain how I think about them.
Note that my definition is probably too general. The way I see latent variables appear most frequently is as nuisance variables: quantities one has to estimate from the data even though one does not actually want them. I believe that is how they appear in maximum likelihood estimation: you have to integrate them out (e.g. through sampling) because they are nuisance parameters.
Noisy measurement
Let's say you have a simple model $y = x + \epsilon$, with $\epsilon \sim N(0, \sigma^2)$, where $x$ is the true value of some property and $y$ a noisy measurement of it. If you build a Bayesian model by adding a prior density for likely values of $x$, say something flat like $p_0(x) = N(0, 10^2)$, you could say that $x$ is a latent variable: it is never observed directly; only noisy measurements of it are available.
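To make this concrete, here is a minimal sketch of inferring the latent $x$ from repeated noisy measurements via the conjugate Gaussian update; the specific values ($\sigma = 1$, the true $x$, the number of measurements) are my own illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: the latent x is never observed directly,
# only n noisy measurements y of it.
sigma = 1.0    # measurement noise std dev (assumed known)
tau = 10.0     # prior std dev: p_0(x) = N(0, 10^2)
x_true = 2.5   # the latent value (hidden from the analyst)
n = 50
y = x_true + rng.normal(0.0, sigma, size=n)

# Conjugate Gaussian update: posterior density over the latent x given y.
# Precisions add; the posterior mean is a precision-weighted average
# of the prior mean (0) and the data.
post_prec = 1.0 / tau**2 + n / sigma**2
post_var = 1.0 / post_prec
post_mean = post_var * (y.sum() / sigma**2)

print(post_mean, np.sqrt(post_var))
```

With a flat prior like this, the posterior mean is close to the sample mean of the measurements, and the posterior variance shrinks roughly like $\sigma^2/n$.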
Like I said above, latent variables are typically nuisance parameters. Let me give you an example involving hidden Markov models (HMMs). In an HMM, you posit a quantity $x_i$ that evolves via a Markov process with transition density $f(x_i \mid x_{i-1}, \theta)$, so that the density of the chain is
$$
p(x \mid \theta) = p_0(x_0) \prod_{i=1}^n f(x_i \mid x_{i-1}, \theta).
$$
One is interested in the model parameters $\theta$, but unfortunately one does not observe the $x_i$; instead one observes a related quantity $y_i$, linked to $x_i$ through a conditional density $g(y_i \mid x_i)$, which could describe simple Gaussian noise as above. The joint density of states and observations is then
$$
p(x, y \mid \theta) = p_0(x_0) \prod_{i=1}^n f(x_i \mid x_{i-1}, \theta) \, g(y_i \mid x_i).
$$
According to my personal definition, both $\theta$ and the $x_i$ are latent variables, since neither is observed. Most people, however, would reserve the term for the $x_i$: they form the latent (hidden) chain that must be estimated from noisy observations, even though we don't really care about their values.
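When the hidden chain takes finitely many values, you can integrate the nuisance states out exactly with the forward algorithm: it sums the latent $x_i$ out of the joint density to give the marginal likelihood $p(y \mid \theta)$, which is what you would maximize over $\theta$. Here is a sketch; the two-state setup, Gaussian emissions at every step (including $y_0$), and all parameter values are illustrative assumptions:

```python
import numpy as np

def gaussian_pdf(y, mu, sigma):
    # Density of N(mu, sigma^2) evaluated at y
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def hmm_loglik(y, pi0, A, mus, sigma):
    """Forward algorithm: sums the hidden states out of p(x, y | theta),
    returning log p(y | theta). Rescaled each step for numerical stability."""
    alpha = pi0 * gaussian_pdf(y[0], mus, sigma)  # alpha_0[k] = p(x_0 = k) g(y_0 | k)
    loglik = 0.0
    for t in range(1, len(y)):
        c = alpha.sum()
        loglik += np.log(c)
        # Propagate through the transition matrix, then weight by the emission
        alpha = (alpha / c) @ A * gaussian_pdf(y[t], mus, sigma)
    return loglik + np.log(alpha.sum())

# Illustrative two-state chain with Gaussian measurement noise
pi0 = np.array([0.5, 0.5])               # p_0(x_0)
A = np.array([[0.9, 0.1], [0.2, 0.8]])   # transitions f(x_i | x_{i-1})
mus = np.array([-1.0, 1.0])              # emissions g(y_i | x_i) = N(mu_{x_i}, sigma^2)
print(hmm_loglik(np.array([-0.9, 1.1, 0.8]), pi0, A, mus, 0.5))
```

The cost is linear in the chain length, whereas naively summing over all state sequences is exponential; for continuous states the analogous exact computation is the Kalman filter (linear-Gaussian case), and otherwise one resorts to sampling.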
Other examples
You can go much further than these examples. In probit regression, for instance, the binary outcome can be written as the indicator of an unobserved Gaussian variable crossing zero, and that underlying Gaussian is a latent variable; this construction is what makes Gibbs sampling for probit models convenient. Mixture models also typically involve latent variables: each data point carries a discrete latent variable indicating which cluster (mixture component) generated it. This assignment might be of interest in itself if one wants to do classification, but it is typically a nuisance because we wish to learn the parameters of the clusters instead.
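As a sketch of the mixture case, here are a few EM iterations for a two-component Gaussian mixture: the E-step computes the posterior over each point's latent cluster assignment, and the M-step averages over those soft assignments rather than ever observing them. The simulated data, initialization, and fixed unit variances are all my own simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data from two clusters; the cluster label of each point is
# the discrete latent variable z_i, which we never observe.
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

# EM for a 2-component Gaussian mixture (unit variances fixed, for brevity)
w = np.array([0.5, 0.5])    # mixture weights (initial guess)
mu = np.array([-1.0, 1.0])  # component means (initial guess)
for _ in range(50):
    # E-step: responsibilities = posterior probability of each z_i = k
    dens = np.exp(-0.5 * (x[:, None] - mu[None, :]) ** 2)  # unnormalized N(mu_k, 1)
    resp = w * dens
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: the latent assignments are integrated out via the responsibilities
    w = resp.mean(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)

print(w, mu)
```

The estimated means converge near the cluster centers even though no $z_i$ was ever observed; if classification were the goal, the final responsibilities themselves would be the object of interest instead of $w$ and $\mu$.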