
For a one-dimensional Gaussian, the variance $\sigma^2$ denotes the expected value of the squared deviation from the mean $\mu$.

I am trying to understand why, in the multivariate case of modeling a variable $\mathbf{x}$, we end up with a covariance matrix $\Sigma$ (and its inverse $\Sigma^{-1}$ in the density). Why not instead use a vector that, for each dimension, holds the variance of the corresponding component of $\mathbf{x}$?

From Wikipedia the 2d Gaussian function is represented as:

$f(x,y) = A \exp\left(- \left(\frac{(x-x_o)^2}{2\sigma_X^2} + \frac{(y-y_o)^2}{2\sigma_Y^2} \right)\right)$

Why not use a form like that for the multivariate Gaussian, with $\boldsymbol{\sigma} = [\sigma_X \ \sigma_Y]^{T}$, given that my vector is $\mathbf{x} = [x \ y]^{T}$?
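For reference, the general multivariate normal density (the standard textbook form the question is asking about) is

$$ f(\mathbf{x}) = \frac{1}{(2\pi)^{k/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right), $$

and when $\Sigma = \mathrm{diag}(\sigma_X^2, \sigma_Y^2)$, so that $\Sigma^{-1} = \mathrm{diag}(1/\sigma_X^2, 1/\sigma_Y^2)$, the quadratic form expands to $\left(\frac{x-\mu_X}{\sigma_X}\right)^2 + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2$, which recovers exactly the component-wise sum in the 2-D formula above.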

How is this interpreted in the following example?

[figure from the original question: image not reproduced]

Jose Ramon

1 Answer


The probability density function of the bivariate Gaussian is

$$ f(x,y) = \frac{1}{2 \pi \sigma_X \sigma_Y \sqrt{1-\rho^2}} \mathrm{e}^{ -\frac{1}{2(1-\rho^2)}\left[ \left(\frac{x-\mu_X}{\sigma_X}\right)^2 - 2\rho\left(\frac{x-\mu_X}{\sigma_X}\right)\left(\frac{y-\mu_Y}{\sigma_Y}\right) + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2 \right] } $$

Notice that, apart from $\mu_X,\mu_Y$ and $\sigma_X,\sigma_Y$, it has the parameter $\rho$ for the correlation between the $X$ and $Y$ variables. If they are uncorrelated, i.e. $\rho=0$, the pdf reduces to the form you described.
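To make the reduction explicit: setting $\rho = 0$ in the density above, the cross term vanishes and the exponent separates, so the joint density factors into two independent one-dimensional Gaussians:

$$ f(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y} \mathrm{e}^{-\frac{1}{2}\left[\left(\frac{x-\mu_X}{\sigma_X}\right)^2 + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right]} = \frac{1}{\sqrt{2\pi}\,\sigma_X}\mathrm{e}^{-\frac{(x-\mu_X)^2}{2\sigma_X^2}} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_Y}\mathrm{e}^{-\frac{(y-\mu_Y)^2}{2\sigma_Y^2}}. $$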

The same applies to the general multivariate normal: you could use a diagonal covariance matrix, i.e. all off-diagonal entries zero and the variances $\sigma_i^2$ on the diagonal. In such a case, the individual variables are modeled as uncorrelated.
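A quick numerical sketch of this point (parameter values are made up for illustration): with a diagonal covariance, i.e. $\rho = 0$, the joint density equals the product of the per-dimension 1-D densities.

```python
import math

def gauss_1d(x, mu, sigma):
    """Univariate normal density N(mu, sigma^2) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def gauss_2d_diag(x, y, mu_x, mu_y, sx, sy):
    """Bivariate normal density with rho = 0, i.e. diagonal covariance
    Sigma = diag(sx^2, sy^2)."""
    norm_const = 1.0 / (2 * math.pi * sx * sy)
    quad = ((x - mu_x) / sx) ** 2 + ((y - mu_y) / sy) ** 2
    return norm_const * math.exp(-0.5 * quad)

# Joint density at an arbitrary point...
joint = gauss_2d_diag(1.3, -1.1, 1.0, -2.0, 0.5, 2.0)
# ...equals the product of the two marginal 1-D densities.
product = gauss_1d(1.3, 1.0, 0.5) * gauss_1d(-1.1, -2.0, 2.0)

print(math.isclose(joint, product))  # True
```

With a non-zero correlation this factorization fails, which is exactly why a full matrix $\Sigma$, rather than a vector of variances, is needed in general.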

Tim