
Let $x_1,\dots,x_n$ be i.i.d. observations from $N_p(0,\Sigma)$, and let $\hat S=\frac1n\sum_{i=1}^n x_ix_i^T$ be the sample covariance matrix. Recall that the squared Mahalanobis distance is defined as $d(x_i)^2=x_i^T\hat S^{-1}x_i$.

A previous question considered the distribution of $d(\bar{x})$.

What can we say about the distribution of $d(x_i)$? Has this been studied? If so, a reference would be appreciated. I am not sure whether it can be connected to the previous question.
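For concreteness, the statistic can be computed as follows (a minimal numpy sketch; the values of $n$ and $p$ are arbitrary, and $\Sigma=I$ is taken for simplicity):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3                      # arbitrary illustration values
X = rng.standard_normal((n, p))   # rows x_i ~ N_p(0, I); Sigma = I for simplicity

S_hat = X.T @ X / n               # hat S = (1/n) * sum_i x_i x_i^T
d2 = np.einsum('ij,ij->i', X @ np.linalg.inv(S_hat), X)   # d(x_i)^2 for each i
```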

  • Hint: Notice that you can choose $\Sigma = I$ without loss of generality. – cardinal Jun 15 '15 at 20:48
  • That is not quite correct: your definition of the squared MD is a bit off; it should be $(x-\hat{\mu})^\top\hat{S}^{-1}(x-\hat{\mu})$. – user603 Jun 19 '15 at 20:36

2 Answers


If $\{\pmb x_i\}_{i=1}^n$ is your data with $\pmb x_i\underset{\text{i.i.d.}}{\sim}\mathcal{N}_p(\pmb \mu,\pmb \varSigma)$, where $\pmb \mu\in\mathbb{R}^p$ and $\pmb \varSigma\succ0$, and we denote by $\mbox{ave}\;\pmb x_i$ and $\mbox{cov}\;\pmb x_i$ the usual Gaussian estimates of the mean and covariance, then the squared Mahalanobis distance $$d^2(\pmb x_i,\mbox{ave}\;\pmb x_i,\mbox{cov}\;\pmb x_i)=(\pmb x_i-\mbox{ave}\;\pmb x_i)^\top(\mbox{cov}\;\pmb x_i)^{-1}(\pmb x_i-\mbox{ave}\;\pmb x_i)$$

has the distribution [0, p. 113][1, p. 562]:

$$d^2(\pmb x_i,\mbox{ave}\;\pmb x_i,\mbox{cov}\;\pmb x_i)\sim\frac{(n-1)^2}{n}\mbox{Beta}\left(p/2,(n-p-1)/2\right)$$

  • [0] Gnanadesikan, R. and Kettenring, J. R. (1972). Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics, 28:81–124.
  • [1] Wilks, S. S. (1962). Mathematical Statistics. John Wiley & Sons.
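
As a sanity check, this scaled-Beta law is easy to confirm by simulation (a minimal numpy/scipy sketch; $n$, $p$ and the replication count are arbitrary choices, and the covariance estimate uses the divisor $n-1$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p, reps = 50, 3, 5000                  # arbitrary illustration values
scaled = np.empty(reps)
for r in range(reps):
    X = rng.standard_normal((n, p))       # WLOG mu = 0, Sigma = I
    xc = X - X.mean(axis=0)               # centered data
    S = xc.T @ xc / (n - 1)               # sample covariance (divisor n - 1)
    d2 = xc[0] @ np.linalg.solve(S, xc[0])    # squared MD of the first point
    scaled[r] = d2 * n / (n - 1) ** 2         # should be Beta(p/2, (n-p-1)/2)

# KS test against the Beta law: a large p-value indicates agreement
print(stats.kstest(scaled, stats.beta(p / 2, (n - p - 1) / 2).cdf))
```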
user603

If your estimate of $\Sigma$ is not too far off, $d(x_i)$ behaves approximately like the Euclidean norm of a standard multivariate normal vector, i.e. it is approximately $\chi$-distributed (and $d(x_i)^2$ approximately $\chi^2_p$-distributed).

To see this, assume your estimate is perfect: $\hat S =\Sigma$. Then $d(x_i)^2=x_i^T\Sigma^{-1}x_i=\|\Sigma^{-1/2}x_i\|^2$, and since $\Sigma^{-1/2}x_i\sim N_p(0,I)$, whitening removes all the correlation structure: the problem reduces to the case $\Sigma=I$, where $d^2\sim\chi^2_p$.
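
A quick numerical check of this reduction (a minimal sketch, assuming the true $\Sigma$ is known; the particular $\Sigma$, $n$ and $p$ are arbitrary choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, p = 100_000, 4                     # arbitrary illustration values
A = rng.standard_normal((p, p))
Sigma = A @ A.T + p * np.eye(p)       # some positive definite Sigma
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

# With hat S replaced by the true Sigma, d^2 = x^T Sigma^{-1} x ~ chi^2_p
d2 = np.einsum('ij,ij->i', X @ np.linalg.inv(Sigma), X)
print(stats.kstest(d2, stats.chi2(p).cdf))   # large p-value: matches chi^2_p
```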

For small sample sizes $n$, however, the results can be all over the place: the covariance estimate can be arbitrarily bad. Just imagine all $n$ observations landing on (nearly) the same point, so that $\hat S$ is (nearly) singular; this can happen for any finite $n$, it just becomes increasingly unlikely as $n\rightarrow\infty$.
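
This instability is easy to see numerically (a minimal sketch; the dimensions and sample sizes are arbitrary choices): the condition number of the sample covariance blows up when $n$ is close to $p$, and a degenerate sample makes it exactly singular.

```python
import numpy as np

rng = np.random.default_rng(2)
p = 5
for n in (6, 10, 100, 10_000):
    X = rng.standard_normal((n, p))
    S = np.cov(X, rowvar=False)        # sample covariance (divisor n - 1)
    print(n, np.linalg.cond(S))        # huge condition number for small n

# Degenerate extreme: all observations identical => S is the zero matrix
X = np.tile(rng.standard_normal(p), (10, 1))
print(np.linalg.matrix_rank(np.cov(X, rowvar=False)))   # 0: exactly singular
```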