I want to get the confidence interval of a Gaussian distribution. Specifically, I want to know how to use the covariance matrix to check whether the $\mu_i$ vector obtained for a multivariate Gaussian distribution satisfies the confidence interval. I have the $\mu_i$ vector and the actual values that should be obtained. How can I use the covariance matrix, the actual values, and the $\mu_i$ vector to verify that the confidence interval is satisfied?

user31820
  • This post is answering your question. http://stats.stackexchange.com/questions/7882/how-to-compute-prediction-error-from-relevance-vector-machine-and-gaussian-proce – Areza Jun 05 '12 at 23:48
  • I want to know how I can get the standard deviation so that I can check whether the true value is within $\mu_i \pm$ one standard deviation. It's tricky when it comes to the covariance matrix. – user31820 Jun 06 '12 at 00:00
  • A confidence region for what? The mean vector? It would be an ellipsoid involving the inverse of the sample covariance matrix. – Michael R. Chernick Jun 06 '12 at 00:02
  • Yeah, so say I have a sample vector x. How can I check whether that sample lies within the 68 percent region? I can get the standard deviation of each variable of the multivariate random vector from the covariance matrix. Then, for each element of x, I can check whether it lies within ± one standard deviation of the corresponding element of the $\mu_i$ vector. Is this the way to go? – user31820 Jun 06 '12 at 00:38
  • No, you construct the 68% confidence ellipse: find the contour of constant density that contains 68% of the distribution for the sample mean vector. – Michael R. Chernick Jun 06 '12 at 02:54
  • Michael Chernick, but how do I construct an alpha% confidence ellipse? My problem is that I have a multivariate normal distribution and I want to construct several confidence ellipses with given confidence levels. Any idea how to do that analytically? –  Sep 20 '12 at 07:22

1 Answer

The quantity $y = (x - \mu)^T \Sigma^{-1} (x-\mu)$ is distributed as $\chi^2$ with $k$ degrees of freedom (where $k$ is the length of the $x$ and $\mu$ vectors). $\Sigma$ is the (known) covariance matrix of the multivariate Gaussian.
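For the case asked about in the comments (does a single sample $x$ lie inside the 68% region of a Gaussian with known $\mu$ and $\Sigma$?), the check amounts to comparing $y$ against the 0.68 quantile of $\chi^2_k$. Here is a minimal sketch, assuming NumPy and SciPy are available; the function name `in_gaussian_region` and the numbers are made up for illustration.

```python
import numpy as np
from scipy.stats import chi2


def in_gaussian_region(x, mu, sigma, coverage=0.68):
    """Check whether x lies inside the `coverage` ellipsoid of N(mu, sigma).

    Returns (inside, y, threshold), where y is the squared Mahalanobis
    distance and threshold is the chi-square quantile with k = len(mu)
    degrees of freedom.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    diff = x - mu
    # Solve sigma @ z = diff rather than forming the inverse explicitly.
    y = float(diff @ np.linalg.solve(sigma, diff))
    threshold = float(chi2.ppf(coverage, df=mu.size))
    return y <= threshold, y, threshold


# Hypothetical values, for illustration only.
mu = np.array([0.0, 0.0])
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
inside, y, threshold = in_gaussian_region([1.0, 0.5], mu, sigma)
print(inside, y, threshold)
```

Note that this is not the same as checking each coordinate against $\pm$ one standard deviation: for $k = 2$ the 68% threshold on $y$ is about 2.28, whereas $y \leq 1$ covers only about 39% of the distribution.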

When $\Sigma$ is unknown, we can replace it by the sample covariance matrix $S = \frac{1}{n-1} \sum_i (x_i-\overline{x})(x_i-\overline{x})^T$, where $\{x_i\}$ are the $n$ data vectors, and $\overline{x} = \frac{1}{n} \sum_i x_i$ is the sample mean. The quantity $t^2 = n(\overline{x} - \mu)^T S^{-1} (\overline{x}-\mu)$ is distributed as Hotelling's $T^2$ distribution with parameters $k$ and $n-1$.

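An ellipsoidal confidence set with coverage probability $1-\alpha$ consists of all $\mu$ vectors such that $n(\overline{x} - \mu)^T S^{-1} (\overline{x}-\mu) \leq T^2_{k,n-1}(1-\alpha)$, where $T^2_{k,n-1}(1-\alpha)$ is the $1-\alpha$ quantile of Hotelling's $T^2$ distribution with parameters $k$ and $n-1$. The critical values of $T^2$ can be computed from the $F$ distribution. Specifically, $\frac{n-k}{k(n-1)}t^2$ is distributed as $F_{k,n-k}$, so $T^2_{k,n-1}(1-\alpha) = \frac{k(n-1)}{n-k} F_{k,n-k}(1-\alpha)$.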
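The membership test for this ellipsoid can be sketched as follows. This is only an illustration, assuming NumPy and SciPy; the function name `hotelling_region_contains` and the simulated data are hypothetical. The critical value is obtained by rescaling the $1-\alpha$ quantile of $F_{k,n-k}$ by $\frac{k(n-1)}{n-k}$.

```python
import numpy as np
from scipy.stats import f


def hotelling_region_contains(data, mu0, alpha=0.05):
    """Check whether mu0 lies in the (1 - alpha) confidence ellipsoid
    for the mean, given an (n, k) array of data vectors.

    Returns (inside, t2, critical_value).
    """
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    if n <= k:
        raise ValueError("need n > k so that S is invertible")
    xbar = data.mean(axis=0)
    # np.cov divides by n - 1 by default, matching the definition of S.
    S = np.atleast_2d(np.cov(data, rowvar=False))
    diff = xbar - np.asarray(mu0, dtype=float)
    t2 = float(n * diff @ np.linalg.solve(S, diff))
    # Hotelling T^2 critical value from the F quantile.
    critical_value = float(k * (n - 1) / (n - k) * f.ppf(1 - alpha, k, n - k))
    return t2 <= critical_value, t2, critical_value


# Simulated data with made-up parameters, for illustration only.
rng = np.random.default_rng(0)
true_mu = np.array([1.0, 2.0])
true_sigma = np.array([[2.0, 0.6],
                       [0.6, 1.0]])
data = rng.multivariate_normal(true_mu, true_sigma, size=200)
print(hotelling_region_contains(data, true_mu, alpha=0.32))
```

Calling the function with several values of `alpha` gives the nested ellipsoids for different confidence levels; only the critical value changes, not the shape determined by $S$.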

Source: Wikipedia, Hotelling's T-squared distribution

  • This is great! I'm only confused about your last sentence. What is $t^2$ here? – Thomas Ahle Apr 14 '22 at 18:31
  • $t^2$ is defined in the second paragraph: $t^2 = n(\overline{x}-\mu)^TS^{-1}(\overline{x}-\mu)$. Given $n$ and $k$, you can look up the $1-\alpha$ critical value of $T^2$, and that will define the "radius" of the ellipsoidal confidence region. With probability $1-\alpha$ (over independent samples of $n$ data points), this ellipsoid will contain the true mean. I'm not sure that answers your question, though. How is your $\mu_i$ vector computed? – Tom Dietterich Apr 15 '22 at 20:31