
I am working on a paper and have a problem concerning ellipsoids and the trivariate normal distribution. Surprisingly, I can't find much in the literature, but I found this in one of your answers:

Because this construction has nothing to do with "confidence" per se, the objective is to establish some convention for describing the shape and relative size of the points. Using 1.96 sort of works (for three variables): it contains about 72% of the probability of the trivariate normal distribution. But as the number of variables increases this method produces ellipses that are far too small. For instance, with 10 variables it will contain only 4.6% of the probability; using 4.28 instead of 1.96 in this case will contain 95% of the probability.

How did you get the figure of 72%? Or can you recommend some literature in which I can find this? I would appreciate it very much!

Randel
  • For starters, go to http://mathworld.wolfram.com/TrivariateNormalDistribution.html; it also contains literature references – Alecos Papadopoulos Sep 06 '13 at 07:31
  • Which answer are you quoting? – Glen_b Sep 06 '13 at 09:07
  • @Glen_b The comment at http://stats.stackexchange.com/questions/67422/volume-of-the-95-confidence-ellipsoid/67429#comment130300_67429. The question starting that thread explains the chi-squared origin of these numbers and contains a couple of links. – whuber Sep 06 '13 at 15:52

1 Answer


If $X \sim N_k(\mu,\Sigma)$, then $Q=(X-\mu)'\Sigma^{-1}(X-\mu) \sim \chi^2_k$. Further, the level sets of $Q$ are the ellipsoids you refer to. So the 72% you mention comes from a chi-square distribution (calculations in R):

> pchisq(1.96^2,df=3)
[1] 0.7209157

As do the other numbers:

> pchisq(1.96^2,df=10)
[1] 0.04579014

> sqrt(qchisq(0.95,df=10))
[1] 4.278672
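
For three variables the 72% figure can also be checked by hand, since the $\chi^2_3$ distribution function has a closed form. A short worked version, writing $c = 1.96$ and $\Phi$ for the standard normal CDF (both symbols are introduced here for illustration):

$$P(Q \le c^2) = \bigl[2\Phi(c) - 1\bigr] - c\,\sqrt{\tfrac{2}{\pi}}\;e^{-c^2/2} \approx 0.9500 - 1.96 \times 0.7979 \times 0.1465 \approx 0.9500 - 0.2291 = 0.7209$$

The first term is the familiar univariate 95%; the second term is what is lost when the same radius is applied in three dimensions.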

See http://en.wikipedia.org/wiki/Multivariate_normal_distribution#Prediction_Interval
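
As a cross-check on the R figures, here is a minimal sketch in Python (standard library only) that estimates the same probabilities by simulation. It relies on the fact that, after standardizing, $Q$ is a sum of $k$ squared independent standard normals, so it simulates with $\mu = 0$ and $\Sigma = I$. The function names `coverage_mc` and `coverage_exact_k3` are made up for this illustration, and the Monte Carlo values are only approximate.

```python
import math
import random


def coverage_mc(k, radius, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(Q <= radius^2), Q = sum of k squared N(0,1) draws."""
    rng = random.Random(seed)
    threshold = radius ** 2
    hits = 0
    for _ in range(n_draws):
        # Q for one simulated point of a standardized k-variate normal
        q = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
        if q <= threshold:
            hits += 1
    return hits / n_draws


def coverage_exact_k3(radius):
    """Closed-form chi-square CDF with 3 degrees of freedom, evaluated at radius^2."""
    return (math.erf(radius / math.sqrt(2))
            - radius * math.sqrt(2 / math.pi) * math.exp(-radius ** 2 / 2))


if __name__ == "__main__":
    print("k=3,  radius 1.96, exact:    ", coverage_exact_k3(1.96))
    print("k=3,  radius 1.96, simulated:", coverage_mc(3, 1.96))
    print("k=10, radius 1.96, simulated:", coverage_mc(10, 1.96))
    print("k=10, radius 4.28, simulated:", coverage_mc(10, 4.278672))
```

The simulated values should land close to the `pchisq` results above (roughly 0.72, 0.046 and 0.95), with the agreement improving as `n_draws` grows.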

Glen_b
  • We obtained a 95% confidence ellipsoid for 3 principal components by PCA on the first sample, and wanted to check it against a second, smaller sample. We found that 72% of the cases fall inside the ellipsoid. Are we right that the explanation lies in the fact that the ellipsoid contains 72% of the probability of the trivariate normal distribution? – user29976 Sep 07 '13 at 19:14
  • I'm sorry I didn't quite follow your explanation there. Could you clarify either your explanation of what you did or rephrase your question more precisely into a form that doesn't require an explanation of what you did at all? – Glen_b Sep 07 '13 at 23:48
  • OK. This is the problem. I have three variables (say X, Y and Z) and their measurements from a sample (X=[x1,x2,...xn], Y=[y1,y2,...yn], Z=[z1,z2,...zn]). On the basis of those measurements I want to define a 95% confidence ellipsoid (in X, Y, Z coordinate space), so that I can infer that 95% of the whole population (from which I took my sample) will fit in that ellipsoid. – user29976 Sep 08 '13 at 09:14
  • Oh, that's a very different problem to the one you posted about; here you're also dealing with the fact that the parameter estimates of $\mu$ and $\Sigma$ are not the same as the population quantities, but are themselves random variables. No hint of this problem is present in the original question. You might like to post a new question discussing the issue you have here. – Glen_b Sep 08 '13 at 09:21