How to find variance between multidimensional points?

Question

Suppose I have a matrix X which is n by p, i.e. it has n observations, with each observation in p-dimensional space.

How do I find the variance of these n observations?

In the case where p = 1, I just need to use the regular variance formula. What about the cases where p > 1.

Do you know how to accept an answer? – qqqqq May 10 '23 at 14:23 — qqqqq, May 10 '23 at 14:23

Philipp Burckhardt · Answer 1 · 2013-02-07T19:17:19.780

For a $p$-dimensional random variable $X = {\left( {{X_1}, \ldots ,{X_p}} \right)^\intercal}$, we have the following definition of the variance:

$$Var\left( X \right) = E\left[ {\left( {X - EX} \right){{\left( {X - EX} \right)}^\intercal}} \right] = \left( {\begin{array}{*{20}{c}} {Var\left( {{X_1}} \right)}& \ldots &{Cov\left( {{X_1},{X_p}} \right)} \\ \vdots & \ddots & \vdots \\ {Cov\left( {{X_p},{X_1}} \right)}& \ldots &{Var\left( {{X_p}} \right)} \end{array}} \right)$$

That is, the variance of a random vector is defined as the matrix which stores all the variances on the main diagonal and the covariances between the different components in the other elements. The sample $p \times p$ covariance matrix would then be calculated by plugging in the sample analogs for the population variables:

$$\frac{1}{{n - 1}}\left( {\begin{array}{*{20}{c}} {\sum\limits_{i = 1}^n {{{\left( {{X_{i1}} - {{\bar X}_{\cdot1}}} \right)}^2}} }& \ldots &{\sum\limits_{i = 1}^n {\left( {{X_{i1}} - {{\bar X}_{\cdot1}}} \right)\left( {{X_{ip}} - {{\bar X}_{\cdot p}}} \right)} } \\ \vdots & \ddots & \vdots \\ {\sum\limits_{i = 1}^n {\left( {{X_{ip}} - {{\bar X}_{\cdot p}}} \right)\left( {{X_{i1}} - {{\bar X}_{\cdot 1}}} \right)} }& \ldots &{\sum\limits_{i = 1}^n {{{\left( {{X_{ip}} - {{\bar X}_{\cdot p}}} \right)}^2}} } \end{array}} \right)$$ where ${X_{ij}}$ denotes the $i$th observation for feature $j$ and ${{\bar X}_{ \cdot j}}$ the sample mean of the $j$th feature. To sum up, the variance of a random vector is defined as the matrix containing the individual variances and covariances. It therefore suffices to calculate the sample variances and covariances for all the vector components individually.

In principle component analysis, I heard a talk about a given variable accounting for a given fraction of the variance. This implies some scalar measure of the variance. How to get from a covariance matrix to a scalar? — martinkunev, Feb 09 '23 at 16:02

How to find variance between multidimensional points?

1 Answers1

Linked