Can we calculate the variance without using the mean as the 'base' point?
7
-
Given $\mathbb{E}(X^2)<\infty$, the variance is given by $\sigma^2 = \mathbb{E}((X-\mathbb{E}(X))^2)$ by definition. The formula simplifies to $\sigma^2 =\mathbb{E}(X^2) - \mathbb{E}(X)^2$. I.e., for the variance you need $\mathbb{E}(X)$. Of course you could define your own dispersion measure using some other statistic, or use one from the answers. – BloXX Mar 26 '19 at 07:56
-
Short answer: Lots of other ways to summarize variability (dispersion, spread, scale) but none of the others would be the variance. (In fact, the variance can be defined without reference to the mean.) – Nick Cox Mar 26 '19 at 08:28
-
Yes: given data $X,$ compute the covariance of $(X,X)$ as described at https://stats.stackexchange.com/a/18200/919. This method never computes the mean. – whuber Mar 26 '19 at 13:15
2 Answers
13
The median absolute deviation is defined as $$\text{MAD}(X) = \text{median} |X-\text{median}(X)|$$ and is considered an alternative to the standard deviation. But this is not the variance. In particular, it always exists, whether or not $X$ allows for moments. For instance, the MAD of a standard Cauchy is equal to one since $$\underbrace{\Bbb P(|X-0|<1)}_\text{0 is the median}=\arctan(1)/\pi-\arctan(-1)/\pi=\frac{1}{2}$$
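As a rough illustration of the definition above, here is a minimal Python sketch. The helper names `mad` and `cauchy_cdf` and the sample data are made up for this example; the MAD here is the unscaled version, exactly as in the formula, and the second part only re-checks the standard Cauchy probability using its CDF $F(x) = \tfrac12 + \arctan(x)/\pi$.

```python
import math
from statistics import median


def mad(xs):
    """Unscaled median absolute deviation about the median."""
    m = median(xs)
    return median([abs(x - m) for x in xs])


def cauchy_cdf(x):
    """CDF of the standard Cauchy distribution."""
    return 0.5 + math.atan(x) / math.pi


# Hypothetical sample with one extreme value: the outlier barely matters.
sample = [1, 2, 3, 4, 100]
sample_mad = mad(sample)

# P(|X - 0| < 1) for a standard Cauchy; a value of 1/2 means the MAD is 1.
prob_within_one = cauchy_cdf(1.0) - cauchy_cdf(-1.0)

print(sample_mad, prob_within_one)
```

The sample median is 3, the absolute deviations are 2, 1, 0, 1, 97, and their median is 1, so the single extreme value does not inflate the result.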
brazofuerte
- 987
Xi'an
- 105,342
-
Newcomers to this idea should watch out also for mean absolute deviation from the mean (mean deviation, often) and median absolute deviation from the mean. I don't recall mean absolute deviation from the median, but am open to examples. The abbreviation MAD, unfortunately, has been applied variously, so trust people's code first, then their algebraic or verbal definition, and a bare abbreviation MAD not at all. In symmetric distributions, and some others, MAD as defined here is half the interquartile range. (Punning on MAD I resist as a little too obvious.) – Nick Cox Mar 26 '19 at 08:23
-
Also, note that software implementations of the median absolute deviation function can scale the MAD value by a constant factor from the form presented in this answer, so that its value coincides with the standard deviation for a normal distribution. – EdM Mar 26 '19 at 08:30
-
@EdM Excellent point. Personally I dislike that practice unless people use some different term. It's no longer the MAD! – Nick Cox Mar 26 '19 at 08:35
-
@NickCox: the appeal of centring on the median is that the quantity always exists, whether or not the distribution enjoys a mean. This is the definition found in Wikipedia. – Xi'an Mar 26 '19 at 09:20
4
There is already a solution to this question on Math.stackexchange. I summarize the answers:
- You can use that the variance is $\overline{x^2} - \overline {x}^2$, which takes only one pass (computing the mean and the mean of the squares simultaneously), but can be more prone to roundoff error if the variance is small compared with the mean.
- How about the sum of squared pairwise differences? Indeed, you can check by direct computation that, with $v_X$ the sample variance (denominator $n-1$),
$$ v_X = \frac{1}{n(n-1)}\sum_{1 \le i < j \le n}(x_i - x_j)^2. $$
- Without computing the mean explicitly, the sample variance is: $$ v_{X}=\frac{1}{n-1}\left [ \sum_{i=1}^{n}x_{i}^{2}-\frac{1}{n}\left ( \sum_{i=1}^{n}x_{i} \right ) ^{2}\right ] $$
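To make the formulas in this list concrete, here is a small Python sketch that compares them against `statistics.variance` on made-up data. The function names `var_one_pass` and `var_pairwise` are illustrative only; both use the $n-1$ denominator, and the pairwise version sums over $i<j$ and divides by $n(n-1)$.

```python
from itertools import combinations
from statistics import variance


def var_one_pass(xs):
    """Sample variance from running sums of x and x**2 (single pass)."""
    n = 0
    s = 0.0   # running sum of x
    ss = 0.0  # running sum of x**2
    for x in xs:
        n += 1
        s += x
        ss += x * x
    return (ss - s * s / n) / (n - 1)


def var_pairwise(xs):
    """Sample variance from squared differences over all pairs i < j."""
    n = len(xs)
    total = sum((a - b) ** 2 for a, b in combinations(xs, 2))
    return total / (n * (n - 1))


# Hypothetical data: mean 5, sum of squared deviations 32, variance 32/7.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(var_one_pass(data), var_pairwise(data), variance(data))

# Same spread with a huge offset: the variance is small relative to the
# mean, which is where the one-pass formula can lose accuracy.
shifted = [x + 1e9 for x in data]
print(var_one_pass(shifted), var_pairwise(shifted))
```

The pairwise version costs $O(n^2)$ operations, so it is more of a check than a practical method, but it only ever uses differences and is therefore unaffected by the offset.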
Ferdi
- 5,179