Comparison between MAD and SD

Question

I am reading Huber's Robust Statistics (2nd ed.). On pages 2 and 3 he gives an example. The basic facts are summarized here. Let $(X_n)$ be a sequence of random variables and define two measures of spread as follows.

Mean Absolute Deviation: $d_n := \frac{1}{n}\sum|x_i-\bar x|$.
Standard Deviation: $s_n := \sqrt{\frac{1}{n}\sum (x_i-\bar x)^2}$.

Then he mentioned that Fisher claimed that for identically distributed normal observations $s_n$ is about 12% more efficient than $d_n$. In addition, $s_n$ converges to $\sigma$ while $d_n$ converges to $\sigma\sqrt{2/\pi}\doteq 0.8\sigma$. I have several questions about these statements.

How can one prove that $s_n$ is 12% more efficient? Or, at least, where can I find the proof?
How can one prove that $d_n$ converges to $\sigma\sqrt{2/\pi}\doteq 0.8\sigma$? Again, at least where can I find the proof?
I ran a simulation to test the above statements. Here are the code and the output.

n <- 10000 # number of samples
x <- array(list(), n)
set.seed(2014)
for(i in 1:n){
  x[[i]] <- rnorm(10000) # the 10000 here is the size of each sample
}
dn <- rep(0, n) # mad
sn <- rep(0, n) # sd
for(i in 1:1000){
  dn[i] <- mean(abs(x[[i]]-mean(x[[i]]))) # mad
  sn[i] <- sqrt(var(x[[i]])*999/1000) # sd
}
mean(dn) # 0.07979068 check out
mean(sn) # 0.09995901 check out
var(dn)/var(sn) # 0.6371817

As the above simulation shows, the 12% efficiency advantage of $s_n$ does not check out. Why is this the case? Did I make an error in my simulation? Thank you!

You have for(i in 1:1000) but it should be for(i in 1:n) when you are calculating $d_n$ and $s_n$!!! Note that mean(dn) and mean(sn) are off by a factor of 10! — guy, Aug 16 '14 at 18:06
Also note that, for comparison purposes, var(dn) should be replaced by pi / 2 * var(dn) since $\sqrt{\pi / 2} d_n$ is your consistent estimator of $\sigma$. — guy, Aug 16 '14 at 18:15
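
A minimal sketch of the simulation with both comments applied: the loop runs over every replicate, and var(dn) is rescaled by pi / 2 before the comparison. The replication count and sample size (n_rep, m) are assumptions chosen so the code runs quickly; they are not the values used in the question.

```r
set.seed(2014)
n_rep <- 4000  # number of replicated samples (assumed, smaller than in the question)
m     <- 500   # size of each sample (assumed, smaller than in the question)

dn <- numeric(n_rep)  # mean absolute deviation of each sample
sn <- numeric(n_rep)  # standard deviation with divisor m of each sample
for (i in seq_len(n_rep)) {   # loop over ALL replicates
  x <- rnorm(m)
  dn[i] <- mean(abs(x - mean(x)))
  sn[i] <- sqrt(var(x) * (m - 1) / m)  # rescale var() from divisor m-1 to m
}

mean(dn)  # should be close to sqrt(2 / pi)
mean(sn)  # should be close to 1

# Compare s_n with the rescaled estimator sqrt(pi / 2) * d_n of sigma.
ratio  <- var(sn) / (pi / 2 * var(dn))
theory <- 1 / (pi - 2)  # asymptotic value under normality, about 0.876
c(simulated = ratio, theory = theory)
```

If the asymptotic theory applies, the ratio should land near $1/(\pi-2)\approx 0.876$, which is where the figure of roughly 12% comes from.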

score 3 · Accepted Answer · answered Aug 16 '14 at 15:25

Let $X$ be a $N(\mu, \sigma^2)$ random variable, and suppose we have an i.i.d. sample of size $n$ from it. Then the centered random variable

$$\tilde X_i= X_i - \bar X_n = \left(1-\frac 1n\right)X_i - \frac1n\sum_{j\neq i}X_j$$

is also normal, with

$$E(\tilde X_i) = \frac {n-1}n\mu - \frac {n-1}n\mu =0,\;\; \\ \operatorname{Var}(\tilde X_i) \equiv \sigma^2_c= \left(\frac {n-1}{n}\right)^2 \sigma^2 + \frac {n-1}{n^2}\sigma^2 = \frac {n(n-1)}{n^2}\sigma^2$$

Then the variable $|\tilde X_i|$ follows a half-normal distribution with

$$E\left(|\tilde X_i|\right) = \sigma_c\sqrt {\frac 2{\pi}} = \sigma\sqrt{\frac {n(n-1)}{n^2}}\sqrt {\frac 2{\pi}} \\ \operatorname{Var}(|\tilde X_i|) = \sigma^2_c\left(1-\frac 2{\pi}\right) = \sigma^2\frac {n(n-1)}{n^2}\left(1-\frac 2{\pi}\right)$$

Therefore the quantity

$$d_n = \frac{1}{n}\sum_{i=1}^n|X_i-\bar X_n| = \frac{1}{n}\sum_{i=1}^n|\tilde X_i| \xrightarrow{p}\frac{1}{n}\sum_{i=1}^nE|\tilde X_i| =\sigma\sqrt {\frac 2{\pi}}$$

by the Weak Law of Large Numbers, and because the term $\frac {n(n-1)}{n^2}$ tends to unity as $n$ tends to infinity. Note that the $|\tilde X_i|$ variables are not independent, but Markov's LLN also covers dependent random variables as long as they are "asymptotically uncorrelated", meaning that the variance of $d_n$ goes to zero asymptotically. And it does: intuitively and informally, as $n$ increases, the source of dependence, $\bar X_n$, becomes more and more "independent" of its components as it converges to the constant $\mu$, and the correlation between the $|\tilde X_i|$'s vanishes.
Note also that the above result does not mean that $d_n$ is inconsistent, because $d_n$ does not attempt to measure the standard deviation in the first place.
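
The moments of the centered observation used in this derivation can be checked numerically. The sketch below is only an illustration under assumed values (n = 5, sigma = 2, mu = 3, and the replication count are arbitrary choices, not taken from the question); a small n is used so that the factor $(n-1)/n$ is visible.

```r
set.seed(1)
n     <- 5       # small sample size (assumed) so (n - 1) / n differs clearly from 1
sigma <- 2       # true standard deviation (assumed)
reps  <- 200000  # Monte Carlo replications (assumed)

# Each row is one sample of size n; keep the first centered observation X_1 - mean.
x <- matrix(rnorm(reps * n, mean = 3, sd = sigma), nrow = reps)
centered_first <- x[, 1] - rowMeans(x)

sigma_c <- sigma * sqrt((n - 1) / n)  # standard deviation of the centered variable

c(simulated = var(centered_first),       theory = sigma_c^2)
c(simulated = mean(abs(centered_first)), theory = sigma_c * sqrt(2 / pi))
c(simulated = var(abs(centered_first)),  theory = sigma_c^2 * (1 - 2 / pi))
```

Each pair should agree up to Monte Carlo error.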

As for asymptotic relative efficiency: equation $(1.5)$ on page 3 of the reference you mention shows how to calculate this measure (set $\varepsilon$ equal to zero, since your sample is not contaminated). You have written the variances upside-down in the quotient, and moreover, the variances are not the only quantities that enter this measure.
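
As a worked illustration, assume the measure in $(1.5)$ is the limiting ratio of squared coefficients of variation (variance divided by squared expectation), and take as given the large-$n$ normal-theory approximations $\operatorname{Var}(s_n)\approx \sigma^2/(2n)$ and $\operatorname{Var}(d_n)\approx \sigma^2(1-2/\pi)/n$, with $E s_n \to \sigma$ and $E d_n \to \sigma\sqrt{2/\pi}$. Under those assumptions:

$$\operatorname{ARE} = \lim_{n\to\infty}\frac{\operatorname{Var}(s_n)/(E s_n)^2}{\operatorname{Var}(d_n)/(E d_n)^2} = \frac{\dfrac{\sigma^2}{2n}\cdot\dfrac{1}{\sigma^2}}{\dfrac{\sigma^2(1-2/\pi)}{n}\cdot\dfrac{\pi}{2\sigma^2}} = \frac{1/2}{\pi/2-1} = \frac{1}{\pi-2}\approx 0.876$$

So at the normal model $d_n$ comes out about 12% less efficient than $s_n$. Dividing each variance by the squared expectation is what makes the comparison fair even though $d_n$ and $s_n$ converge to different constants.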

Thank you for this detailed discussion. In addition, could you show me how to get formula $(1.5)$, please? Especially the second equality. — LaTeXFan, Aug 17 '14 at 01:04
Set $\varepsilon =0$ in the rightmost expression to find its numerical value. To verify it (I don't really understand why Huber does not simply use the ratio of variances): the variance of $d_n$ I have already provided. For the variance of $s_n$, in the post http://stats.stackexchange.com/questions/105337/asymptotic-distribution-of-sample-variance-of-non-normal-sample, you will find the asymptotic variance of the sample variance. Then a) check the Wikipedia page to find what $\mu_4$ equals when the distribution is normal, and b) apply the Delta method to find the AsVar for $s_n$. — Alecos Papadopoulos, Aug 17 '14 at 01:23
Thanks again. About the definition of ARE in Huber's book. I suppose that the two expectations will cancel with each other in the limit since both are supposed to be consistent. Right? And this is also a bit confusing because in his definition $d_n$ is not consistent. — LaTeXFan, Aug 17 '14 at 08:08
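
The Delta-method step suggested in the comments can be sketched as follows, assuming the standard result that the sample variance has asymptotic variance $\mu_4-\sigma^4$ (with $\mu_4$ the fourth central moment), and using $\mu_4 = 3\sigma^4$ for the normal distribution:

$$\sqrt n\left(s_n^2-\sigma^2\right)\xrightarrow{d} N\left(0,\;\mu_4-\sigma^4\right) = N\left(0,\;2\sigma^4\right)$$

Applying the Delta method with $g(v)=\sqrt v$, so that $g'(\sigma^2)=\frac{1}{2\sigma}$:

$$\sqrt n\left(s_n-\sigma\right)\xrightarrow{d} N\left(0,\;\left(\frac{1}{2\sigma}\right)^2 2\sigma^4\right) = N\left(0,\;\frac{\sigma^2}{2}\right)$$

Hence $\operatorname{Var}(s_n)\approx \frac{\sigma^2}{2n}$ for large $n$.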
