I was running some simulations around the Gauss–Markov theorem when I stumbled upon something I cannot explain. I simulated the density of the OLS slope estimator and of the LAD (least absolute deviations) slope estimator. Here's my R code:

```r
library(L1pack) # for LAD estimation
library(tidyverse)

n <- 100
x <- seq(from = 0, to = 10, length.out = n)
n_runs <- 10000
beta <- cbind(numeric(n_runs), numeric(n_runs)) # column 1: OLS slope, column 2: LAD slope

for (i in 1:n_runs) {
  u <- rnorm(n, mean = 0, sd = 3)                  # normal distribution
  # u <- scale(rexp(n, rate = .2), scale = FALSE)  # exponential distribution, mean-centered
  # u <- rt(n, 2)                                  # t-distribution
  y <- x + u

  R_OLS <- lm(y ~ x)
  beta[i, 1] <- R_OLS$coefficients[2]

  R_LAD <- lad(y ~ x)
  beta[i, 2] <- R_LAD$coefficients[2]
}

beta %>%
  data.frame() %>%
  rename("OLS" = X1, "LAD" = X2) %>%
  pivot_longer(cols = OLS:LAD) %>%
  ggplot(aes(x = value, color = name)) +
  geom_density() +
  geom_vline(xintercept = 1, linetype = "dashed") +
  coord_cartesian(xlim = c(.5, 1.5)) +
  theme_bw()
```

[Plot: densities of the OLS and LAD slope estimates under normal errors]

The result is what I was expecting: the OLS estimator has a smaller variance and is therefore "better".
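One way to put a number on what the densities show is to compare the Monte Carlo variances of the two slope columns in `beta`; this is just a quick check on top of the code above:

```r
# Sample variances of the simulated slope estimates
# (column 1 = OLS, column 2 = LAD, as filled in the loop above)
apply(beta, 2, var)
var(beta[, 2]) / var(beta[, 1])  # ratio > 1 means the LAD slope is more variable
```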

However, when the error terms I use to generate the data are not normally distributed (see the commented-out lines in the code), things don't turn out as I would have expected:

[Plot: densities of the OLS and LAD slope estimates under the non-normal errors]

Now OLS no longer performs better, which to me looks like a violation of the claim that OLS is BLUE (the best linear unbiased estimator).
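In case it helps to reproduce the second plot, here is a minimal sketch that wraps the same simulation in a small helper so the error distribution can be passed in as an argument; the function `sim_slopes` and its signature are my own naming, not part of the original code:

```r
# Hypothetical helper: rerun the Monte Carlo comparison for an arbitrary
# error generator (any function that takes the sample size n and returns n draws).
sim_slopes <- function(error_fun, n = 100, n_runs = 10000) {
  x <- seq(from = 0, to = 10, length.out = n)
  beta <- cbind(OLS = numeric(n_runs), LAD = numeric(n_runs))
  for (i in 1:n_runs) {
    y <- x + error_fun(n)
    beta[i, "OLS"] <- lm(y ~ x)$coefficients[2]
    beta[i, "LAD"] <- lad(y ~ x)$coefficients[2]
  }
  beta
}

# Example: heavy-tailed t(2) errors, as in one of the commented-out lines above
beta_t <- sim_slopes(function(n) rt(n, 2))
apply(beta_t, 2, var)  # compare Monte Carlo variances of the two slope estimators
```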

What's going on here?
