I was running some simulations around the Gauss-Markov theorem when I stumbled upon something I cannot explain. I simulated the sampling density of the OLS slope estimator and of the LAD (least absolute deviations) slope estimator. Here's my R code:
library(L1pack) # for LAD estimation
library(tidyverse)

n <- 100
x <- seq(from = 0, to = 10, length.out = n)
n_runs <- 10000
beta <- cbind(numeric(n_runs), numeric(n_runs)) # column 1: OLS slopes, column 2: LAD slopes

for (i in 1:n_runs) {
  u <- rnorm(n, mean = 0, sd = 3) # normal distribution
  #u <- scale(rexp(n, rate = .2), scale = FALSE) # exponential distribution, mean-centered
  #u <- rt(n, 2) # t-distribution with 2 df
  y <- x + u # true slope = 1, true intercept = 0
  R_OLS <- lm(y ~ x)
  beta[i, 1] <- R_OLS$coefficients[2]
  R_LAD <- lad(y ~ x)
  beta[i, 2] <- R_LAD$coefficients[2]
}

beta %>%
  data.frame() %>%
  rename("OLS" = X1, "LAD" = X2) %>%
  pivot_longer(cols = OLS:LAD) %>%
  ggplot(aes(x = value, color = name)) +
  geom_density() +
  geom_vline(xintercept = 1, linetype = "dashed") +
  coord_cartesian(xlim = c(.5, 1.5)) +
  theme_bw()
The result is what I expected: the OLS estimator has the smaller variance and is therefore "better".
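To put a number on the visual impression, the empirical spread of the two estimators can be compared directly, for example:

# empirical standard deviation of the simulated slope estimates
# (beta comes from the simulation loop above; column 1 = OLS, column 2 = LAD)
apply(beta, 2, sd)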
However, when I don't use normally distributed error terms to generate the data (see the commented-out lines in the code), things don't turn out as I would have expected:

Now OLS no longer performs better, which to me looks like a violation of the Gauss-Markov theorem (OLS should be BLUE, the best linear unbiased estimator).
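To check that this is about variance rather than bias, the means of the simulated estimates can also be inspected:

# mean of the simulated slope estimates, to compare against the true slope of 1
# (beta comes from the simulation loop above; column 1 = OLS, column 2 = LAD)
colMeans(beta)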
What's going on here?
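In case it helps with reproducing the comparison: here is a minimal sketch of the same simulation wrapped in a function that takes the error generator as an argument, so the three distributions can be swapped without editing the loop (the helper name simulate_slopes is just for illustration):

library(L1pack)    # for lad()
library(tidyverse) # only needed for the plotting step afterwards

# Same simulation as above, with the error generator passed in as a function;
# simulate_slopes() is only an illustrative name.
simulate_slopes <- function(rerror, n = 100, n_runs = 10000) {
  x <- seq(from = 0, to = 10, length.out = n)
  beta <- cbind(numeric(n_runs), numeric(n_runs))
  for (i in 1:n_runs) {
    u <- rerror(n)
    y <- x + u # true slope = 1
    beta[i, 1] <- lm(y ~ x)$coefficients[2]
    beta[i, 2] <- lad(y ~ x)$coefficients[2]
  }
  data.frame(OLS = beta[, 1], LAD = beta[, 2])
}

# normal, mean-centered exponential, and t(2) errors
res_norm <- simulate_slopes(function(n) rnorm(n, mean = 0, sd = 3))
res_exp  <- simulate_slopes(function(n) as.numeric(scale(rexp(n, rate = .2), scale = FALSE)))
res_t    <- simulate_slopes(function(n) rt(n, 2))

Each resulting data frame can then be piped into the same pivot_longer()/ggplot() code as above.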