
I am simulating data from a basic mixed effects model $$ y_{it}=\alpha_i+\beta_t+\gamma x_{it}+\varepsilon_{it}. $$ I then estimate a corresponding fixed effects model on the simulated data. I get good estimates for individual fixed effects but poor estimates for time fixed effects. Why could that be?

(Actually, the estimates of fixed effects are all biased. I am not worried about that as I think the location is not identified.)
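That the location is not identified can be seen directly: shifting every $\alpha_i$ by a constant $c$ and every $\beta_t$ by $-c$ leaves all fitted values unchanged. A minimal sketch (with illustrative values, not the simulation's parameters):

```r
# Sketch: the level of the fixed effects is not identified, because
# (alpha_i + c) + (beta_t - c) = alpha_i + beta_t for any constant c
alphas <- c(0.5, -0.2)
betas  <- c(1.0,  0.3)
c0     <- 7
y1 <- outer(alphas,      betas,      "+")  # y_it = alpha_i + beta_t
y2 <- outer(alphas + c0, betas - c0, "+")  # shifted parameters
all.equal(y1, y2)  # TRUE: the data cannot distinguish the two parameterizations
```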

[Figure: estimated vs. true fixed effects; the individual effects lie close to the identity line while the time effects do not]


R code:

#---------- Simulation

Set the number of individuals and the number of time periods:

n=100; T=100

Generate parameter values:

set.seed(1); alphas = runif(n=n, min=-1, max=1)
set.seed(2); betas  = runif(n=T, min=-1, max=1)
gamma = 1

Generate the covariate x and the error term eps:

set.seed(3); x   = rnorm(n=n*T); X   = matrix(x,   nrow=n, byrow=TRUE)
set.seed(4); eps = rnorm(n=n*T); Eps = matrix(eps, nrow=n, byrow=TRUE)

Obtain the dependent variable y

Y = matrix(NA, ncol=T, nrow=n)
for(i in 1:n){
  for(t in 1:T){
    Y[i,t] = alphas[i] + betas[t] + gamma*X[i,t] + Eps[i,t]
  }
}
y = c(t(Y))
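As an aside, the double loop can also be written in vectorized form with `outer()`, which produces the same matrix. A self-contained sketch with small illustrative dimensions:

```r
# Sketch: outer(alphas, betas, "+") builds the n-by-T matrix with
# entries alpha_i + beta_t, replacing the explicit double loop
n <- 4; T <- 3; gamma <- 1
set.seed(1); alphas <- runif(n, -1, 1)
set.seed(2); betas  <- runif(T, -1, 1)
set.seed(3); X   <- matrix(rnorm(n*T), nrow = n)
set.seed(4); Eps <- matrix(rnorm(n*T), nrow = n)

# loop version
Y <- matrix(NA, nrow = n, ncol = T)
for (i in 1:n) for (t in 1:T) Y[i,t] <- alphas[i] + betas[t] + gamma*X[i,t] + Eps[i,t]

# vectorized version
Y2 <- outer(alphas, betas, "+") + gamma*X + Eps

all.equal(Y, Y2)  # TRUE
```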

Create identifiers of individuals (obj) and time periods (time)

obj  = c(1, rep(0, T-1)); obj = rep(obj, n); obj = cumsum(obj)
time = c(1:T); time = rep(time, n)
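The cumsum construction is equivalent to building the identifiers directly with `rep()`, which some readers may find easier to follow. A sketch with small illustrative dimensions:

```r
# Sketch: two equivalent ways to build the individual and time identifiers
n <- 3; T <- 2

# cumsum construction (as above)
obj  <- cumsum(rep(c(1, rep(0, T - 1)), n))
time <- rep(1:T, n)

# direct construction: one block of T rows per individual
obj2  <- rep(1:n, each  = T)
time2 <- rep(1:T, times = n)

all(obj == obj2) & all(time == time2)  # TRUE
```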

#---------- Fixed effects estimation

Estimate a fixed effects model that accounts for the individual and time heterogeneity:

m3 = lm(y ~ -1 + factor(obj) + factor(time) + x)
summary(m3)

Extract the estimates of the individual and time fixed effects:

alphas_hat3 = m3$coef[1:n]
betas_hat3  = m3$coef[(n+1):(n+T)]
cor(alphas, alphas_hat3)^2
cor(betas,  betas_hat3)^2

Plot them against the true parameter values:

dev.new()
mar1 = c(4, 4, 3, 0.5)
par(mfrow=c(2,1), mar=mar1)
plot(x=alphas, y=alphas_hat3, xlab="true", ylab="fitted", main="Individual fixed effects"); abline(a=0, b=1)
plot(x=betas,  y=betas_hat3,  xlab="true", ylab="fitted", main="Time fixed effects");       abline(a=0, b=1)
par(mfrow=c(1,1))

Richard Hardy

1 Answer


In theory, the columns of the design matrix of m3 would be linearly dependent, since the column vectors identifying the objects and the column vectors identifying the time points each sum to $\mathbf{1}_{n\cdot T}$. However, lm calls model.matrix internally, which in turn drops one column (in this case the one indicating time point 1) to generate a design matrix of full column rank. But this means that m3 contains the estimated coefficients of a model without $\beta_1$ (i.e., a model in which $\beta_1$ is forced to equal zero). The new parameters in terms of the original parameters are given by $$ \tilde{\alpha}_i = \alpha_i + \beta_1 \;\;\forall\, i \in \left\{1,\ldots,n\right\}, \qquad \tilde{\beta}_t = \beta_t - \beta_1 \;\;\forall\, t \in \left\{2,\ldots,T\right\}. $$ One could therefore consider something like

alphas_hat3 <- m3$coef[1:n] - betas[1]            # undo the +beta_1 shift (true beta_1 is known here, since the data are simulated)
betas_hat3  <- m3$coef[(n+1):(n+T-1)] + betas[1]  # undo the -beta_1 shift; only T-1 time coefficients are estimated
betas       <- betas[-1]                          # drop beta_1, which has no estimated counterpart

which, with your seeds, yields

cor(alphas, alphas_hat3)^2
# 0.984594
cor(betas, betas_hat3)^2
# 0.98608

and the following plot:

[Figure: corrected estimates plotted against the true parameter values, with both panels close to the identity line]

Other seeds also generate plots in which the estimated coefficients are more evenly spread around the identity line.
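The rank deficiency described above can be checked directly by building the two dummy blocks and inspecting the rank of their concatenation. A sketch with small illustrative dimensions:

```r
# Sketch: with full dummy sets for both factors, the design matrix has
# n + T columns but rank only n + T - 1, because the obj dummies and
# the time dummies both sum to the all-ones vector
n <- 3; T <- 2
obj  <- factor(rep(1:n, each  = T))
time <- factor(rep(1:T, times = n))
M <- cbind(model.matrix(~ -1 + obj), model.matrix(~ -1 + time))
ncol(M)     # n + T = 5 columns
qr(M)$rank  # rank n + T - 1 = 4, so lm must drop one column
```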

statmerkur
  • Perfect, thank you! If you have time, perhaps you could also help me with a related one: https://stats.stackexchange.com/questions/593605? – Richard Hardy Oct 27 '22 at 08:54