
At the moment I have to decide between a random-effects and a fixed-effects model with a binary dependent variable y. The usual way to decide between these two models is the Hausman test (e.g. Greene 2012). My problem is, first, that I don't understand the exact model specifications and, second, that I'm not sure whether my approach is the right one.

My models look like this. Mixed effects model:

library(lme4)  # provides glmer()
ModelME <- glmer(populist ~ wkb + married + age + I(age^2) + (1 | pid),
                 data = phi4,
                 family = binomial(link = 'logit'), nAGQ = 10)

and fixed effects model:

ModelFE <- glm(populist ~ wkb + married + age + I(age^2) + pid,
               data = phi4,
               family = binomial(link = 'logit'))

To run the Hausman test, I use the function from another question here on Stack Overflow (https://stackoverflow.com/questions/23630214/hausmans-specification-test-for-glmer-from-lme4) and call it on both models:

phtest_glmer(ModelME, ModelFE)

with this result:

Hausman Test

data:  phi4
chisq = 3234.6, df = 5, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent

So my questions are:

  1. Does (1|pid) describe the random effect?
  2. Is ModelFE even a fixed effects model?
  3. Is the result of the Hausman test in any way correct?
  4. Should I use another package to run the Hausman test?

Any help would be appreciated! Thank you. Please tell me if you need more information!

Best regards,

1 Answer


I guess pid is a patient or person identifier, so you have repeated measures from the same persons. If so, I would include time as well in your model, e.g.

glmer(populist ~ wkb + married + age + I(age^2) + time + (1 + time | pid),
      data = phi4, family = binomial(link = 'logit'))

Regarding your questions:

  1. Yes, (1 | pid) describes the random part of your model. You have a varying intercept (1) for each pid, i.e. you assume that your outcome has a different average level for each pid. In repeated-measurement / longitudinal designs, you may also assume that the outcome changes more strongly or more weakly over time for different pids, i.e. you have random slopes as well. That's why I would add time to the model.

  2. A "fixed effects" model would require some data preparation, e.g. demeaning and removing the intercept. So at first glance, I would say your ModelFE does not perfectly describe an FE model.

  3. & 4. I personally would not use the Hausman test, nor a fixed effects regression at all. There are some good publications showing that the mixed model is in general better than any fixed effects model. You can find examples that summarize this discussion a bit here, and also some information about the issue of correlated group factors and fixed effects here, including how to address these issues using mixed models.
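
To make the "demeaning and removing the intercept" from point 2 concrete, here is a minimal sketch in base R. It uses a simulated linear panel, not the asker's data, and all names (id, x, y, alpha) are made up for illustration. It only shows the classic within transformation for a linear outcome; for a binary outcome with a logit link the transformation does not carry over in this simple form.

```r
# Simulated linear panel; every name below is illustrative only.
set.seed(1)
n_id <- 50                                 # number of persons
n_t  <- 5                                  # observations per person
id    <- rep(seq_len(n_id), each = n_t)
alpha <- rnorm(n_id)[id]                   # person-specific intercepts
x     <- 0.5 * alpha + rnorm(n_id * n_t)   # predictor correlated with alpha
y     <- 1 + 2 * x + alpha + rnorm(n_id * n_t)

# Within transformation: subtract each person's mean from y and x.
y_dm <- y - ave(y, id)
x_dm <- x - ave(x, id)

# Regression on demeaned data without an intercept ...
b_within <- coef(lm(y_dm ~ 0 + x_dm))[["x_dm"]]
# ... yields the same slope as one dummy per person (LSDV).
b_lsdv <- coef(lm(y ~ x + factor(id)))[["x"]]

print(all.equal(b_within, b_lsdv))
```

In the linear case both routes give the same slope for x, which is why entering the identifier as a factor is often described as the dummy-variable version of a fixed effects model.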

Summary: mixed models can model both between- and within-effects (while FE can only model within-effects); FE models lack information on the variation in the group effects or between-subject effects; and FE regression cannot include random slopes, thus neglecting “cross-cluster differences in the effects of lower-level controls (which) reduces the precision of estimated context effects, resulting in unnecessarily wide confidence intervals and low statistical power” (Heisig et al. 2017).
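
One way to get both within- and between-effects from a mixed model is to split a time-varying predictor into a person mean and a deviation from that mean, along the lines discussed by Bell & Jones (2015). The following is only a sketch under assumptions: phi4_toy and its columns are simulated stand-ins, not the asker's phi4, and the model is fitted only if lme4 is installed.

```r
# Toy panel; phi4_toy and all of its columns are hypothetical stand-ins.
set.seed(2)
n_id <- 200                                # number of persons
n_t  <- 4                                  # waves per person
id   <- rep(seq_len(n_id), each = n_t)
age0 <- sample(20:60, n_id, replace = TRUE)
phi4_toy <- data.frame(
  pid = factor(id),
  age = age0[id] + rep(0:(n_t - 1), times = n_id)
)

# Between part: person mean. Within part: deviation from the person mean.
phi4_toy$age_between <- ave(phi4_toy$age, phi4_toy$pid)
phi4_toy$age_within  <- phi4_toy$age - phi4_toy$age_between
# Centre the between part so the intercept refers to an average person.
phi4_toy$age_between_c <- phi4_toy$age_between - mean(phi4_toy$age_between)

# Simulate a binary outcome with a person-level random intercept u.
u   <- rnorm(n_id)[id]
eta <- -1 + 0.03 * phi4_toy$age_between_c + 0.2 * phi4_toy$age_within + u
phi4_toy$populist <- rbinom(nrow(phi4_toy), 1, plogis(eta))

# Fit the within-between mixed logit only when lme4 is available.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit_wb <- lme4::glmer(
    populist ~ age_within + age_between_c + (1 | pid),
    data = phi4_toy, family = binomial(link = "logit")
  )
  print(lme4::fixef(fit_wb))
}
```

The coefficient on age_within plays the role of the within-person effect, while age_between_c captures differences between persons, so both appear in one model.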

Here are some of the references that I suggest reading:

  • Bafumi J, Gelman A. 2006. Fitting Multilevel Models When Predictors and Group Effects Correlate. Annual Meeting of the American Political Science Association, Philadelphia, PA.

  • Bell A, Fairbrother M, Jones K. 2018. Fixed and Random Effects Models: Making an Informed Choice. Quality & Quantity.

  • Bell A, Jones K. 2015. Explaining Fixed Effects: Random Effects Modeling of Time-Series Cross-Sectional and Panel Data. Political Science Research and Methods, 3(1), 133–153.

  • Gelman A, Hill J. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Analytical Methods for Social Research. Cambridge; New York: Cambridge University Press (in particular chapter 12.6).

  • Heisig JP, Schaeffer M, Giesecke J. 2017. The Costs of Simplicity: Why Multilevel Models May Benefit from Accounting for Cross-Cluster Differences in the Effects of Controls. American Sociological Review 82 (4): 796–827.

Daniel