I am analysing a longitudinal dataset of drug use in R. It contains the dependent variable use, measured at two timepoints, which I want to regress on several independent variables to see whether they predict changes in use between the two timepoints.
The three time-invariant independent variables are age, sex and bias, where bias is a measure of implicit cognitive bias towards the drug. My hypothesis is that bias predicts changes in drug use.
To assess the need for a multilevel model I have used the book Applied Longitudinal Data Analysis by Singer & Willett (available here, along with the relevant R code). The authors suggest that, before regressing on predictors, I assess "whether there is hope for future analyses" by fitting two unconditional models that partition and quantify the outcome variation in two important ways: first, across people without regard to time (the unconditional means model), and second, across both people and time (the unconditional growth model). As I understand it, the second model should fit better if there is considerable systematic variation in the dependent variable that is worth exploring with predictors.
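For reference, this is how I understand the two models written out. It is a sketch in the multilevel notation I recall from the book, so the symbols (gamma for fixed effects, zeta for person-level random effects, epsilon for the level-1 residual) and the variable name TIME are my assumptions, not a quotation:

```latex
% Unconditional means model: person i, occasion j
Y_{ij} = \gamma_{00} + \zeta_{0i} + \varepsilon_{ij}

% Unconditional growth model: adds a fixed and a random slope for time
Y_{ij} = \gamma_{00} + \gamma_{10}\,\mathrm{TIME}_{ij}
         + \zeta_{0i} + \zeta_{1i}\,\mathrm{TIME}_{ij} + \varepsilon_{ij}

% Distributional assumptions
\varepsilon_{ij} \sim N(0, \sigma^2_{\varepsilon}), \qquad
\begin{pmatrix} \zeta_{0i} \\ \zeta_{1i} \end{pmatrix}
\sim N\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \sigma^2_0 & \sigma_{01} \\ \sigma_{01} & \sigma^2_1 \end{pmatrix}
\right)
```

Counted this way, the first model has three parameters (one fixed effect and two variances) and the second has six (two fixed effects, three variance-covariance terms for the random effects, and the residual variance).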
I fit these models using the R package nlme like so:
library(nlme)
model.a <- lme(use ~ 1, random = ~ 1 | id, data = df)
model.b <- lme(use ~ time, random = ~ time | id, data = df)
In Singer & Willett's example data, model.b shows a drop in level-1 residual deviance, and its AIC and BIC are also lower than those of model.a, indicating a better fit (shown here in their R code).
This is in contrast to my models, where AIC and BIC increase:
> anova(model.a, model.b)
        Model df      AIC      BIC    logLik   Test    L.Ratio p-value
model.a     1  3 1058.395 1068.320 -526.1975
model.b     2  6 1064.356 1084.176 -526.1781 1 vs 2 0.03872928   0.998
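To check that I am reading this table correctly, the AIC values and the likelihood ratio test can be reproduced by hand from the logLik and df columns. This is a minimal sketch in base R, assuming the usual definition AIC = -2 logLik + 2k and a chi-square reference distribution with degrees of freedom equal to the difference in parameter counts; the helper name aic is my own:

```r
# Values copied from the anova() output above
logLik_a <- -526.1975; k_a <- 3   # model.a: log-likelihood and number of parameters
logLik_b <- -526.1781; k_b <- 6   # model.b: log-likelihood and number of parameters

# AIC = -2 * logLik + 2 * (number of parameters)
aic <- function(ll, k) -2 * ll + 2 * k
aic_a <- aic(logLik_a, k_a)
aic_b <- aic(logLik_b, k_b)

# Likelihood ratio statistic and its chi-square p-value
lr <- 2 * (logLik_b - logLik_a)
p  <- pchisq(lr, df = k_b - k_a, lower.tail = FALSE)

print(c(aic_a = aic_a, aic_b = aic_b, lr = lr, p = p))
```

The log-likelihood improves by only about 0.02, so the three extra parameters in model.b add 6 to the AIC penalty without buying any real improvement in fit.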
As I understand Singer & Willett, it is arguably inappropriate to model this with a multilevel model because the fit gets worse, but I am not sure I understand why.
Question 1: Why is it inappropriate to model this using multilevel modelling?
Question 2: Is the second model not a better fit because I have only two timepoints for the dependent variable in my dataset, whereas Singer & Willett's example has three?
The most relevant CV thread I have found is Under what conditions should one use multilevel/hierarchical analysis?, but I have not found any answers that address this specific case.
Thanks for reading!
bias has an effect given age and sex, then why not put all of that into the model and see what comes out? It's not clear what the benefit of the two-stage procedure would be, where you first consider only a model with time. Even worse, if e.g. the effect of time can be either positive or negative depending on bias, then you will never discover that, because time alone would appear insignificant. – amoeba Jun 12 '17 at 08:31
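The suggestion in the comment above could be sketched roughly as follows. This is only an illustration on simulated data, not the dataset from the question: the column names mirror the question, but the sample size, the effect sizes, the 0/1 coding of time, the data frame name df_sim and the choice of a random intercept only are all assumptions.

```r
library(nlme)

set.seed(42)
n <- 100  # hypothetical number of participants

# One row per person with time-invariant covariates
person <- data.frame(
  id   = factor(seq_len(n)),
  age  = round(runif(n, 18, 60)),
  sex  = factor(sample(c("f", "m"), n, replace = TRUE)),
  bias = rnorm(n),
  u    = rnorm(n)   # simulated person-specific intercept
)

# Long format: two rows per person, time coded 0 and 1
df_sim <- person[rep(seq_len(n), each = 2), ]
df_sim$time <- rep(c(0, 1), times = n)

# Simulated outcome: the change over time depends on bias (assumed effect 0.5)
df_sim$use <- 5 + 0.5 * df_sim$bias * df_sim$time + df_sim$u + rnorm(nrow(df_sim))

# Random-intercept model with all predictors; time:bias tests whether
# bias predicts the change in use between the two timepoints
fit <- lme(use ~ time * bias + age + sex, random = ~ 1 | id,
           data = df_sim, method = "ML")
print(summary(fit)$tTable)
```

In this setup the time:bias coefficient is the quantity of interest, since it describes how the change in use between the two timepoints varies with bias.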