I'm modeling a dataset with multiple years in period 1 and a single year in period 2. The main question is whether period 1 differs from period 2. The years are therefore modeled as random.
However, the output I'm getting surprised me, since it looks like the random effects estimated are not as variable as they seem from the data. Is the single random level in the period 2 reducing the estimated random variability? If so - is there a better way of modeling these data?
toy <- structure(list(Period = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L), .Label = c("Period1", "Period2"), class = "factor"),
Value = c(5332.81481481481, 2711.11111111111, 1600, 622.222222222222,
1198, 2190, 2043, 2345, 1957, 1138, 474, 1328, 1647, 1155,
1672, 2000, 5113, 2018, 2650, 9512, 793, 50, 550, 3900, 3215,
4900, 5075, 3000, 6100, 8000, 4000), YearFac = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L
), .Label = c("2006", "2013", "2017", "2018", "2019"), class = "factor")),
row.names = c(NA,
-31L), class = c("tbl_df", "tbl", "data.frame"), .Names = c("Period",
"Value", "YearFac"))
Analysis
library(plyr)
library(nlme)
library(ggplot2)
mod <- lme(Value ~ Period, random = ~1|YearFac, data = toy, method = "ML")
toy$Pred <- predict(mod, level = 0)
toy$Pred.random <- predict(mod)
means <- ddply(toy, .(Period, YearFac), summarize, Mean = mean(Value))
ggplot(toy) +
geom_point(aes(x = Period, y = Value, colour = YearFac),
position = position_dodge(width = 0.4)) +
geom_point(aes(x = Period, y = Pred, shape = "Pop-level prediction"), size = 3) +
geom_point(aes(x = Period, y = Pred.random, shape = "Random prediction", group = YearFac),
position = position_dodge(width = 0.4), size = 2) +
geom_point(data = means, aes(x = Period, y = Mean,
group = YearFac, fill = "mean"), shape = 21, size = 2,
position = position_dodge(width = 0.4)) +
theme_bw()
"The cross tabulation shows that each level of class occurs only in one level of school, as per your definition of nesting. Both model formulations will now produce the same output"– user2602640 Dec 08 '17 at 16:55