I would need some help with the analysis of a soil data set. The variable Ms was measured with n=5 over several campaigns (7 levels; i, ii, iii...) throughout a year on an agricultural field that received different treatments (4 levels; T1, T2, T3, T4). First, I assumed that I could simply do a two-factor ANOVA with Treatment and Campaign as fixed effects, like
mod1 <- lm(Ms ~ Treatment*Campaign, data = mydata)
However, my problem is that Ms was not measured on each treatment for every campaign and thus I do have not all factor combinations available for analysis. I wanted to analyse the differences in Ms between the treatments and within the treatments with different campaigns (with time). So what I understand is that I should rather set up a linear mixed effects model with lmer. So here would be my next attempt with
- Treatment as fixed factor
- Campaign (time) as fixed factor
- Treatment*Campaign to account for their interaction
But how do I account for the fact that I do not have all factor combinations (Treatment*Campaign) available? Should treatment be nested within campaign? My first try would be this
mod2 <- lmer(y ~ Treatment * Campaign + (1 + Campaign | Treatment), data = mydata)
However, that gives me the following error message:
fixed-effect model matrix is rank deficient so dropping 4 columns / coefficients
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 2 negative eigenvalues
Any hints on whether or not I am on the right track or how to get there would be highly appreciated. Thank you.
EDIT Here is what the data looks like:
mydata <- data.frame(
Measured = c(NA, NA, NA, NA, NA, 2.8899, 2.7774, 2.8555, 2.6998, 2.3802,
3.3401, 2.0813, 3.4819, 2.8482, 3.1023, 2.8645, 3.6187, 3.1503, 3.9042, 2.2517,
2.0234, 2.0711, 2.1987, 3.0894, 2.3333, 2.3962, 2.2695, 2.7324, 3.4932, 3.4137,
3.5004, 3.3793, 3.5314, 2.9805, 2.4942, 3.8812, 3.5276, 3.4728, 2.8531, 2.9269,
2.9238, 3.4932, 2.9542, 2.8357, 3.0318, 3.4603, 2.1142, 2.1777, 1.1841, 3.2934,
1.3890, 3.4664, 2.8035, 2.8407, 2.6637, 2.6454, 2.3075, 2.5899, 3.7853, 2.8899,
3.1148, 2.7364, 3.0111, 2.6780, 3.2121, 3.6997, 2.4121, 3.2405, 2.1547, 3.3692,
2.1294, 2.6294, 2.6702, 2.8910, 3.0528, NA, NA, NA, NA, NA,
2.6776, 2.6405, 2.7419, 3.3820, 2.4099, 2.6637, 2.8549, 2.8195, 2.9180, 2.8704,
NA, NA, NA, NA, NA, 2.6157, 3.4138, 1.1507, 2.0173, 2.0885),
Campaign = c(rep(c("i", "ii", "iii", "iv", "v"), each = 20)),
Treatment = c(rep(c("T1", "T2", "T3", "T4", "T1", "T2", "T3", "T4", "T1", "T2", "T3", "T4",
"T1", "T2", "T3", "T4", "T1", "T2", "T3", "T4"), each = 5)))
lmercall does not make sense because you includeTreatmentas both a fixed effect and a random effect. You could doTreatment+Campaign+(1|Treament:Campaign): this will take care of the dependencies in the data but only estimate Treatment and Campaign main effects; you cannot really estimate the interaction effects because you don't have the full data. Alternative would be to code Campaign as continuous (1,2,3,4,5) and include polynomial or spline effects for it; then you can have the interaction back, e.g.:Treatment * poly(Campaign,2) + (1|Treatment:Campaign). – amoeba May 30 '17 at 11:28