How to set up a mixed model with partially crossed factor levels

Question

I would need some help with the analysis of a soil data set. The variable Ms was measured with n=5 over several campaigns (7 levels; i, ii, iii...) throughout a year on an agricultural field that received different treatments (4 levels; T1, T2, T3, T4). First, I assumed that I could simply do a two-factor ANOVA with Treatment and Campaign as fixed effects, like

mod1 <- lm(Ms ~ Treatment*Campaign, data = mydata)

However, my problem is that Ms was not measured on each treatment for every campaign and thus I do have not all factor combinations available for analysis. I wanted to analyse the differences in Ms between the treatments and within the treatments with different campaigns (with time). So what I understand is that I should rather set up a linear mixed effects model with lmer. So here would be my next attempt with

Treatment as fixed factor
Campaign (time) as fixed factor
Treatment*Campaign to account for their interaction

But how do I account for the fact that I do not have all factor combinations (Treatment*Campaign) available? Should treatment be nested within campaign? My first try would be this

mod2 <- lmer(y ~ Treatment * Campaign + (1 + Campaign | Treatment), data = mydata)

However, that gives me the following error message:

fixed-effect model matrix is rank deficient so dropping 4 columns / coefficients
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
   unable to evaluate scaled gradient
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
   Model failed to converge: degenerate  Hessian with 2 negative eigenvalues

Any hints on whether or not I am on the right track or how to get there would be highly appreciated. Thank you.

EDIT Here is what the data looks like:

mydata <- data.frame(
Measured  = c(NA, NA, NA, NA, NA, 2.8899, 2.7774, 2.8555, 2.6998, 2.3802,
              3.3401, 2.0813, 3.4819, 2.8482, 3.1023, 2.8645, 3.6187, 3.1503, 3.9042, 2.2517,
              2.0234, 2.0711, 2.1987, 3.0894, 2.3333, 2.3962, 2.2695, 2.7324, 3.4932, 3.4137,
              3.5004, 3.3793, 3.5314, 2.9805, 2.4942, 3.8812, 3.5276, 3.4728, 2.8531, 2.9269, 
              2.9238, 3.4932, 2.9542, 2.8357, 3.0318, 3.4603, 2.1142, 2.1777, 1.1841, 3.2934, 
              1.3890, 3.4664, 2.8035, 2.8407, 2.6637, 2.6454, 2.3075, 2.5899, 3.7853, 2.8899,
              3.1148, 2.7364, 3.0111, 2.6780, 3.2121, 3.6997, 2.4121, 3.2405, 2.1547, 3.3692,
              2.1294, 2.6294, 2.6702, 2.8910, 3.0528, NA, NA, NA, NA, NA,
              2.6776, 2.6405, 2.7419, 3.3820, 2.4099, 2.6637, 2.8549, 2.8195, 2.9180, 2.8704,
              NA, NA, NA, NA, NA, 2.6157, 3.4138, 1.1507, 2.0173, 2.0885), 
Campaign  =   c(rep(c("i", "ii", "iii", "iv", "v"), each = 20)),
Treatment =   c(rep(c("T1", "T2", "T3", "T4", "T1", "T2", "T3", "T4", "T1", "T2", "T3", "T4",
              "T1", "T2", "T3", "T4", "T1", "T2", "T3", "T4"), each = 5)))

Your lmer call does not make sense because you include Treatment as both a fixed effect and a random effect. You could do Treatment+Campaign+(1|Treament:Campaign): this will take care of the dependencies in the data but only estimate Treatment and Campaign main effects; you cannot really estimate the interaction effects because you don't have the full data. Alternative would be to code Campaign as continuous (1,2,3,4,5) and include polynomial or spline effects for it; then you can have the interaction back, e.g.: Treatment * poly(Campaign,2) + (1|Treatment:Campaign). — amoeba, May 30 '17 at 11:28
I'll take more of a look at this in the morning, but we had a discussion a few weeks back on what is considered a cluster variable vs. what is considered something to use as a fixed effect. It seems like OP is not sure which to use—I think a lot of it boils down to what they are trying to test theoretically/practically. Here's the link to the discussion: https://stats.stackexchange.com/questions/275450/when-to-use-mixed-effect-model — Mark White, May 31 '17 at 02:43

score 1 · Answer 1 · answered May 29 '17 at 13:07

1

I think the easiest solution is doing a ANOVA for nested data.

Without any data examples it's hard to guess which solution suits your problem. But you might want to try something like:

aov(y ~ Treatment + Campaign %in% Treatment, data = mydata)

answered May 29 '17 at 13:07

Gregor Ma

89

Thank you Gregor for your reply! I added a data example. I hope that gives you more of an idea what my problem is. – the-pore May 29 '17 at 15:56
I tried your proposed code. It gives me the warning "4 out of 24 effects not estimable. Estimated effects may be unbalanced. 50 observations deleted due to missingness." So I guess there is still a problem with the unbalanced design due to missing factor combinations. How can I tackle that? – the-pore May 29 '17 at 16:07
1

I don't think this is a meaningful approach given the data posted in the OP. – amoeba May 30 '17 at 11:09

How to set up a mixed model with partially crossed factor levels

1 Answers1