Is my mixed effect model formula incorrect in R?

Question

I am doing a study that I would like to analyze data with a mixed effect model. I have three fixed effects and one random effect. My code output is seems a bit off and I am unsure if it is a structural problem with the code, or multicoliniarity in my data. I have tried testing fpr coliniarity, but haven't gotten the code to work (I think because I have mostly catagorical data)

Fixed Effects:
soil_type - 2 groups ("Medium" and "Fine")
treatment - 4 groups ("L", "S", "K", "C")
days - 3 sampling points (day 4, 11, 18)
Random Effect:
replicates - 3 groups (1,2,3)

My code is as followed:

cl.mod <- lmer(weighted_cl ~ soil_type + treatment + days + (1|rep),
               data = leach.conc, REML = FALSE)

I believe I have the correct code, but when I look at the output I am unsure if I am missing values:

Fixed effects:
                Estimate Std. Error t value
(Intercept)      7.01992    1.65258   4.248
soil_typeMedium -4.81518    1.05431  -4.567
treatmentKCl    10.48391    1.49103   7.031
treatmentLiquid 25.25578    1.49103  16.939
treatmentSolid   8.31138    1.49103   5.574
days            -0.32534    0.09223  -3.527
Correlation of Fixed Effects:
            (Intr) sl_tyM trtmKC trtmnL trtmnS
soil_typMdm -0.319

treatmntKCl -0.451  0.000

treatmntLqd -0.451  0.000  0.500

treatmntSld -0.451  0.000  0.500  0.500

days        -0.614  0.000  0.000  0.000  0.000*

I am unsure if this is because I have multicoliniarity in my data or if I have my code incorrect (or both). Because this is catagorical data, I am unsure about testing for coliniarity. Also, I have no missing values in my data. Any guidance or things to look at would be greatly appreciated!

Your random effect has three levels? That's a very low number. — Jeremy Miles, Sep 09 '22 at 16:17
@JeremyMiles yes, that is how many replicates we extracted from our plots. Unsure how to increase that number aside from increasing replication our next field season — mbelumn, Sep 09 '22 at 18:09
Consider whether you need a mixed effects model at all, when there are simpler procedures that may address your need: https://stats.stackexchange.com/a/582056/121522 — mkt, Sep 10 '22 at 09:41

Shawn Hemelstrand · Accepted Answer · 2022-09-13T06:06:29.267

Data

So there are a number of questions in your question, and given that your data isn't on hand, I will use my own toy dataset to answer your question. First, the dput of my data in case you want to check yourself:

work <- structure(list(Work_Environment = c("Office", "Office", "Office", 
"Home", "Home", "Office", "Office", "Office", "Office", "Office", 
"Home", "Home", "Office", "Office", "Office", "Home", "Office", 
"Home", "Home", "Office", "Office", "Home", "Office", "Home", 
"Home", "Home", "Office", "Office", "Office", "Office", "Home", 
"Home", "Home", "Office", "Office", "Office", "Office", "Office", 
"Home", "Home", "Office", "Office", "Home", "Home", "Office", 
"Home", "Home", "Office", "Office", "Home", "Home", "Office", 
"Home", "Home", "Office", "Office", "Home", "Office", "Home", 
"Home", "Home", "Home", "Office", "Home", "Office", "Office", 
"Home", "Home", "Office", "Office", "Home", "Home", "Office", 
"Office", "Home", "Office", "Office", "Home", "Office", "Office", 
"Home", "Home", "Office", "Office", "Home", "Home", "Office", 
"Home", "Home", "Office", "Office", "Home", "Office", "Office", 
"Home", "Home", "Office", "Home", "Home", "Home", "Home", "Home", 
"Home", "Office", "Home", "Office", "Office", "Home", "Home", 
"Home", "Office", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Office", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Office", "Home", "Office", "Home", "Office", 
"Office", "Office", "Office", "Home", "Home", "Home", "Home", 
"Office", "Home", "Office", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Office", "Office", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Office", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Office", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Office", "Office", "Office", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Office", "Office", "Home", "Home", 
"Office", "Office", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Office", "Office", "Office", 
"Home", "Office", "Home", "Home", "Home", "Home"), Coffee_Cups = c(3L, 
0L, 2L, 6L, 4L, 5L, 3L, 3L, 2L, 2L, 3L, 1L, 1L, 3L, 2L, 2L, 0L, 
1L, 1L, 4L, 4L, 3L, 0L, 1L, 3L, 0L, 0L, 0L, 0L, 2L, 0L, 1L, 2L, 
3L, 2L, 2L, 4L, 3L, 6L, 6L, 3L, 4L, 6L, 8L, 3L, 5L, 0L, 2L, 2L, 
8L, 6L, 4L, 6L, 4L, 4L, 2L, 6L, 6L, 5L, 1L, 3L, 1L, 5L, 4L, 6L, 
5L, 0L, 6L, 6L, 4L, 4L, 2L, 2L, 6L, 6L, 7L, 3L, 3L, 0L, 5L, 7L, 
6L, 3L, 5L, 3L, 3L, 1L, 9L, 9L, 3L, 3L, 6L, 6L, 6L, 3L, 0L, 7L, 
6L, 6L, 3L, 9L, 3L, 8L, 8L, 3L, 3L, 7L, 6L, 3L, 3L, 3L, 6L, 6L, 
6L, 1L, 9L, 3L, 3L, 2L, 6L, 3L, 6L, 9L, 6L, 8L, 9L, 6L, 6L, 6L, 
0L, 3L, 0L, 3L, 3L, 6L, 3L, 0L, 9L, 3L, 0L, 2L, 0L, 6L, 6L, 6L, 
3L, 6L, 3L, 9L, 3L, 0L, 0L, 6L, 3L, 3L, 3L, 3L, 6L, 0L, 6L, 3L, 
3L, 5L, 5L, 3L, 0L, 6L, 4L, 2L, 0L, 2L, 4L, 0L, 6L, 4L, 4L, 2L, 
2L, 0L, 9L, 6L, 3L, 6L, 6L, 9L, 0L, 6L, 6L, 6L, 6L, 6L, 6L, 3L, 
3L, 0L, 9L, 6L, 3L, 6L, 3L, 6L, 1L, 6L, 6L, 6L, 6L, 6L, 1L, 3L, 
9L, 6L, 3L, 6L, 9L, 3L, 5L, 6L, 3L, 0L, 6L, 3L, 3L, 5L, 0L, 6L, 
3L, 5L, 3L, 0L, 6L, 7L, 3L, 6L, 6L, 6L, 6L, 3L, 5L, 6L, 7L, 6L, 
6L, 4L, 6L, 4L, 5L, 5L, 6L, NA, 8L, 6L, 6L, 6L, 9L, 3L, 3L, 9L, 
7L, 8L, 4L, 3L, 3L, 3L, 6L, 6L, 6L, 3L, 4L, 3L, 3L, 6L, 4L, 3L, 
3L, 4L, 6L, 0L, 3L, 6L, 4L, 3L, 3L, 7L, 4L, 4L, 3L, 1L, 6L, 4L, 
6L, 5L, 3L, 6L, 6L, 3L, 6L, 3L, 5L, 6L, 6L, 3L, 6L, 4L, 9L, 7L, 
6L, 3L, 3L, 3L, 4L, 6L, 3L, 6L, 3L, 4L, 4L, 3L, 5L, 5L, 5L, 0L, 
6L, 4L, 5L, 6L, 4L, 1L, 6L, 3L, 6L, 6L, 2L, 5L, 5L, 6L, 5L, 4L, 
5L, 4L, 6L, 5L, 0L, 3L, 4L, 4L, 1L, 4L, 3L, 5L, 8L, 1L, 4L, 4L, 
2L, 2L, 3L, 3L, 5L, 3L, 4L, 3L, 3L, 3L, 3L, 3L, 2L, 1L, 4L, 3L, 
5L, 4L, 7L, 5L, 2L, 4L, 1L, 5L, 3L), Time_Wake = c(500L, 715L, 
600L, 600L, 700L, 600L, 700L, 500L, 500L, 500L, 500L, 700L, 645L, 
700L, 630L, 645L, 700L, 600L, 700L, 550L, 700L, 730L, 725L, 800L, 
600L, 640L, 600L, 730L, 650L, 830L, 630L, 630L, 830L, 722L, 641L, 
800L, 720L, 700L, NA, NA, NA, 700L, 700L, 622L, 710L, 632L, 400L, 
640L, 700L, 730L, 830L, 659L, 800L, 701L, 700L, 900L, 930L, 650L, 
930L, NA, 700L, 300L, 830L, 800L, 705L, 647L, 800L, NA, 830L, 
NA, 830L, 838L, 650L, 849L, 500L, 830L, 800L, 321L, 700L, 400L, 
400L, NA, 700L, 600L, 604L, 700L, 730L, 700L, 700L, 500L, 700L, 
630L, 500L, 600L, 700L, 600L, 830L, 600L, 500L, 600L, 738L, 758L, 
645L, 702L, NA, 500L, 849L, 656L, 831L, 700L, 700L, 805L, 834L, 
849L, 407L, 739L, NA, 717L, 852L, 826L, 446L, 919L, 842L, 754L, 
900L, NA, 845L, 900L, 848L, 757L, 927L, 500L, 700L, 430L, 430L, 
NA, 600L, NA, 452L, 700L, NA, 300L, 600L, 600L, 400L, 945L, 500L, 
700L, 700L, 700L, 504L, 700L, 700L, 400L, 747L, NA, 200L, 740L, 
441L, 833L, 815L, 400L, 600L, 600L, 700L, 344L, NA, 636L, 600L, 
NA, 300L, 600L, 600L, 700L, NA, 822L, 360L, 600L, 945L, NA, 656L, 
400L, 700L, 744L, 710L, 600L, NA, 700L, 700L, 700L, 253L, 600L, 
819L, 700L, 600L, 655L, 835L, 848L, 654L, 630L, 745L, 300L, 730L, 
700L, 700L, 700L, 700L, NA, 200L, 700L, 500L, NA, 500L, 700L, 
700L, 730L, 700L, 830L, 825L, 700L, 600L, 700L, 700L, NA, 700L, 
700L, 700L, 700L, 700L, 700L, 300L, 500L, 700L, 705L, NA, 700L, 
723L, 531L, 841L, 845L, 744L, 742L, 830L, 648L, NA, 630L, 645L, 
634L, 727L, 603L, 648L, 721L, 647L, 842L, 750L, 650L, NA, 645L, 
645L, 751L, 130L, 729L, NA, 830L, 730L, 727L, 709L, 641L, NA, 
709L, 710L, 621L, 747L, 720L, 628L, 654L, NA, 633L, 548L, 428L, 
700L, 733L, 700L, 556L, 757L, 815L, 735L, NA, 500L, 707L, 751L, 
601L, 500L, NA, NA, 600L, 800L, 607L, 557L, 723L, 718L, 630L, 
400L, 633L, 550L, 607L, 621L, 640L, 636L, 559L, 417L, 701L, 100L, 
640L, 629L, 614L, 545L, 615L, 550L, 755L, NA, 737L, 725L, 613L, 
713L, 555L, 746L, 634L, 651L, 701L, NA, NA, 810L, 646L, 442L, 
NA, 641L, 642L, 700L, 525L, 332L, 743L, 703L, 718L, 625L, NA, 
618L, NA, NA, 400L, 329L, 748L, 700L, 720L, 602L, 727L, 842L, 
842L, 400L, NA, 704L, 808L, 632L, 746L, 405L, NA, 559L, 649L, 
639L, 709L, 720L, 349L, 650L, 630L, NA, NA, 630L, NA), Mins_Work = c(435L, 
350L, 145L, 135L, 15L, 60L, 60L, 390L, 395L, 395L, 315L, 80L, 
580L, 175L, 545L, 230L, 435L, 370L, 255L, 515L, 330L, 65L, 115L, 
550L, 420L, 45L, 266L, 196L, 198L, 220L, 17L, 382L, 0L, 180L, 
343L, 207L, 263L, 332L, 0L, 0L, 259L, 417L, 282L, 685L, 517L, 
111L, 64L, 466L, 499L, 460L, 269L, 300L, 427L, 301L, 436L, 342L, 
229L, 379L, 102L, 146L, NA, 94L, 345L, 73L, 204L, 512L, 113L, 
135L, 458L, 493L, 552L, 108L, 335L, 395L, 508L, 546L, 396L, 159L, 
325L, 747L, 650L, 377L, 461L, 669L, 186L, 220L, 410L, 708L, 409L, 
515L, 413L, 166L, 451L, 660L, 177L, 192L, 191L, 461L, 637L, 297L, 
601L, 586L, 270L, 479L, 0L, 480L, 397L, 174L, 111L, 0L, 610L, 
332L, 345L, 423L, 160L, 611L, 0L, 345L, 550L, 324L, 427L, 505L, 
632L, 560L, 230L, 495L, 235L, 522L, 654L, 465L, 377L, 260L, 572L, 
612L, 594L, 624L, 237L, 0L, 38L, 409L, 634L, 292L, 706L, 399L, 
568L, 0L, 694L, 298L, 616L, 553L, 581L, 423L, 636L, 623L, 338L, 
345L, 521L, 438L, 504L, 600L, 616L, 656L, 285L, 474L, 688L, 278L, 
383L, 535L, 363L, 470L, 457L, 303L, 123L, 363L, 329L, 513L, 636L, 
421L, 220L, 430L, 428L, 536L, 156L, 615L, 429L, 103L, 332L, 250L, 
281L, 248L, 435L, 589L, 515L, 158L, 0L, 649L, 427L, 193L, 225L, 
0L, 280L, 163L, 536L, 301L, 406L, 230L, 519L, 0L, 303L, 472L, 
392L, 326L, 368L, 405L, 515L, 308L, 259L, 769L, 93L, 517L, 261L, 
420L, 248L, 265L, 834L, 313L, 131L, 298L, 134L, 385L, 648L, 529L, 
487L, 533L, 641L, 429L, 339L, 508L, 560L, 439L, 381L, 397L, 692L, 
534L, 148L, 366L, 167L, 425L, 315L, 476L, 384L, 498L, 502L, 308L, 
360L, 203L, 410L, 626L, 593L, 409L, 531L, 157L, 0L, 357L, 443L, 
615L, 564L, 341L, 352L, 609L, 686L, 386L, 323L, 362L, 597L, 325L, 
51L, 570L, 579L, 284L, 0L, 530L, 171L, 640L, 263L, 112L, 217L, 
152L, 203L, 394L, 135L, 234L, 507L, 224L, 174L, 210L, 138L, 52L, 
326L, 413L, 695L, 370L, 256L, 327L, 490L, 128L, 469L, 567L, 359L, 
561L, 478L, 233L, 550L, 390L, 406L, 56L, 47L, 258L, 332L, 114L, 
193L, 435L, 493L, 659L, 93L, 86L, 0L, 228L, 232L, 318L, 295L, 
639L, 367L, 313L, 253L, 433L, 399L, 269L, 446L, 407L, 424L, 410L, 
309L, 364L, 700L, 345L, 274L, 113L, 202L, 553L, 157L, 351L, 303L, 
392L, 539L, 337L, 297L, 479L, 311L, 173L, 94L, 170L, 469L, 180L, 
311L, 106L, 521L, 378L, 61L, 462L, 644L, 310L, 533L, 517L, 136L, 
0L, 392L, NA)), class = "data.frame", row.names = c(NA, -378L
))

The data in my model consists of two predictors: Coffee_Cups, which are how many coffee cups were consumed that day, and Time_Wake, which is the time of waking up that day. Work_Environment here was selected as a random effect with two levels to emulate what you did, but below I express that this isn't a wise choice. The expected outcome or y in this model is Mins_Work, or how many minutes of productivity were recorded that day. You can see the structure of the data with this code:

work %>% 
  select(Work_Environment,
         Coffee_Cups,
         Time_Wake,
         Mins_Work) %>%
  glimpse()

As seen here:

Rows: 378
Columns: 4
$ Work_Environment <chr> "Office", "Office", "Office", "Home", "Home", …
$ Coffee_Cups      <int> 3, 0, 2, 6, 4, 5, 3, 3, 2, 2, 3, 1, 1, 3, 2, 2…
$ Time_Wake        <int> 500, 715, 600, 600, 700, 600, 700, 500, 500, 5…
$ Mins_Work        <int> 435, 350, 145, 135, 15, 60, 60, 390, 395, 395,…

Observing Missing Values

It's not clear if you have missing values in your data, as the summary output from your lmer summary doesn't appear to have the portion that states how many observations there were. That said, you can check your missing data fairly easily in R. I have checked both of my fixed effect predictors here:

#### Check NA Values ###
work %>% 
  select(Mins_Work) %>% 
  is.na() %>% 
  table() # 2 NA values
work %>% 
  select(Time_Wake) %>% 
  is.na() %>% 
  table() # 43 NA values

Then I checked what the total observations are in my data.

nrow(work) # 378 observations

Since there are only 378 observations and 45 missing, we should have 333 values observed by the lmer model. Here I have built my lmer model below:

hlm.work <- lmer(Mins_Work
                 ~ Coffee_Cups
                 + Time_Wake
                 + (1|Work_Environment),
                 data = work)
summary(hlm.work)

As can be seen, the model retained only the 333 observations in the "Random effects" section in the output:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Mins_Work ~ Coffee_Cups + Time_Wake + (1 | Work_Environment)
   Data: work
REML criterion at convergence: 4338.3
Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.51991 -0.70343 -0.01399  0.73630  2.93450
Random effects:
 Groups           Name        Variance Std.Dev.
 Work_Environment (Intercept)  4289     65.49

 Residual                     27230    165.01

Number of obs: 333, groups:  Work_Environment, 2
Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)

(Intercept) 400.69391   64.43207   3.34404   6.219  0.00604 ** 
Coffee_Cups  28.71402    4.14095 329.56783   6.934 2.16e-11 ***
Time_Wake    -0.18917    0.06431 329.14364  -2.942  0.00350 **

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
            (Intr) Cff_Cp
Coffee_Cups -0.172

Time_Wake   -0.626 -0.124

Multicollinearity

The VIF in your model can be quickly observed with the performance package. The code can be seen below:

library(performance)
check_collinearity(hlm.work)

Which shows our VIF isn't high enough to warrant concern:

# Check for Multicollinearity
Low Correlation
    Term  VIF    VIF 95% CI Increased SE Tolerance Tolerance 95% CI

Coffee_Cups 1.02 [1.00, 17.58]         1.01      0.98     [0.06, 1.00]
   Time_Wake 1.02 [1.00, 17.58]         1.01      0.98     [0.06, 1.00]

Appropriateness of Model

As to your question about the suitability of your model, there is nothing wrong specified from what I can see, but as some others have noted, you may have too few random effects to warrant their inclusion into this model. You may want to either consider a fixed effect regression or if there are other random effects you think contribute to "noise" that have enough levels, you can include them instead.

Hello @Shawn Hemelstand. Thanks so much for answering my question! Thankfully I do not have any NA values in my data, so I do not need to filter them out. I did check the VIF with the package you recommended and see High Coliniarity between soil_type and treatment and moderate correlation between days, so that is likely an issue I am running into. — mbelumn, Sep 14 '22 at 20:17
Good to hear. Also please accept my answer if it was helpful :) — Shawn Hemelstrand, Sep 14 '22 at 21:32

Is my mixed effect model formula incorrect in R?

1 Answers1

Data

Observing Missing Values

Multicollinearity

Appropriateness of Model