1

I am doing a study that I would like to analyze data with a mixed effect model. I have three fixed effects and one random effect. My code output is seems a bit off and I am unsure if it is a structural problem with the code, or multicoliniarity in my data. I have tried testing fpr coliniarity, but haven't gotten the code to work (I think because I have mostly catagorical data)

Fixed Effects:
soil_type - 2 groups ("Medium" and "Fine")
treatment - 4 groups ("L", "S", "K", "C")
days - 3 sampling points (day 4, 11, 18)

Random Effect: replicates - 3 groups (1,2,3)

My code is as followed:

cl.mod <- lmer(weighted_cl ~ soil_type + treatment + days + (1|rep),
               data = leach.conc, REML = FALSE)

I believe I have the correct code, but when I look at the output I am unsure if I am missing values:

Fixed effects:
                Estimate Std. Error t value
(Intercept)      7.01992    1.65258   4.248
soil_typeMedium -4.81518    1.05431  -4.567
treatmentKCl    10.48391    1.49103   7.031
treatmentLiquid 25.25578    1.49103  16.939
treatmentSolid   8.31138    1.49103   5.574
days            -0.32534    0.09223  -3.527

Correlation of Fixed Effects: (Intr) sl_tyM trtmKC trtmnL trtmnS soil_typMdm -0.319
treatmntKCl -0.451 0.000
treatmntLqd -0.451 0.000 0.500
treatmntSld -0.451 0.000 0.500 0.500
days -0.614 0.000 0.000 0.000 0.000*

I am unsure if this is because I have multicoliniarity in my data or if I have my code incorrect (or both). Because this is catagorical data, I am unsure about testing for coliniarity. Also, I have no missing values in my data. Any guidance or things to look at would be greatly appreciated!

mbelumn
  • 35
  • 2
    Your random effect has three levels? That's a very low number. – Jeremy Miles Sep 09 '22 at 16:17
  • @JeremyMiles yes, that is how many replicates we extracted from our plots. Unsure how to increase that number aside from increasing replication our next field season – mbelumn Sep 09 '22 at 18:09
  • 1
    Consider whether you need a mixed effects model at all, when there are simpler procedures that may address your need: https://stats.stackexchange.com/a/582056/121522 – mkt Sep 10 '22 at 09:41

1 Answers1

0

Data

So there are a number of questions in your question, and given that your data isn't on hand, I will use my own toy dataset to answer your question. First, the dput of my data in case you want to check yourself:

work <- structure(list(Work_Environment = c("Office", "Office", "Office", 
"Home", "Home", "Office", "Office", "Office", "Office", "Office", 
"Home", "Home", "Office", "Office", "Office", "Home", "Office", 
"Home", "Home", "Office", "Office", "Home", "Office", "Home", 
"Home", "Home", "Office", "Office", "Office", "Office", "Home", 
"Home", "Home", "Office", "Office", "Office", "Office", "Office", 
"Home", "Home", "Office", "Office", "Home", "Home", "Office", 
"Home", "Home", "Office", "Office", "Home", "Home", "Office", 
"Home", "Home", "Office", "Office", "Home", "Office", "Home", 
"Home", "Home", "Home", "Office", "Home", "Office", "Office", 
"Home", "Home", "Office", "Office", "Home", "Home", "Office", 
"Office", "Home", "Office", "Office", "Home", "Office", "Office", 
"Home", "Home", "Office", "Office", "Home", "Home", "Office", 
"Home", "Home", "Office", "Office", "Home", "Office", "Office", 
"Home", "Home", "Office", "Home", "Home", "Home", "Home", "Home", 
"Home", "Office", "Home", "Office", "Office", "Home", "Home", 
"Home", "Office", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Office", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Office", "Home", "Office", "Home", "Office", 
"Office", "Office", "Office", "Home", "Home", "Home", "Home", 
"Office", "Home", "Office", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Office", "Office", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Office", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Office", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Office", "Office", "Office", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Office", "Office", "Home", "Home", 
"Office", "Office", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Home", "Home", "Home", 
"Home", "Home", "Home", "Home", "Home", "Office", "Office", "Office", 
"Home", "Office", "Home", "Home", "Home", "Home"), Coffee_Cups = c(3L, 
0L, 2L, 6L, 4L, 5L, 3L, 3L, 2L, 2L, 3L, 1L, 1L, 3L, 2L, 2L, 0L, 
1L, 1L, 4L, 4L, 3L, 0L, 1L, 3L, 0L, 0L, 0L, 0L, 2L, 0L, 1L, 2L, 
3L, 2L, 2L, 4L, 3L, 6L, 6L, 3L, 4L, 6L, 8L, 3L, 5L, 0L, 2L, 2L, 
8L, 6L, 4L, 6L, 4L, 4L, 2L, 6L, 6L, 5L, 1L, 3L, 1L, 5L, 4L, 6L, 
5L, 0L, 6L, 6L, 4L, 4L, 2L, 2L, 6L, 6L, 7L, 3L, 3L, 0L, 5L, 7L, 
6L, 3L, 5L, 3L, 3L, 1L, 9L, 9L, 3L, 3L, 6L, 6L, 6L, 3L, 0L, 7L, 
6L, 6L, 3L, 9L, 3L, 8L, 8L, 3L, 3L, 7L, 6L, 3L, 3L, 3L, 6L, 6L, 
6L, 1L, 9L, 3L, 3L, 2L, 6L, 3L, 6L, 9L, 6L, 8L, 9L, 6L, 6L, 6L, 
0L, 3L, 0L, 3L, 3L, 6L, 3L, 0L, 9L, 3L, 0L, 2L, 0L, 6L, 6L, 6L, 
3L, 6L, 3L, 9L, 3L, 0L, 0L, 6L, 3L, 3L, 3L, 3L, 6L, 0L, 6L, 3L, 
3L, 5L, 5L, 3L, 0L, 6L, 4L, 2L, 0L, 2L, 4L, 0L, 6L, 4L, 4L, 2L, 
2L, 0L, 9L, 6L, 3L, 6L, 6L, 9L, 0L, 6L, 6L, 6L, 6L, 6L, 6L, 3L, 
3L, 0L, 9L, 6L, 3L, 6L, 3L, 6L, 1L, 6L, 6L, 6L, 6L, 6L, 1L, 3L, 
9L, 6L, 3L, 6L, 9L, 3L, 5L, 6L, 3L, 0L, 6L, 3L, 3L, 5L, 0L, 6L, 
3L, 5L, 3L, 0L, 6L, 7L, 3L, 6L, 6L, 6L, 6L, 3L, 5L, 6L, 7L, 6L, 
6L, 4L, 6L, 4L, 5L, 5L, 6L, NA, 8L, 6L, 6L, 6L, 9L, 3L, 3L, 9L, 
7L, 8L, 4L, 3L, 3L, 3L, 6L, 6L, 6L, 3L, 4L, 3L, 3L, 6L, 4L, 3L, 
3L, 4L, 6L, 0L, 3L, 6L, 4L, 3L, 3L, 7L, 4L, 4L, 3L, 1L, 6L, 4L, 
6L, 5L, 3L, 6L, 6L, 3L, 6L, 3L, 5L, 6L, 6L, 3L, 6L, 4L, 9L, 7L, 
6L, 3L, 3L, 3L, 4L, 6L, 3L, 6L, 3L, 4L, 4L, 3L, 5L, 5L, 5L, 0L, 
6L, 4L, 5L, 6L, 4L, 1L, 6L, 3L, 6L, 6L, 2L, 5L, 5L, 6L, 5L, 4L, 
5L, 4L, 6L, 5L, 0L, 3L, 4L, 4L, 1L, 4L, 3L, 5L, 8L, 1L, 4L, 4L, 
2L, 2L, 3L, 3L, 5L, 3L, 4L, 3L, 3L, 3L, 3L, 3L, 2L, 1L, 4L, 3L, 
5L, 4L, 7L, 5L, 2L, 4L, 1L, 5L, 3L), Time_Wake = c(500L, 715L, 
600L, 600L, 700L, 600L, 700L, 500L, 500L, 500L, 500L, 700L, 645L, 
700L, 630L, 645L, 700L, 600L, 700L, 550L, 700L, 730L, 725L, 800L, 
600L, 640L, 600L, 730L, 650L, 830L, 630L, 630L, 830L, 722L, 641L, 
800L, 720L, 700L, NA, NA, NA, 700L, 700L, 622L, 710L, 632L, 400L, 
640L, 700L, 730L, 830L, 659L, 800L, 701L, 700L, 900L, 930L, 650L, 
930L, NA, 700L, 300L, 830L, 800L, 705L, 647L, 800L, NA, 830L, 
NA, 830L, 838L, 650L, 849L, 500L, 830L, 800L, 321L, 700L, 400L, 
400L, NA, 700L, 600L, 604L, 700L, 730L, 700L, 700L, 500L, 700L, 
630L, 500L, 600L, 700L, 600L, 830L, 600L, 500L, 600L, 738L, 758L, 
645L, 702L, NA, 500L, 849L, 656L, 831L, 700L, 700L, 805L, 834L, 
849L, 407L, 739L, NA, 717L, 852L, 826L, 446L, 919L, 842L, 754L, 
900L, NA, 845L, 900L, 848L, 757L, 927L, 500L, 700L, 430L, 430L, 
NA, 600L, NA, 452L, 700L, NA, 300L, 600L, 600L, 400L, 945L, 500L, 
700L, 700L, 700L, 504L, 700L, 700L, 400L, 747L, NA, 200L, 740L, 
441L, 833L, 815L, 400L, 600L, 600L, 700L, 344L, NA, 636L, 600L, 
NA, 300L, 600L, 600L, 700L, NA, 822L, 360L, 600L, 945L, NA, 656L, 
400L, 700L, 744L, 710L, 600L, NA, 700L, 700L, 700L, 253L, 600L, 
819L, 700L, 600L, 655L, 835L, 848L, 654L, 630L, 745L, 300L, 730L, 
700L, 700L, 700L, 700L, NA, 200L, 700L, 500L, NA, 500L, 700L, 
700L, 730L, 700L, 830L, 825L, 700L, 600L, 700L, 700L, NA, 700L, 
700L, 700L, 700L, 700L, 700L, 300L, 500L, 700L, 705L, NA, 700L, 
723L, 531L, 841L, 845L, 744L, 742L, 830L, 648L, NA, 630L, 645L, 
634L, 727L, 603L, 648L, 721L, 647L, 842L, 750L, 650L, NA, 645L, 
645L, 751L, 130L, 729L, NA, 830L, 730L, 727L, 709L, 641L, NA, 
709L, 710L, 621L, 747L, 720L, 628L, 654L, NA, 633L, 548L, 428L, 
700L, 733L, 700L, 556L, 757L, 815L, 735L, NA, 500L, 707L, 751L, 
601L, 500L, NA, NA, 600L, 800L, 607L, 557L, 723L, 718L, 630L, 
400L, 633L, 550L, 607L, 621L, 640L, 636L, 559L, 417L, 701L, 100L, 
640L, 629L, 614L, 545L, 615L, 550L, 755L, NA, 737L, 725L, 613L, 
713L, 555L, 746L, 634L, 651L, 701L, NA, NA, 810L, 646L, 442L, 
NA, 641L, 642L, 700L, 525L, 332L, 743L, 703L, 718L, 625L, NA, 
618L, NA, NA, 400L, 329L, 748L, 700L, 720L, 602L, 727L, 842L, 
842L, 400L, NA, 704L, 808L, 632L, 746L, 405L, NA, 559L, 649L, 
639L, 709L, 720L, 349L, 650L, 630L, NA, NA, 630L, NA), Mins_Work = c(435L, 
350L, 145L, 135L, 15L, 60L, 60L, 390L, 395L, 395L, 315L, 80L, 
580L, 175L, 545L, 230L, 435L, 370L, 255L, 515L, 330L, 65L, 115L, 
550L, 420L, 45L, 266L, 196L, 198L, 220L, 17L, 382L, 0L, 180L, 
343L, 207L, 263L, 332L, 0L, 0L, 259L, 417L, 282L, 685L, 517L, 
111L, 64L, 466L, 499L, 460L, 269L, 300L, 427L, 301L, 436L, 342L, 
229L, 379L, 102L, 146L, NA, 94L, 345L, 73L, 204L, 512L, 113L, 
135L, 458L, 493L, 552L, 108L, 335L, 395L, 508L, 546L, 396L, 159L, 
325L, 747L, 650L, 377L, 461L, 669L, 186L, 220L, 410L, 708L, 409L, 
515L, 413L, 166L, 451L, 660L, 177L, 192L, 191L, 461L, 637L, 297L, 
601L, 586L, 270L, 479L, 0L, 480L, 397L, 174L, 111L, 0L, 610L, 
332L, 345L, 423L, 160L, 611L, 0L, 345L, 550L, 324L, 427L, 505L, 
632L, 560L, 230L, 495L, 235L, 522L, 654L, 465L, 377L, 260L, 572L, 
612L, 594L, 624L, 237L, 0L, 38L, 409L, 634L, 292L, 706L, 399L, 
568L, 0L, 694L, 298L, 616L, 553L, 581L, 423L, 636L, 623L, 338L, 
345L, 521L, 438L, 504L, 600L, 616L, 656L, 285L, 474L, 688L, 278L, 
383L, 535L, 363L, 470L, 457L, 303L, 123L, 363L, 329L, 513L, 636L, 
421L, 220L, 430L, 428L, 536L, 156L, 615L, 429L, 103L, 332L, 250L, 
281L, 248L, 435L, 589L, 515L, 158L, 0L, 649L, 427L, 193L, 225L, 
0L, 280L, 163L, 536L, 301L, 406L, 230L, 519L, 0L, 303L, 472L, 
392L, 326L, 368L, 405L, 515L, 308L, 259L, 769L, 93L, 517L, 261L, 
420L, 248L, 265L, 834L, 313L, 131L, 298L, 134L, 385L, 648L, 529L, 
487L, 533L, 641L, 429L, 339L, 508L, 560L, 439L, 381L, 397L, 692L, 
534L, 148L, 366L, 167L, 425L, 315L, 476L, 384L, 498L, 502L, 308L, 
360L, 203L, 410L, 626L, 593L, 409L, 531L, 157L, 0L, 357L, 443L, 
615L, 564L, 341L, 352L, 609L, 686L, 386L, 323L, 362L, 597L, 325L, 
51L, 570L, 579L, 284L, 0L, 530L, 171L, 640L, 263L, 112L, 217L, 
152L, 203L, 394L, 135L, 234L, 507L, 224L, 174L, 210L, 138L, 52L, 
326L, 413L, 695L, 370L, 256L, 327L, 490L, 128L, 469L, 567L, 359L, 
561L, 478L, 233L, 550L, 390L, 406L, 56L, 47L, 258L, 332L, 114L, 
193L, 435L, 493L, 659L, 93L, 86L, 0L, 228L, 232L, 318L, 295L, 
639L, 367L, 313L, 253L, 433L, 399L, 269L, 446L, 407L, 424L, 410L, 
309L, 364L, 700L, 345L, 274L, 113L, 202L, 553L, 157L, 351L, 303L, 
392L, 539L, 337L, 297L, 479L, 311L, 173L, 94L, 170L, 469L, 180L, 
311L, 106L, 521L, 378L, 61L, 462L, 644L, 310L, 533L, 517L, 136L, 
0L, 392L, NA)), class = "data.frame", row.names = c(NA, -378L
))

The data in my model consists of two predictors: Coffee_Cups, which are how many coffee cups were consumed that day, and Time_Wake, which is the time of waking up that day. Work_Environment here was selected as a random effect with two levels to emulate what you did, but below I express that this isn't a wise choice. The expected outcome or y in this model is Mins_Work, or how many minutes of productivity were recorded that day. You can see the structure of the data with this code:

work %>% 
  select(Work_Environment,
         Coffee_Cups,
         Time_Wake,
         Mins_Work) %>%
  glimpse()

As seen here:

Rows: 378
Columns: 4
$ Work_Environment <chr> "Office", "Office", "Office", "Home", "Home", …
$ Coffee_Cups      <int> 3, 0, 2, 6, 4, 5, 3, 3, 2, 2, 3, 1, 1, 3, 2, 2…
$ Time_Wake        <int> 500, 715, 600, 600, 700, 600, 700, 500, 500, 5…
$ Mins_Work        <int> 435, 350, 145, 135, 15, 60, 60, 390, 395, 395,…

Observing Missing Values

It's not clear if you have missing values in your data, as the summary output from your lmer summary doesn't appear to have the portion that states how many observations there were. That said, you can check your missing data fairly easily in R. I have checked both of my fixed effect predictors here:

#### Check NA Values ###
work %>% 
  select(Mins_Work) %>% 
  is.na() %>% 
  table() # 2 NA values

work %>% select(Time_Wake) %>% is.na() %>% table() # 43 NA values

Then I checked what the total observations are in my data.

nrow(work) # 378 observations

Since there are only 378 observations and 45 missing, we should have 333 values observed by the lmer model. Here I have built my lmer model below:

hlm.work <- lmer(Mins_Work
                 ~ Coffee_Cups
                 + Time_Wake
                 + (1|Work_Environment),
                 data = work)
summary(hlm.work) 

As can be seen, the model retained only the 333 observations in the "Random effects" section in the output:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Mins_Work ~ Coffee_Cups + Time_Wake + (1 | Work_Environment)
   Data: work

REML criterion at convergence: 4338.3

Scaled residuals: Min 1Q Median 3Q Max -2.51991 -0.70343 -0.01399 0.73630 2.93450

Random effects: Groups Name Variance Std.Dev. Work_Environment (Intercept) 4289 65.49
Residual 27230 165.01
Number of obs: 333, groups: Work_Environment, 2

Fixed effects: Estimate Std. Error df t value Pr(>|t|)
(Intercept) 400.69391 64.43207 3.34404 6.219 0.00604 ** Coffee_Cups 28.71402 4.14095 329.56783 6.934 2.16e-11 *** Time_Wake -0.18917 0.06431 329.14364 -2.942 0.00350 **


Signif. codes: 0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects: (Intr) Cff_Cp Coffee_Cups -0.172
Time_Wake -0.626 -0.124

Multicollinearity

The VIF in your model can be quickly observed with the performance package. The code can be seen below:

library(performance)
check_collinearity(hlm.work)

Which shows our VIF isn't high enough to warrant concern:

# Check for Multicollinearity

Low Correlation

    Term  VIF    VIF 95% CI Increased SE Tolerance Tolerance 95% CI

Coffee_Cups 1.02 [1.00, 17.58] 1.01 0.98 [0.06, 1.00] Time_Wake 1.02 [1.00, 17.58] 1.01 0.98 [0.06, 1.00]

Appropriateness of Model

As to your question about the suitability of your model, there is nothing wrong specified from what I can see, but as some others have noted, you may have too few random effects to warrant their inclusion into this model. You may want to either consider a fixed effect regression or if there are other random effects you think contribute to "noise" that have enough levels, you can include them instead.

  • Hello @Shawn Hemelstand. Thanks so much for answering my question! Thankfully I do not have any NA values in my data, so I do not need to filter them out. I did check the VIF with the package you recommended and see High Coliniarity between soil_type and treatment and moderate correlation between days, so that is likely an issue I am running into. – mbelumn Sep 14 '22 at 20:17
  • Good to hear. Also please accept my answer if it was helpful :) – Shawn Hemelstrand Sep 14 '22 at 21:32