3

I've created a mixed effects model using lme4 in R:

model <- lmer(GrowthRate ~ I(Year-1970) * factor(PlotType) + (1 + I(Year-1970)|Plot), data = dat)
  • Or a working example for SE:

    model <- lmer(uptake ~ conc * factor(Type) + (1 + conc|Plant), data = CO2)
    

Next, I use predict to determine the trend:

pred <- predict(model,re.form = NA)

I want to calculate 95% confidence intervals around my trend line(s), but predict.merMod does not calculate standard errors. The authors suggest using bootMer:

There is no option for computing standard errors of predictions because it is difficult to define an efficient method that incorporates uncertainty in the variance parameters; we recommend bootMer for this task.

So I use bootMer:

boot <- bootMer(model, predict, nsim = 10000, re.form = NA)

This is where I'm stuck.

How do I extract standard errors from bootMer? (and how do I convert these to CIs?)

Attempts:

I. This source suggests to do the following for a single predicted value:

b <- bootMer(fm1, nsim=100, function(x) predict(x, newdata = data.frame(Year = 1990), re.form=NA))

(paraphrasing): b$t lists the output of the function we applied to each of the bootstrap resamples, in this case the fitted values at the given predicted value of x. The standard error of our predicted value can be estimated simply as the standard deviation of the sampling distribution: sd(b$t) and the confidence interval as the quantiles of the distribution (in this case, a 95% confidence interval): quantile(b$t, probs=c(0.025, 0.975))

However, does this approach "scale up" to calculate the CIs for the entire trend line(s)?

  • It doesn't appear to, since sd(boot$t) only gives a single value as it did with b$t in the above example.

II. This source uses an entirely different approach:

#Bootstrapped CIs:     [modified slightly for this question]

#Create model matrix:
    mm <- model.matrix(~ I(Year - 1970) * factor(PlotType),dat)

#define a function to be applied to the nsim simulations:
  #(the function basically gets a merMod object and returns the fitted values)
  predFun <- function(.) mm%*%fixef(.) 

#supply function to bootMer:
  bb <- bootMer(m, FUN = predFun, nsim = 200) #do this 200 times

#as we did this 200 times the 95% CI will be bordered by the 5th and 195th value   <<--- #WHAT DOES THIS MEAN??
  bb_se <- apply(bb$t,2,function(x) x[order(x)])
  blo <- bb_se[1,]
  bhi <- bb_se[2,]

I don't understand 2 things from this approach:

  1. Why they created predFun(), and whether that is more valid than simply using predict(re.form = NA)

  2. How their apply approach (and failure to multiple SEs by 1.96) actually get them 95% confidence intervals...

III. When I look at the output of bootMer (i.e., by entering my saved object boot in my console using print(boot)), I get the following output with length(pred) number of rows:

Call:
bootMer(x = model, FUN = predict, nsim = 10000, re.form = NA)


Bootstrap Statistics :
       original       bias    std. error
t1*    8.235689 -0.239843268   1.0835271
t2*    8.154482 -0.232643708   1.0579350
t3*    8.046205 -0.223044295   1.0265019
.....

So it appears bootMer is already calculating standard errors (std.error). However, I can't for the life of me figure out how to access these values.

  • I assume this would be the easiest approach if anyone knows how to do so...

    • Update: So I found out I can reproduce the std.error values from print(boot) using the following: apply(boot$t, 2, sd). However, now this calls into question whether the boot output is showing standard errors or simply standard deviations...

Final Question(s):

  1. Can anyone explain the above methods and their pros and shortcomings?

  2. Which of these above methods is most appropriate?

  3. Is there an alternative approach I should instead use?


Update:

Here is my final attempt/approach (having not yet received a response to my Q):

  #load packages:
    library(lme4)
    library(ggplot2)

  #Build model:
    model <- lmer(uptake ~ conc * factor(Type) + (1 + conc|Plant), data = CO2)

  #Predict values:
    pred <- predict(model,re.form = NA)

  #Bootstrap CI:
    boot <- bootMer(model, predict, nsim = 100, re.form = NA)
    std.err <- apply(boot$t, 2, sd)
    CI.lo <- pred - std.err*1.96
    CI.hi <- pred + std.err*1.96

  #Plot
   ggplot(CO2, aes(x = conc, y = uptake, colour = Type)) +
    geom_line(aes(y = pred),size=2) +
    geom_line(aes(x = conc,y = uptake,group = Plant), alpha = 0.2) +
    geom_ribbon(aes(ymin = CI.lo, ymax = CI.hi),size=2, alpha = 0.03, linetype = 0)

Other than being a really bad model fit, does this approach seem legitimate and sound?

Jake Westfall
  • 12,557
  • really, I'll also take answers that definitively and clearly show how to calculate bootstrapped 95% CI bands around my LMM model trend line... – theforestecologist May 02 '18 at 15:38
  • 2
    Well here's a familiar face... :) I think the question I marked as duplicate is close to the best answer you'll get. The big question is really "which confidence intervals are you really talking about?" because there are many non-equivalent ways to approach the problem. The merTools vignette https://cran.rstudio.com/web/packages/merTools/vignettes/Using_predictInterval.html was quite helpful to me, whether or not you actually end up using merTools. – Bryan Krause May 02 '18 at 16:41
  • 1
    As far as how to compute confidence intervals...merTools can do it for you simply, but in the bootstrapping realm you are talking about empirical confidence intervals - they aren't going backwards to compute a standard error because they aren't assuming the distribution of the error must be normal. They are literally getting a list of possible outputs, and saying that the 97.5th percentile of those possibilities is an upper and 2.5th percentile is a lower bound for an overall 95% confidence interval. – Bryan Krause May 02 '18 at 16:42
  • Well hey, @Bryan! Hmm definitely a related post, but it doesn't quite answer how to use bootMer. I've gotten so far with this approach, I'd at least like to know how to finish the job before trying an alternative approach ;) – theforestecologist May 02 '18 at 16:43
  • Let me know if my second comment helps with that. – Bryan Krause May 02 '18 at 16:44
  • Also, the merTools vignette I linked also has a comparison with bootMer including all the necessary code - I'd start there if what I said didn't make sense. – Bryan Krause May 02 '18 at 16:45
  • hmm certainly makes sense theoretically, yeah. I have to think how to apply that to bootMer...aaand based on your last comment, I 'll start by examining the vignette. Thanks! – theforestecologist May 02 '18 at 16:46
  • 2
    I took the liberty of editing the question title to make it more clear to casual observers that this is not just a question about how to do something in R (if it were, it would be off-topic here). Hope that's okay. – Jake Westfall May 02 '18 at 20:26

1 Answers1

5

I've found two very near similar approaches for calculating 95% confidence intervals. (Similar in that they produce almost the same CIs).

Both start with running the bootMer() function:

    merBoot <- bootMer(model, predict, nsim = 100, re.form = NA)

The approaches are:

  1. Using a quantile approach ala Knowles & Frederick (2016):

    CI.lower = apply(merBoot$t, 2, function(x) as.numeric(quantile(x, probs=.025, na.rm=TRUE)))
    CI.upper = apply(merBoot$t, 2, function(x) as.numeric(quantile(x, probs=.975, na.rm=TRUE)))
    
  2. Pulling standard errors from bootMer object and converting to CIs via CI = mean +/- 1.96*SE:

    std.err <- apply(merBoot$t, 2, sd)
    CI.lower <- pred - std.err*1.96
    CI.upper <- pred + std.err*1.96
    

Note: these might just be similar in my case because the test data I used might have been normal enough. I don't believe the first approach requires normalcy, while the standard error approach relies on such an assumption.

  • This is old now, but I would really appreciate your thoughts on the following. Firstly, doesn't the first approach give prediction intervals, rather than confidence intervals? The authors in the linked vignette call the output a 95% prediction interval. Secondly, did you ever find out why bootMer's std.error is equal to the output of apply(boot$t, 2, sd)? – Ezra Herman Nov 18 '22 at 13:14
  • I've answered the first part of my Q: the answer here uses re.form = NA in predict, which results in predictions that do not take random effects into account. As a result, quantile returns a CI. In contrast, in the reference re.form = NULL is used. This results in predictions that account for random effects and a prediction interval from quantile. See here and here. – Ezra Herman Nov 23 '22 at 12:19