I've got a data set from a timed mouse behavioral test. A trial ends either when the mouse completes the task or when 60 seconds have elapsed, and the time to completion or cutoff is recorded. The PI thinks that it's just fine to compare the recorded times between treatment groups. Looking over the data, about 1/3 of the trials have a 60-second time, so the response is semicontinuous with a pile-up at the cutoff. There is a lot of literature on modeling semicontinuous data with a lot of zeroes. Since the 60-second cutoff isn't a measurement but a limit imposed by the method, I want to fit a two-part model: a mixed-effects logistic model for whether a trial hit the "max time" and a glmer fit for the remaining data (the animals are tested repeatedly over a week). While I'm certain that this is an appropriate approach, the PI is treating it like Black Magic Heresy. Does anyone have citations to support this approach, specifically a two-part model where the "inflated" portion of the semicontinuous response is at the upper end of the data range rather than the lower?
1 Answer
Your time values are right-censored. Quoting from Wikipedia:
Right censoring – a data point is above a certain value but it is unknown by how much.
If you ignore the censoring of the time-to-event data, for example just assigning a value of 60 seconds to all trials where the task hadn't been completed by then, you end up with bias. In addition to the brief coverage in the above Wikipedia entry, this is explained extensively in the context of clinical survival data, where it is commonly seen, by Leung et al, Annu. Rev. Public Health 18:83-104 (1997). The same principles hold for any right-censored data, such as yours.
There are well established tools in survival analysis for handling such data. With right censoring you can use a single model that includes appropriate contributions of right-censored observations to the likelihood function; there's no need for a two-stage model. The R survival package is a good place to start. It provides both semi-parametric (Cox) and parametric regression modeling of times to events, with robust sandwich error estimates to account for correlations within individual mice. Alternatively, the coxme package allows for Cox models with random effects. The section on "Random Effect Models" in the R survival task view lists more implementations.
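For instance, here is a minimal sketch of those two approaches, using the long-format example data frame (mdataLong) constructed in the code at the end of this answer:
library(survival)
library(coxme)
## semi-parametric Cox model; cluster(mouse) gives robust sandwich standard
## errors that account for multiple trials on the same mouse
coxCluster <- coxph(Surv(obsReactTime, success) ~ group * day + cluster(mouse),
                    data = mdataLong)
## mixed-effects Cox model with a random intercept per mouse
coxMixed <- coxme(Surv(obsReactTime, success) ~ group * day + (1 | mouse),
                  data = mdataLong)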
Here's an example of how you can adapt the tools of survival analysis to your situation. Say that you have two groups, and trials on each mouse are performed on a series of days, giving observed elapsed-time values obsReactTime right censored at 60 seconds. For each trial, you have a data row with the mouse ID, the group, a day categorical factor (so as not to impose a linear association between outcome and day), the obsReactTime, and a success indicator (1 for completed the task by 60 seconds, 0 for not). Then the following call to the survreg() function does what you want if you have complete data in a data frame mdataLong and the survival package is loaded:
## see code at end of answer for how these data were generated
model1 <- survreg(Surv(obsReactTime, success) ~ group*day,
                  cluster = mouse, data = mdataLong, dist = "gaussian")
Although this is a "survival" model, the survreg() function solves a specific form of a (potentially generalized) linear model. If $Y$ is a right-censored outcome, with covariate values $X$ and corresponding regression coefficients $\beta$, it uses maximum likelihood to solve an equation of the form
$$g(Y) = X'\beta + \sigma W ,$$
where $g()$ is some monotonic function of $Y$, $W$ is a particular standard probability distribution, and $\sigma$ is a scale factor to be modeled. You can think of $\sigma W$ as an error term around the value provided by the linear predictor $X' \beta$.
In this case, $g()$ is the identity function, $W$ is a standard normal distribution, and $\sigma$ is the standard deviation to be estimated. So in this case you have the same form as an ordinary least squares model, but solved via maximum likelihood to deal with the right censoring. Absent censoring, coefficient estimates would be the same as from ordinary least squares (although the different estimation and cluster-adjustment methods can lead to differences in standard-error estimates).
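As a quick check of that equivalence, here is a minimal sketch using the uncensored actualReactTime values kept in the simulated data at the end of the answer; with no censored observations, the Gaussian survreg() coefficients match those from lm():
## every observation is flagged as an event, so nothing is censored
fitOLS  <- lm(actualReactTime ~ group * day, data = mdataLong)
fitSurv <- survreg(Surv(actualReactTime, rep(1, nrow(mdataLong))) ~ group * day,
                   data = mdataLong, dist = "gaussian")
cbind(OLS = coef(fitOLS), survreg = coef(fitSurv))  ## coefficient estimates agree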
You aren't restricted to showing survival curves from this model. You can use the predict() function to get estimates and standard errors for obsReactTime for each group versus day, the type of plot that you and your colleagues want. Here's an example of such a plot; see code at the end of the answer.
If you have data for each of 14 days you would be better off modeling day continuously but flexibly, for example with a regression spline, instead of as a categorical factor. Continuous modeling of day would also better handle missing data. But the basic analysis would be the same: model an interaction between (a function of) day and group, account for intra-mouse correlations, and interrogate the model to get predictions for each group versus day.
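As an illustration, here is a minimal sketch of such a spline fit; it assumes a version of the data with a numeric day column spanning all 14 days (the 3-day simulated example at the end of the answer stores day as a factor, so it can't be run there as-is):
library(splines)
## natural cubic spline in day; df controls how flexible the day trend can be
modelSpline <- survreg(Surv(obsReactTime, success) ~ group * ns(day, df = 3),
                       cluster = mouse, data = mdataLong, dist = "gaussian")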
If your colleagues object to use of a "survival" model, you can use the tobit() function in the R AER package to accomplish the same thing under a name that is well respected in econometrics. Under the hood, however, that function just calls survreg().
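For example, here is a minimal sketch of the equivalent tobit call, with the censoring limits set explicitly because tobit() defaults to left censoring at zero:
library(AER)
## right-censor at 60 seconds, no left censoring
model1tobit <- tobit(obsReactTime ~ group * day, left = -Inf, right = 60,
                     data = mdataLong)
## coefficient estimates match the survreg() fit above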
The emmeans package has tools that can simplify the post-modeling analysis and help evaluate the interaction between day and group with respect to obsReactTime that seems to be your primary interest. For example, the following shows no change from day_0 in groupA but large changes in groupB:
library(emmeans)
emmeans(model1, trt.vs.ctrl1 ~ day | group)$contrasts
# group = A:
# contrast estimate SE df t.ratio p.value
# 1 - 0 -0.5628 0.975 53 -0.577 0.7780
# 2 - 0 -0.0756 1.862 53 -0.041 0.9973
#
# group = B:
# contrast estimate SE df t.ratio p.value
# 1 - 0 -12.9307 2.259 53 -5.725 <.0001
# 2 - 0 -16.1287 2.341 53 -6.891 <.0001
#
# P value adjustment: dunnettx method for 2 tests
Response to comments
Survival modeling
From your perspective, think about this as a method for regression modeling of censored outcomes. You just happen to be using tools that were motivated by and developed for survival analysis. If your data aren't censored (for example, you perhaps have distanceTraveled values for all mice whether or not they completed the task) then there are more efficient ways to proceed. Left-censored data, incidentally, provided Tobin's motivation for developing what became known as the "tobit" model; left censoring (at zero) is the default in the tobit() function in the AER package.
Intra-mouse correlations
The cluster = mouse component of the call to survreg() produces a robust infinitesimal jackknife estimate of the coefficient covariance matrix. It's based on the approximate change in the coefficient estimates as each individual mouse is removed from the analysis. It's related to "sandwich" covariance estimates in other contexts. See Section 2.7 of the main survival vignette, or more extensive coverage in Chapter 7 of Therneau and Grambsch.
That marginal-modeling approach keeps the same coefficient estimates as you would get by modeling each observation as if it were independent, but corrects the coefficient standard errors appropriately for the multiple observations per mouse. That includes having multiple observations on the same mouse both within and across days. If you had severe imbalance in the number of trials per mouse this approach might not be optimal, but you seem to have a nicely balanced study design.
Formal mixed modeling would allow you to estimate mouse-specific effects, but those probably aren't of major interest. You presumably are most interested in the behavior of a population of mice represented by these individuals; the above approach does that. If you really need a mixed model for some reason, remember that this is just a form of tobit regression, so a web search on something like "mixed tobit model" should point the way.
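For example, here is a minimal sketch of a random-intercept tobit fit with the censReg and plm packages mentioned in a comment below; I haven't run this on your data, so treat the argument names as assumptions to check against the censReg documentation:
library(plm)
library(censReg)
## declare mouse as the panel (grouping) index
pdat <- pdata.frame(mdataLong, index = "mouse")
## random-effects tobit, right-censored at 60 seconds
mixedTobit <- censReg(obsReactTime ~ group * day, left = -Inf, right = 60,
                      data = pdat)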
Evaluating model fit
Plotting the values of censored observations against the model predictions isn't very effective here. I suppose that you could plot observations right-censored at 60 seconds with arrows pointing up to indicate that those values are lower limits, but that's not very satisfying.
There is a trick for evaluating the fit of a parametric survival model having right-censored observations. For your case, with the identity transformation of task-completion times $T$ in a Gaussian model, you can rewrite the above equation as:
$$W = \frac{T-X'\beta}{\sigma} $$
for the distribution of scaled deviations of observation times from predicted values. For your Gaussian model, $W$ is standard normal (mean 0, standard deviation 1). If the model is good, then that's the distribution of scaled deviations you should have.
For observations with right-censored values of $T$, those scaled deviations are also right-censored. So you can plot a Kaplan-Meier curve for the scaled deviations, which takes into account the right censoring. Then overlay that curve with the "survival function" of a standard normal distribution (the survival function for any probability distribution is 1 minus its cumulative distribution function).
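Here is a minimal sketch of that check, assuming the model1 and mdataLong objects from the code below:
## scaled deviations of the (possibly censored) observed times from the linear predictor
lp <- predict(model1, type = "lp")
scaledDev <- (mdataLong$obsReactTime - lp) / model1$scale
## Kaplan-Meier curve of the scaled deviations, retaining the censoring indicator
kmDev <- survfit(Surv(scaledDev, mdataLong$success) ~ 1)
plot(kmDev, conf.int = FALSE, xlab = "scaled deviation", ylab = "fraction exceeding")
## overlay the survival function (1 minus the CDF) of a standard normal distribution
curve(1 - pnorm(x), from = min(scaledDev), to = max(scaledDev),
      add = TRUE, col = "red")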
Code for the above examples. This keeps a set of "actual" task completion times that aren't right-censored in the actualReactTime column. You can compare the survreg() model on the right-censored data in obsReactTime against a model of the uncensored values in actualReactTime.
library(survival)
library(tidyverse)
set.seed(2023)
## 20 mice with initial "actual" task completion times around 55
## completion times > 60 won't be observed
mdata <- data.frame(mouse = 1:20, group = rep(c("A", "B"), 10),
                    time_0 = rnorm(20, mean = 55, sd = 10))
## for day 1, copy over day_0 with error
mdata[, "time_1"] <- mdata$time_0 + rnorm(20, mean = 0, sd = 3)
## for group B, lower completion times by mean 10, with error
mdata[mdata$group == "B", "time_1"] <-
    mdata[mdata$group == "B", "time_1"] - rnorm(10, mean = 10, sd = 3)
## base day_2 values similarly on day_1 values
## lower group B by additional mean of 5, with error
mdata[, "time_2"] <- mdata$time_1 + rnorm(20, mean = 0, sd = 3)
mdata[mdata$group == "B", "time_2"] <-
    mdata[mdata$group == "B", "time_2"] - rnorm(10, mean = 5, sd = 3)
##
## put into long form with one observation per data row
mdataLong <- pivot_longer(mdata,cols=3:5,
names_to="day", names_prefix="time_",
values_to="actualReactTime")
## annotate whether time was less than 60
mdataLong[,"success"] <- mdataLong$actualReactTime < 60
## set up new column for observed times right censored at 60
mdataLong[,"obsReactTime"] <- mdataLong$actualReactTime
mdataLong[mdataLong$actualReactTime>60,"obsReactTime"] <- 60
## about 22% of observations are right censored
##
## Gaussian "survival" regression model handles right censoring
model1 <- survreg(Surv(obsReactTime, success) ~ group*day,
                  cluster = mouse, data = mdataLong, dist = "gaussian")
##
## for plot, get predictions from model
## data frame with predictions for day/group combinations
newData <- expand.grid(day=factor(0:2),group=c("A","B"))
pred1 <- predict(model1,newData,se=TRUE)
pred1df <- data.frame(pred1)
newData <- cbind(newData,pred1df)
##
## do plot based on model predictions
ggplot(data=newData,mapping=aes(x=day,y=fit,group=group,color=group)) +
geom_point() + geom_errorbar(aes(ymin=fit-se.fit,ymax=fit+se.fit),width=0.1) +
geom_line() + ylab("Modeled time to task completion") +
ggtitle("Task completion time by group and day",
subtitle="with standard errors of estimates")
-
I am not doing survival analysis. I want to model trends of day vs time to complete test. – Bryan Jan 06 '23 at 21:40
-
@Bryan you might not be doing survival analysis but your right-censored time-to-complete data need to be handled in the very same way. You can include treatment and days of testing as predictors in a regression model for right-censored data just as you would do with uncensored data. It turns out that the tools are most extensively developed for survival analysis. For example, in econometrics censored data can be analyzed via "tobit regression," but that can be implemented via a call to the R survreg() function. – EdM Jan 06 '23 at 21:47
-
@DemetriPananos the default for predict.survreg() is to predict the modeled "response" at the specified covariate values, which for a Gaussian-family fit is also the linear-predictor value. – EdM Jan 07 '23 at 09:48
-
@EdM +1 This is a spectacular answer and your edits have done much to improve clarity. Well done – Demetri Pananos Jan 07 '23 at 21:44
-
Does survreg do nested mixed models? The measures are repeated over time (animal) and each animal is trialed four times per repeat (animal/trial). – Bryan Jan 09 '23 at 17:35
-
@Bryan the cluster = mouse argument will account for those intra-mouse correlations via a robust sandwich error estimate. That's different from the Gaussian random-effects approach of a mixed model, but it's a perfectly acceptable way to deal with that issue. If you really need a formal mixed model, this page shows how to use the R censReg and plm packages together to fit a mixed-effects tobit model. Or do a web search for "random effect tobit model." – EdM Jan 09 '23 at 19:03
-
@Bryan the GLMMadaptive package can fit a censored-normal distribution like the above to a mixed model, although I haven't used it for that purpose. See its manual on "Extra Family Objects." Don't let the current popularity of (conditional) mixed models necessarily draw you away from the marginal estimates provided by a cluster term and the sandwich error estimate, however. See Frank Harrell's treatment of longitudinal data. – EdM Jan 09 '23 at 19:15
-
Can GLMMadaptive handle nested random effects? I thought it was limited to single effects. The experiment was nested. Each mouse was tested daily, and each test was administered four times per mouse on each day. – Bryan Jan 09 '23 at 19:22
-
Does cluster = mouse handle the nesting of trial within mouse? If so, how? – Bryan Jan 09 '23 at 19:22
-
Another measurement they want me to analyze is distance traveled by the animal. How would I do this with a survival model? – Bryan Jan 09 '23 at 19:32
-
If I wanted to plot both modeled estimates and actual observed values, how would I calculate mean observed values and errors, taking into account the right censoring? Some of the predictions exceed 60 seconds. – Bryan Jan 09 '23 at 20:33
-
I found a way to estimate mean and other summary values: https://search.r-project.org/CRAN/refmans/EnvStats/html/enparCensored.html – Bryan Jan 10 '23 at 20:01
-
@Bryan the "normal approximation" in that method should agree with the estimates in the plot, as they are based on the same fundamental model. The non-parametric methods there won't be very helpful in your particular form of censoring, as "the largest value for right-censored data is censored and greater than or equal to the largest uncensored value" in your data. So you will only be able to get a "restricted mean value" from the Kaplan-Meier or bootstrap approaches, which will be an underestimate of the true mean. – EdM Jan 10 '23 at 20:55
-
Thank you, it makes sense to me. Now I have to convince full professors of that, because they insist that a "mean" of the actually-measured points is the "true" value of the data, and censoring makes them very nervous. They would much rather I pretend that the cut-off is a real measurement and not an arbitrary end-point. – Bryan Jan 10 '23 at 21:44

-
… group versus the day of the test. I expanded the answer with example code. – EdM Jan 06 '23 at 22:37
-
… tobit() function for handling censored observations (page 28). – EdM Jan 10 '23 at 21:54