This is a simple exploratory R question that I hope somebody can help me with. I am trying to fit accelerated failure time (AFT) models to data that are arbitrarily censored (right, left, or interval censored), with some observations also left truncated. So far I have not been able to find a package that can do that, but there are so many packages that it is certainly possible one of them has the capability without advertising it well. I started by looking through the CRAN Task View: Survival Analysis overview of survival packages in R, but so far without luck. Hope you can help.
Are time-varying covariates involved? That can pose problems for AFT models with left truncation. – EdM Dec 16 '23 at 09:09
No, they are luckily fixed and not just in a theoretical sense. The covariates are continuous variables like physical dimensions and constant factors like material and use purpose. – Nikolaj Pedersen Dec 16 '23 at 09:23
1 Answer
First, make sure that you have properly classified the cases with respect to censoring and truncation status. It's easy to get confused. See this page and this page, for example.
Second, if worse comes to worst, you can in principle write code to maximize the log-likelihood of the data directly, summing the contributions that the different types of censored/truncated observations make to the likelihood under your parametric AFT form. See this page for a summary.
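As a rough illustration of that fallback, here is a minimal base-R sketch, not a vetted implementation. It assumes a Weibull AFT form and a data layout of my own choosing: each unit has an observation window `t_lo`/`t_hi` and an entry (left-truncation) time `entry`. Those names, the function `weibull_aft_nll()`, and the simulated data are all hypothetical.

```r
# Negative log-likelihood for a Weibull AFT model with arbitrary censoring
# and left truncation. par = c(regression coefficients, log(shape)).
# Assumed (hypothetical) coding of each unit:
#   exact event:       t_lo == t_hi
#   right-censored:    t_hi == Inf
#   left-censored:     t_lo == entry (0 when the unit is not truncated)
#   interval-censored: t_lo < t_hi, both finite
#   left-truncated:    entry > 0
weibull_aft_nll <- function(par, X, t_lo, t_hi, entry) {
  p     <- ncol(X)
  scale <- exp(drop(X %*% par[seq_len(p)]))  # covariates rescale time (AFT)
  shape <- exp(par[p + 1])                   # log scale keeps shape positive
  surv  <- function(t, i) pweibull(t, shape, scale[i], lower.tail = FALSE)

  exact <- t_lo == t_hi
  lik <- numeric(length(t_lo))
  lik[exact]  <- dweibull(t_lo[exact], shape, scale[exact])
  # S(0) = 1 and S(Inf) = 0, so one expression covers left, right and interval censoring
  lik[!exact] <- surv(t_lo[!exact], !exact) - surv(t_hi[!exact], !exact)

  # Left truncation: condition on having survived to the entry time
  log_s_entry <- pweibull(entry, shape, scale, lower.tail = FALSE, log.p = TRUE)
  -sum(log(lik) - log_s_entry)
}

# Simulated data (assumed values): intercept 1, slope 0.5, shape 1.5
set.seed(1)
n <- 600
x <- runif(n)
t_true    <- rweibull(n, shape = 1.5, scale = exp(1 + 0.5 * x))
entry     <- ifelse(runif(n) < 0.3, runif(n, 0, 2), 0)  # about 30% enter late
inspected <- runif(n) < 0.5          # these units are only seen at integer times

keep <- t_true > entry               # units failing before entry are never sampled
x <- x[keep]; t_true <- t_true[keep]
entry <- entry[keep]; inspected <- inspected[keep]

t_lo <- t_hi <- t_true
right <- t_true > 6                  # administrative right censoring at time 6
t_lo[right] <- 6
t_hi[right] <- Inf
grid <- inspected & !right           # event known only between two inspections
t_lo[grid] <- pmax(floor(t_true[grid]), entry[grid])
t_hi[grid] <- ceiling(t_true[grid])

X <- cbind(1, x)
# Wrap the call in a closure so no data argument can clash with optim()'s own arguments
fit <- optim(c(log(mean(t_lo)), 0, 0),
             function(par) weibull_aft_nll(par, X, t_lo, t_hi, entry),
             method = "BFGS")
est <- c(intercept = fit$par[1], slope = fit$par[2], shape = exp(fit$par[3]))
print(round(est, 2))
```

With fixed covariates the truncation term only needs the survival function at the entry time, which is why a single `pweibull()` call suffices here. Standard errors could be obtained by adding `hessian = TRUE` to the `optim()` call and inverting the returned Hessian.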
Third, if you have left-truncated observations you need to use the counting-process format. That can pose a problem in general with AFT models, as the likelihood contribution from a left-truncated observation involves the survival function at the left-truncation time.
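To spell that out for fixed covariates (the notation here is mine, not from any particular package): write $v_i$ for the left-truncation (entry) time of unit $i$, with $v_i = 0$ when there is no truncation, and let $S(t \mid x_i)$ and $f(t \mid x_i)$ be the survival and density functions implied by the AFT model. Then the likelihood contributions are

$$
L_i = \frac{f(t_i \mid x_i)}{S(v_i \mid x_i)} \quad \text{(event observed at } t_i\text{)},
\qquad
L_i = \frac{S(l_i \mid x_i) - S(u_i \mid x_i)}{S(v_i \mid x_i)} \quad \text{(event known only to lie in } (l_i, u_i]\text{)}.
$$

Right censoring is the case $u_i = \infty$, where $S(u_i \mid x_i) = 0$, and left censoring is the case $l_i = v_i$, which is $l_i = 0$ and $S(l_i \mid x_i) = 1$ for a unit that is not truncated. Every contribution is divided by the survival function at the entry time.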
This is the problem: a common use of left truncation is to define the start of a time interval during which a time-varying covariate value is in place. To define the survival function at a left-truncated observation time with time-varying covariates, you thus need to know the history of covariate values back to time = 0 and to keep track of those values for each individual through time.* The R survreg() function doesn't even allow for counting-process data, I suspect because of that problem.
Happily, you don't have time-varying covariates.
Thus I think that you can convince the flexsurvreg() function in the R flexsurv package to do what you need. It can handle both counting-process (left truncation, right censoring) and interval-censored data. I don't think that it directly handles left-censored data in the default R coding, in which the left side of the interval is $-\infty$. For this application, however, you should be able to define your left-censored observations as interval-censored between 0 and the left-censoring time. That would be OK provided that you know that no units had experienced an event at time = 0.
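The recoding of left-censored observations might look like the sketch below, which uses only `Surv()` from the survival package. The data frame `d` and its column names are made up for illustration. I have not verified how flexsurvreg() behaves with a zero lower bound, or whether interval censoring can be combined with the counting-process form in one call, so treat passing such a response to it as an assumption to check against the package documentation.

```r
library(survival)

# Hypothetical bounds on each unit's event time; NA marks an open end
d <- data.frame(
  left  = c(NA, 2.0, 3.0, 4.0),   # NA on the left  = left-censored
  right = c(1.5, 2.0, NA, 6.0)    # NA on the right = right-censored
)

# Recode left-censored rows as interval-censored on (0, right]
d$left[is.na(d$left)] <- 0

# "interval2" coding: equal bounds = exact event, NA upper bound = right-censored
y <- Surv(d$left, d$right, type = "interval2")
print(y)
```

After the recoding, the first row is treated as an ordinary interval rather than as left-censored, which is only appropriate if no unit can have had its event at time = 0.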
*The aftreg() function of the R eha package evidently can do that, assuming that the first provided covariate value extends back to time = 0, but it doesn't handle interval censoring.
Thank you so much! Should I see the difficulty in finding packages that can do this as an indication that it is perhaps a bad idea to use AFT models with truncation? Finding packages that handle all kinds of censoring was fairly easy, while anything related to truncation seemed much more difficult, so perhaps the uses for truncation are rarer or more difficult? – Nikolaj Pedersen Dec 16 '23 at 17:09
@NikolajPedersen it's not necessarily "a bad idea to use AFT models with truncation" provided that you don't fall into traps like those that can occur with time-varying covariates. I suspect that what's rare is having to deal with multiple types of censoring and truncation in a single data set. See this page. – EdM Dec 16 '23 at 17:38