I have data from a live-attenuated vaccine study and I want to estimate the distribution of shedding times after vaccination. I have samples collected at multiple time points for each subject, and I can classify the results at each time point as positive or negative for shedding vaccine. Subjects are observed to either never be positive at any time point, or transition from positive to negative at later time points (or never turn negative and thus are right-censored).

If I could assume that every subject who received the vaccine sheds for some finite amount of time, I could model this as a standard interval-censored survival analysis.

But, biologically, there is zero-inflation at the start. If one were to do the study with many more time points, one would see that some subjects never shed at all and others shed for short times that end before our first time point. These are two different states--did the vaccine "take" or not, and if so, how long was it shed for?--and should be modeled as a hurdle-survival model.

This is a pretty common situation in infectious disease challenge studies, but I am unable to locate any papers that focus on this kind of mixture survival model (the opposite of a cure mixture model), let alone an R package.

Absent this "right way" to do it, I will do something easy that has worked before: model the mixture parametrically using the wrong score function. I know a priori that a lognormal survival model is a good one, so I can fit it to the proportion positive at each time point, assuming a binomial scoring function, and add a free parameter, constrained to lie between 0 and 1, for the probability of shedding at the start.
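To make the proposal concrete, here is a minimal sketch of that fit, assuming the probability of being positive at time t is p_take times the lognormal survival function. The function names, the parameter grid, and the counts are all hypothetical and made up for illustration. Treating the time points as independent binomials is the "wrong score function" mentioned above, since the same subjects are sampled repeatedly. A crude grid search stands in for a proper optimizer.

```python
import math

def lognormal_survival(t, mu, sigma):
    """P(shedding duration > t) for a lognormal duration (log-mean mu, log-sd sigma)."""
    z = (math.log(t) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def shedding_probability(t, p_take, mu, sigma):
    """P(positive at time t): the vaccine took AND shedding has not yet stopped."""
    return p_take * lognormal_survival(t, mu, sigma)

def binomial_loglik(params, times, n_tested, n_positive):
    """Binomial log-likelihood (constant term dropped), time points treated as independent."""
    p_take, mu, sigma = params
    if not (0.0 < p_take < 1.0) or sigma <= 0.0:
        return -math.inf
    total = 0.0
    for t, n, y in zip(times, n_tested, n_positive):
        q = shedding_probability(t, p_take, mu, sigma)
        q = min(max(q, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += y * math.log(q) + (n - y) * math.log(1.0 - q)
    return total

def grid_search(times, n_tested, n_positive):
    """Crude maximum-likelihood search over a hypothetical parameter grid."""
    best_ll, best_params = -math.inf, None
    for p_take in [i / 20 for i in range(1, 20)]:
        for mu in [i / 10 for i in range(0, 41)]:
            for sigma in [i / 10 for i in range(2, 21)]:
                ll = binomial_loglik((p_take, mu, sigma), times, n_tested, n_positive)
                if ll > best_ll:
                    best_ll, best_params = ll, (p_take, mu, sigma)
    return best_ll, best_params

# Made-up example data: days after vaccination, subjects tested, subjects positive.
times = [7, 14, 21, 28]
n_tested = [50, 50, 50, 50]
n_positive = [30, 20, 10, 5]

best_ll, best_params = grid_search(times, n_tested, n_positive)
print(best_ll, best_params)
```

With only a few time points, p_take and the lognormal parameters may trade off against each other, so the likelihood surface is worth inspecting rather than trusting a single optimum.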

Any tips?

Thanks!

famulare

1 Answer

The problem I see with your suggestion is that it probably can't distinguish those for whom the vaccination didn't "take" at all from those for whom it "took" quickly and who stopped shedding before the first observation time. That seems to be an important distinction.

As you say about whether the vaccination takes at all and whether shedding then stops, "These are two different states." Consider a multi-state survival model. Your situation seems to be a simple version of the case illustrated in the top right of Figure 1 of the R vignette on Multi-state models and competing risks. You start by getting vaccinated (baseline state at time = 0), and then must go through the "shedding" state before you can enter the "stopped shedding" state. The vignette illustrates ways to fit such multi-state models with the standard survival package, and provides links to other R software that can do such modeling.

Your point about what happens before your first observation times is well considered. State changes prior to the first observation time have left-truncated event times--you have no information about those events or event times whatsoever.* In principle, left truncation is taken directly into account in the Surv(startTime, stopTime, status) counting-process data format used to handle multi-state models: the startTime is treated as a left-truncation time. There can be practical problems if very few cases have the earliest left-truncation times. For parametric models, this page summarizes the forms of the likelihoods for exact, censored, and truncated observations.


*That's to be distinguished from left-censored event times, where you know that an event has already occurred but you don't know exactly when. For example, if someone is already shedding at the first observation time, then the time that shedding started is left-censored.
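For reference, the standard likelihood contributions for a parametric model can be written out as follows. The notation is my own assumption rather than anything taken from the linked page: $f$ is the density of the event time, $S$ its survival function, $F = 1 - S$ its distribution function, $(l, r]$ an observation interval, and $t_0$ a left-truncation (delayed-entry) time.

$$
\begin{aligned}
\text{exact event at } t:&\quad L = f(t) \\
\text{right-censored at } t:&\quad L = S(t) \\
\text{left-censored at } t:&\quad L = F(t) = 1 - S(t) \\
\text{interval-censored in } (l, r]:&\quad L = S(l) - S(r) \\
\text{left-truncated at } t_0 \text{, exact event at } t > t_0:&\quad L = \frac{f(t)}{S(t_0)}
\end{aligned}
$$

Left truncation conditions on survival to $t_0$, so the same division by $S(t_0)$ applies to the censored forms as well.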

EdM