
I have data where participants were assessed at two timepoints: baseline and follow-up. At baseline, participants were categorised based on the presence of a marker (yes = 1, no = 0). At follow-up, participants were examined to determine whether they had developed a certain disease. The time between baseline and follow-up differed between participants.

I am interested in whether the presence of the marker puts participants at greater risk of developing the disease (earlier). I used a Cox proportional hazards regression to answer this question, and the marker turned out to be significant.

However, many participants dropped out before follow-up, i.e., they all have a follow-up time of 0 (months = 0) and disease_time2 = NA. I performed the Cox regression on the participants who did not drop out, and I am concerned about selection bias (right-censoring).

I read that inverse probability weighting (IP weighting) is a way to account for selection bias, but I am unsure whether such a procedure is applicable in my case.

My data looks like this:

ID   disease_time2      months   censored   marker   covariate1  covariate2
a    0                  66        0          0         15           9
b    NA                  0        1          1         .            .
c    1                  30        0          1         .            .
d    NA                  0        1          0         .            .
e    0                  45        0          0         .            .

This is my attempt at IP weighting, based on this book:

############ WEIGHTING ##############
## 1) MARKER
# 1.1) Fit a logistic model for my data, denominator weights for marker
denom.fit <- glm(marker ~ months + covariate1 + covariate2,
                 family = binomial(), data = dat)
# predicted probabilities
predict_denom <- predict(denom.fit, type = "response")

# 1.2) estimation of numerator of ip weights for marker
numer.fit <- glm(marker ~ 1, family = binomial(), data = dat)
predict_num <- predict(numer.fit, type = "response")

## 2) CENSORING
# 2.1) estimation of denominator of ip weights for censored
denom.cens <- glm(censored ~ marker + months + covariate1 + covariate2,
                  family = binomial(), data = dat)
predict_cens_denom <- 1 - predict(denom.cens, type = "response")

# 2.2) estimation of numerator of ip weights for censored
numer.cens <- glm(censored ~ marker, family = binomial(), data = dat_noNA)
predict_cens_num <- 1 - predict(numer.cens, type = "response")

sw.a <- ifelse(dat$disease_time2 == 0,
               (1 - predict_num) / (1 - predict_denom),
               predict_num / predict_denom)
sw.c <- predict_cens_num / predict_cens_denom
sw <- sw.c * sw.a

########### FINAL MODEL WITH WEIGHTS ##########
m2_ip <- coxph(Surv(months, disease) ~ marker, data = dat, weights = sw)
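A minimal sketch of one issue with the censoring model above, using simulated data. The data frame `toy` and all of its values are made up for illustration; only the column names follow the question. Because every dropout has months = 0, a censoring model that includes months predicts dropout perfectly, so the estimated probability of remaining uncensored is essentially 1 for everyone with follow-up and the censoring weights change nothing. Part (b) shows a commonly described alternative, under the assumption that dropout depends only on measured baseline variables: model censoring from baseline covariates alone and weight the participants who were followed up. Whether that assumption holds in the real data is not something this code can check.

```r
library(survival)

set.seed(42)
n <- 300
# Simulated stand-in for `dat`; every value here is invented.
toy <- data.frame(
  marker     = rbinom(n, 1, 0.4),
  covariate1 = rnorm(n, 15, 3),
  covariate2 = rnorm(n, 9, 2)
)
# Assumption of this sketch: dropout depends on baseline variables only.
p_drop <- plogis(-1 + 0.8 * toy$marker + 0.1 * (toy$covariate1 - 15))
toy$censored <- rbinom(n, 1, p_drop)
# As in the question: dropouts have months = 0 and an unknown outcome.
toy$months <- ifelse(toy$censored == 1, 0, round(runif(n, 12, 72)))
toy$disease_time2 <- ifelse(toy$censored == 1, NA, rbinom(n, 1, 0.3))
followed <- toy$censored == 0

# (a) With `months` as a predictor, censoring is perfectly separated
#     (glm warns about fitted probabilities of 0 or 1; suppressed here).
cens_with_months <- suppressWarnings(
  glm(censored ~ marker + months + covariate1 + covariate2,
      family = binomial(), data = toy)
)
p_stay_a <- 1 - predict(cens_with_months, type = "response")
range(p_stay_a[followed])  # expected to be essentially 1: no reweighting

# (b) Baseline-only censoring model, stabilised by a marker-only numerator.
cens_denom <- glm(censored ~ marker + covariate1 + covariate2,
                  family = binomial(), data = toy)
cens_num <- glm(censored ~ marker, family = binomial(), data = toy)
toy$sw_c <- (1 - predict(cens_num, type = "response")) /
            (1 - predict(cens_denom, type = "response"))

# Weighted Cox model on participants with follow-up; robust standard
# errors are requested because the weights are estimated.
fit_w <- coxph(Surv(months, disease_time2) ~ marker,
               data = toy[followed, ], weights = sw_c, robust = TRUE)
summary(fit_w)
```

In this sketch only the followed-up participants enter the Cox model; the dropouts influence the result solely through the weights given to participants who resemble them at baseline.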

Additional question based on comments

Would it make any difference if I had time specifications for dropouts (values for `months` but not for `disease_time2`)?

  • By definition, one cannot perform a time-to-event analysis in units/people who have no follow-up. The addition of propensity analysis does not correct the absence of follow-up time. – Todd D Jun 06 '22 at 14:32
  • Do you have any advice how to account for the missing follow up? I was told to address this issue somehow. @ToddD – a.henrietty Jun 06 '22 at 14:42
  • As I understand your issue, there is no way to correct. – Todd D Jun 06 '22 at 15:18
  • I had the idea to perform some kind of sensitivity analysis/ simulation to determine how stable the results would have been with different outcomes of drop-outs. Have you stumbled across such a procedure by chance? @ToddD – a.henrietty Jun 06 '22 at 15:30
  • One can do a censoring analysis, but not when one group’s outcome is completely determined by a covariate value. – Todd D Jun 06 '22 at 15:34
  • Would it make a difference if we had time specifications (month) for the dropouts? Are non-NA values of disease_time2 (my outcome variable) required to perform ip-weighting? @ToddD – a.henrietty Jun 07 '22 at 07:44
  • I think that my answer to your related question also covers this. If you have no data on disease status after time = 0 for an individual, that individual makes no contribution to the (partial) likelihood of a survival model at subsequent times. Thus there is nothing to "weight" in the survival models for such individuals. – EdM Jun 07 '22 at 16:33
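The point made in the last comment can be illustrated with a small simulation. This is a sketch on made-up data, not the asker's data set: the data frames `followed`, `dropouts`, and `dropouts_timed` are hypothetical, and dropouts are coded here as censored at time 0 (event indicator 0) rather than NA so that they stay in the model frame. Rows censored at time 0 are never in a risk set at any later event time, so adding them should leave the Cox estimate unchanged. If dropouts instead had positive last-contact times, they would be in the risk set until censoring and the estimate would generally move.

```r
library(survival)

set.seed(7)
n_followed <- 150
n_dropout  <- 60

# Participants with follow-up (all values simulated).
followed <- data.frame(
  months  = round(runif(n_followed, 12, 72)),
  disease = rbinom(n_followed, 1, 0.35),
  marker  = rbinom(n_followed, 1, 0.4)
)
# Dropouts: censored at time 0, deliberately with a different marker mix.
dropouts <- data.frame(
  months  = 0,
  disease = 0,
  marker  = rbinom(n_dropout, 1, 0.6)
)

fit_followed <- coxph(Surv(months, disease) ~ marker, data = followed)
fit_all      <- coxph(Surv(months, disease) ~ marker,
                      data = rbind(followed, dropouts))

# The two coefficients should agree up to numerical tolerance.
c(followed_only = unname(coef(fit_followed)),
  with_time0    = unname(coef(fit_all)))

# Hypothetical variant: dropouts have a positive last-contact time.
dropouts_timed <- transform(dropouts,
                            months = round(runif(n_dropout, 1, 60)))
fit_timed <- coxph(Surv(months, disease) ~ marker,
                   data = rbind(followed, dropouts_timed))

# These rows now sit in risk sets, so the estimate generally differs.
c(followed_only = unname(coef(fit_followed)),
  with_timed    = unname(coef(fit_timed)))
```

The same reasoning applies to case weights: a weight attached to a row that never appears in a risk set has no effect on the partial likelihood.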
