
I need some help/advice for tackling the following problem: I am interested in exploring the association between various covariates and the reason for termination from a data registry. The hypothesis/suspicion is that a lower proportion of ethnic/racial minorities will get a kidney transplant rather than dialysis at the time of end-stage kidney disease onset (i.e., time of termination from the registry).

Outcome is reason for termination (the dataset has this as a categorical variable with numerous categories which I narrowed down to only 4 categories: transplant, dialysis, death, other). The "other" category is the most numerous as it contains everybody terminated for any reason other than transplant/dialysis/death, plus over 300 cases who had missing value for reason for termination (these patients might still be in the registry for all we know, there's no way of knowing what happened with them), followed by transplant, dialysis and death (just a couple of cases of death).

Covariates: some are time-independent (sex, race/ethnicity) while others are measured repeatedly at 6-month visits and so are time-dependent (such as lab values, hypertension status, eGFR).

I'm thinking that either option A or B might work here:

A. repeated measures multinomial logistic regression analysis, given the outcome with more than 2 categories, and the time-dependent nature of some of the covariates, or

B. repeated measures competing risks/cause-specific hazards analysis

However, I am not sure if I even need to consider repeated measures here, as the outcome itself (reason for termination) is fixed. Nor am I sure if I even have competing risks per se. I think the idea is that we will see some racial effects where minorities get the transplant less frequently and more often receive dialysis, but I don't really know if we need competing risks for this. Now, if I do need to account for the repeated-measures nature of some covariates, I have no experience with how to do this for multinomial or competing risks models, and I haven't been able to find many resources that are easy to understand and provide specific examples for implementing in R or SAS.

Can somebody please help advise what the appropriate type of analysis would be given the study context/research hypothesis?

Thank you kindly!

1 Answer


A key here is your statement:

over 300 cases who had missing value for reason for termination (these patients might still be in the registry for all we know, there's no way of knowing what happened with them)...

If you don't even know that those individuals have left the registry, then all you know about their time in the registry is a lower limit: the time between entry and last follow-up. In other words, their durations in the registry are right-censored. That situation calls for some type of survival analysis.

Your dichotomy of solutions isn't quite as stark as you might think. The choice depends on the nature of your observation times. If they are effectively continuous, then a competing-risks survival model would be a reasonable choice. If instead you have evaluations of all individuals at, say, regular 6-month intervals, you would have a "discrete-time" survival model that is essentially (in your case) a multinomial regression on a "person-period" data set. For each time interval you would have one row for each individual at risk during that interval, with the covariate values in place during that interval and an indicator of the outcome during that interval (with "no terminating event recorded" being one of the possible outcome values).
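To make the person-period layout concrete, here is a minimal sketch in plain Python (used only because the restructuring step is language-agnostic; the same layout can be built in R or SAS). The patient records, the field names (`egfr`, `exit`) and the `"none"` outcome label are all made up for illustration. It assumes each eGFR value applies to the 6-month interval that starts at that visit, and that a known exit occurs in the patient's last observed interval.

```python
from collections import Counter

# Hypothetical data: one record per patient, eGFR measured at each 6-month visit.
# exit=None means the reason for termination is unknown (treated as censored).
patients = [
    {"id": 1, "race": "A", "egfr": [55, 48, 30], "exit": "transplant"},
    {"id": 2, "race": "B", "egfr": [60, 52], "exit": None},
    {"id": 3, "race": "B", "egfr": [40, 25, 18, 12], "exit": "dialysis"},
]

def to_person_period(patient):
    """Expand one patient into one row per interval at risk."""
    rows = []
    n_intervals = len(patient["egfr"])
    for k, egfr in enumerate(patient["egfr"], start=1):
        is_last = (k == n_intervals)
        # The terminating event is recorded only in the last interval;
        # every other row (and every row of a censored patient) gets "none".
        outcome = patient["exit"] if (is_last and patient["exit"]) else "none"
        rows.append({
            "id": patient["id"],
            "interval": k,
            "race": patient["race"],   # time-fixed covariate, repeated on each row
            "egfr": egfr,              # time-varying covariate for this interval
            "outcome": outcome,
        })
    return rows

person_period = [row for p in patients for row in to_person_period(p)]
outcome_counts = Counter(row["outcome"] for row in person_period)

for row in person_period:
    print(row)
```

A multinomial regression of `outcome` on `interval`, `race` and `egfr` over these rows, with `"none"` as the reference category, would then be the discrete-time model described above. Note that the censored patient simply contributes rows with outcome `"none"` and then stops contributing.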

The R competing risks vignette outlines the procedure for continuous time. The counting-process Surv(startTime, stopTime, eventType) data format allows for time-varying covariate values. With at most one event possible per individual, you don't need to treat this formally as a repeated-measures analysis if your model (like a proportional hazards model) only evaluates covariates at event times. See this page. There are many pages on this site dealing with discrete-time survival; this page contains some references for further study. Discrete-time survival models are binomial regressions if there is only one type of event, but they can be extended to multinomial regression to handle your situation.
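For the continuous-time route, the counting-process layout can be sketched the same way. The snippet below is plain Python and does not call the R survival package; it only illustrates how one individual's visits might be split into (start, stop, event, covariate) rows of the kind that `Surv(startTime, stopTime, eventType)` expects. The function name, the `"censor"` label and the example times (in months) are assumptions for illustration, and each covariate value is assumed to hold from its visit until the next one.

```python
def to_counting_process(visit_times, values, end_time, event_type):
    """Split one individual's follow-up into (start, stop, event, value) rows.

    visit_times: sorted times at which the covariate was measured
    values:      covariate value observed at each visit
    end_time:    time of exit from the registry, or of last follow-up
    event_type:  e.g. "transplant" or "dialysis"; None if censored
    """
    rows = []
    for i, (start, value) in enumerate(zip(visit_times, values)):
        # Each interval runs to the next visit, or to end_time if that comes first.
        next_visit = visit_times[i + 1] if i + 1 < len(visit_times) else end_time
        stop = min(next_visit, end_time)
        if stop <= start:
            break  # no time at risk after end_time
        # Only the interval that ends at end_time can carry the event.
        is_final = (stop == end_time)
        event = event_type if (is_final and event_type) else "censor"
        rows.append((start, stop, event, value))
    return rows

# Hypothetical patient: visits at 0, 6, 12 months, transplant at month 15.
transplanted = to_counting_process([0, 6, 12], [55, 48, 30], 15, "transplant")

# Hypothetical patient with unknown termination: last seen at month 9.
censored = to_counting_process([0, 6], [60, 52], 9, None)

print(transplanted)
print(censored)
```

Each individual has at most one row with an event, and the covariate value on a row is the one in place during that interval, which is what lets a proportional hazards model use time-varying covariates without a formal repeated-measures structure.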

EdM