
I would like to investigate whether socioeconomic status (SES) is associated with the age of onset of mental illness, after controlling for some possible confounders, in a cohort of patients with mental illness. There are no censored observations, as all the data come from a mental illness cohort (every patient has an age of onset of their mental illness).

My questions are:

  1. Can I use Kaplan-Meier curves to compare the "survival time" (from birth to age of onset) across SES groups?
  2. In a regression setting (when adjusting for multiple variables), which model is more appropriate: Cox regression, an accelerated failure time model, or simply a linear regression model with age of onset as a continuous dependent variable? What are the differences between these three models?
Nick Cox
Abby

1 Answer


I would like to investigate whether socioeconomic status (SES) is associated with the age of onset of mental illness, after controlling for some possible confounders, in a cohort of patients with mental illness.

That's tricky and might be misleading, as the results you get won't necessarily model associations of SES with onset of mental illness across the entire population, just within a cohort already known to have developed mental illness. Many individuals never experience the event of being diagnosed with mental illness, while standard survival analysis implicitly assumes that all individuals ultimately experience the event.

From the survival analysis perspective, the situation is probably best represented by a cure model that includes the probability of never having the event along with modeling the time to event for those that experience it. But that can't be done without information about those who never develop mental illness.
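For reference, one common way to write such a model is the mixture cure form below. The symbols are my own notation, not something from your data: $\pi(x)$ is the probability of never having the event given covariates $x$, and $S_u(t \mid x)$ is the survival function among those who eventually do have it.

```latex
S_{\text{pop}}(t \mid x) = \pi(x) + \bigl(1 - \pi(x)\bigr)\, S_u(t \mid x)
```

A cohort made up only of diagnosed patients carries information about $S_u$ but none about $\pi$, which is why the full model can't be fit from it.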

Restricting to the "cohort of patients with mental illness" thus can lead to problems in generalization. For example, there's no assurance that a shorter time-to-event associated with some SES group within that cohort will also mean a higher ultimate probability of the event in that SES group in the broader population. The ultimate probability in the entire population could even be lower in that SES group.
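A small simulation can make that last point concrete. This is only a sketch with made-up parameters (the group labels, probabilities and onset distributions are all hypothetical): group A develops the illness less often than group B, but those in group A who do develop it tend to do so earlier.

```python
import random
import statistics


def simulate_group(n, p_ever, mean_onset, sd_onset, rng):
    """Return onset ages for the members of a group who ever have the event."""
    onsets = []
    for _ in range(n):
        if rng.random() < p_ever:  # this person eventually develops the illness
            onsets.append(max(0.0, rng.gauss(mean_onset, sd_onset)))
    return onsets


rng = random.Random(0)
n = 100_000  # people per group in the whole (hypothetical) population

# Hypothetical parameters: group A is affected less often, but earlier.
onset_a = simulate_group(n, p_ever=0.05, mean_onset=20.0, sd_onset=5.0, rng=rng)
onset_b = simulate_group(n, p_ever=0.15, mean_onset=30.0, sd_onset=5.0, rng=rng)

# What a patients-only cohort shows: mean age of onset among the affected.
mean_a = statistics.mean(onset_a)
mean_b = statistics.mean(onset_b)

# What it cannot show: the ultimate probability of the event in each group.
risk_a = len(onset_a) / n
risk_b = len(onset_b) / n

print(f"mean onset among affected: A={mean_a:.1f}, B={mean_b:.1f}")
print(f"ultimate risk in population: A={risk_a:.3f}, B={risk_b:.3f}")
```

Within the affected cohort, group A looks "worse" (earlier onset), yet its population-level risk is lower. An analysis restricted to patients only ever sees the first of the two printed lines.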

For your application there are further problems, depending on what type of "mental illness" you're investigating. Your cohort presumably consists of individuals who have been diagnosed with some mental illness, and the event time is the date of diagnosis. Yet that date of diagnosis is likely to be well after an individual has experienced mental illness. With some types of mental illness, like depression, many individuals never get formally diagnosed. That leads to the problem that, even if all else is independent of SES, the probability or speed of being diagnosed might well depend on SES. Then you are modeling diagnosis times, not underlying biology, as a function of SES.
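To illustrate the diagnosis-delay issue, here is a minimal sketch in which the true age of onset has exactly the same distribution in two SES groups and only the delay from onset to diagnosis differs. The direction and size of the delay difference are assumptions chosen purely for illustration.

```python
import random
import statistics


def diagnosis_ages(n, mean_delay, rng):
    """Simulate ages at diagnosis; the onset distribution is the same for every group."""
    ages = []
    for _ in range(n):
        onset = rng.uniform(15.0, 35.0)            # identical across SES groups
        delay = rng.expovariate(1.0 / mean_delay)  # years from onset to diagnosis
        ages.append(onset + delay)
    return ages


rng = random.Random(1)

# Hypothetical assumption: one group is diagnosed faster than the other.
fast_group = diagnosis_ages(20_000, mean_delay=1.0, rng=rng)
slow_group = diagnosis_ages(20_000, mean_delay=5.0, rng=rng)

gap = statistics.mean(slow_group) - statistics.mean(fast_group)
print(f"difference in mean age at diagnosis: {gap:.2f} years")
```

The groups differ in recorded "age of onset" even though the underlying onset process is identical, so an SES coefficient from such data would reflect access to diagnosis rather than the illness itself.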

You will have to be very careful in how you interpret any results that you get. All your results will be conditional on an individual being in a cohort like the one you are examining. Yet your cohort might suffer from substantial sampling bias.

Response to specific questions

As kjetil b halvorsen explains on this page, survival data even without censoring are typically best handled with a survival model. Remember that you are modeling a distribution of survival times as a function of covariates. Standard linear regression assumes a normal distribution of outcome values around the point estimate based on predictor values. That assumption might not hold well for survival data.
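On the Kaplan-Meier question: with no censoring, the Kaplan-Meier estimate reduces to the empirical survival function (1 minus the empirical CDF), so the curves are legitimate but carry no extra information beyond that. The sketch below checks this with a hand-rolled estimator on made-up ages; `kaplan_meier` and `empirical_survival` are illustrative helpers written here, not functions from any library.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (t, S(t)) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_removed
    return curve


def empirical_survival(times):
    """Fraction of observations strictly greater than t, at each distinct t."""
    n = len(times)
    return [(t, sum(1 for x in times if x > t) / n) for t in sorted(set(times))]


# Made-up ages of onset; every subject has an observed event (no censoring).
ages = [14, 17, 17, 21, 25, 25, 25, 32, 40, 51]
km = kaplan_meier(ages, [True] * len(ages))
ecdf_surv = empirical_survival(ages)

for (t, s_km), (_, s_emp) in zip(km, ecdf_surv):
    print(f"age {t}: KM = {s_km:.3f}, 1 - ECDF = {s_emp:.3f}")
```

The two columns agree at every age, which is what the absence of censoring implies.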

The choice between a proportional hazards model (Cox semi-parametric, or fully parametric) and an accelerated failure time model depends on how the covariates affect the time to event. See this page for a brief guide to how to evaluate which might work best in your situation.
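In standard textbook notation (my symbols, with $x$ the covariate vector, $h_0$ and $S_0$ the baseline hazard and survival functions, and $\varepsilon$ an error term with a specified distribution), the two families can be written as:

```latex
\text{Proportional hazards:}\quad h(t \mid x) = h_0(t)\, \exp\!\left(x^\top \beta\right)

\text{Accelerated failure time:}\quad \log T = x^\top \gamma + \sigma \varepsilon
\quad\Longleftrightarrow\quad
S(t \mid x) = S_0\!\left(t\, e^{-x^\top \gamma}\right)
```

In the first, covariates multiply the hazard at every time; in the second, they stretch or compress the time axis. A linear regression of age of onset on the covariates is close in spirit to an AFT model, except that it works on the untransformed time scale and assumes normal errors.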

EdM