I have a question regarding the impact that censoring has on the predictions from a survival model. For example, say we want to estimate a client's risk of churning in the next month using a survival model, and we have 24 months of data. In this case, should we censor individuals who churn after 1 month? Or can we use, for example, 12 months as our "study period" (censoring individuals with t > 12 months) and then predict the survival probability S(t = 1 month)? What is the difference in performance? Would it vary between models?
-
If the event is churning within a month, then you should use a logistic analysis: a client churning after 1 month is a non-event. You might recoup a little bit of power using a parametric survival model for those cases that are "borderline". – AdamO Jun 01 '22 at 15:50
-
Thank you for your help! What do you mean by borderline cases? And what parametric survival model would you advise? – tomas_s Jun 01 '22 at 15:53
-
If you used, say, a Cox model, an event occurring at 13 months contributes no more information about events within 12 months than it would in a logistic model. This is because of the baseline hazard function. An exponential model will gain information from events past 12 months, since the intensity is assumed to be constant over time, but the exponential is a very specific process; if it fits the data poorly, the results can be misleading. I don't advise any parametric survival model. Based on my superficial understanding of the problem, I'd say just fit the logistic model. – AdamO Jun 01 '22 at 15:55
-
Thank you once again for the explanation. I already have a classifier predicting churn events in discrete intervals (1 month, 2 months, etc.), and I'm exploring survival analysis to see if I can either improve the performance of the classifier or at least gain some additional insights (for example, with a survival function). In my case I'm less interested in predicting the occurrence of events than in ranking individuals by their risk of churning. Do you have any suggestion for a 'fair' metric to compare both types of models? Right now I'm using the lift curve. – tomas_s Jun 01 '22 at 16:25
1 Answer
You seem to have fundamentally been doing survival analysis all along, but with discrete rather than continuous survival times. With a single event type, discrete-time survival models are essentially a set of binomial regressions describing the probability of an event during each time period. For each month you presumably only evaluated the probability of an event for those who were still at risk; if so, then you already have been incorporating censoring into your model by removing those no longer at risk. See this page and its links, the work of Willett and Singer, or the book-length description of "Modeling Discrete Time-to-Event Data" by Tutz and Schmid.
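As a minimal sketch of the "set of binomial regressions" view, the snippet below expands one record per client into one row per month at risk (the person-period layout) and computes the empirical monthly hazard. The function names `to_person_period` and `discrete_hazard` and the toy data are hypothetical, chosen only for illustration. In practice the expanded rows would feed a binomial regression with month and covariates as predictors.

```python
def to_person_period(records):
    """Expand (id, months_observed, churned) into one row per month at risk.

    months_observed: number of whole months the client was followed (>= 1).
    churned: True if the client churned in the last observed month,
             False if the client was censored after it.
    """
    rows = []
    for client_id, duration, churned in records:
        for month in range(1, duration + 1):
            # The event indicator is 1 only in the month the churn occurred.
            event = 1 if (churned and month == duration) else 0
            rows.append({"id": client_id, "month": month, "event": event})
    return rows


def discrete_hazard(rows):
    """Empirical hazard per month: events divided by clients at risk."""
    at_risk, events = {}, {}
    for r in rows:
        at_risk[r["month"]] = at_risk.get(r["month"], 0) + 1
        events[r["month"]] = events.get(r["month"], 0) + r["event"]
    return {m: events[m] / at_risk[m] for m in sorted(at_risk)}


# Hypothetical toy data: (id, months observed, churned?)
clients = [("a", 1, True), ("b", 3, False), ("c", 2, True), ("d", 3, True)]
rows = to_person_period(clients)
hazard = discrete_hazard(rows)
```

Clients drop out of later months once they churn or are censored, which is how censoring enters the model: a censored client contributes only non-event rows for the months in which they were observed.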
If you don't have continuous values for times to churn events but only monthly values over 24 months, then continuing with a discrete-time model makes a lot of sense. That said, the thesis discussed on this page used a Cox model for customer churn based on monthly data over 36 months.
You should take advantage of all the data that you can in a single model, so analysis informed by survival modeling would be a good idea even if it just ended up as a large binomial model. You model time explicitly, and can make the models as rich as you want and as the data allow in terms of covariates to include. You can decide whether to let the data tell you the baseline survival over time (as in a Cox model) or to impose a parametric shape on the baseline survival. This answer discusses ways to define time = 0 for this type of application. You can choose that reference either as the time a customer started with you, or evaluate data over a common time period and perhaps include prior time as a customer as a covariate.
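On the question of the study period, a small sketch can show what administrative censoring does when the data are left to determine the survival curve. It uses a hand-written Kaplan-Meier estimator, with no covariates, on hypothetical data; the helper names `administratively_censor` and `kaplan_meier` are illustrative assumptions, not taken from any library. The estimate of S(1 month) depends only on the risk sets up to 1 month, so censoring at 12 months leaves it unchanged.

```python
def administratively_censor(data, study_end):
    """Truncate follow-up: times beyond study_end become censored at study_end."""
    return [(t, e) if t <= study_end else (study_end, False) for t, e in data]


def kaplan_meier(data, horizon):
    """Kaplan-Meier estimate of S(horizon) from (time, event) pairs."""
    surv = 1.0
    event_times = sorted({t for t, e in data if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u, _ in data if u >= t)          # still followed at t
        churned = sum(1 for u, e in data if u == t and e)    # events exactly at t
        surv *= 1.0 - churned / at_risk
    return surv


# Hypothetical (months to churn or last contact, churned?) pairs
data = [(0.5, True), (0.8, True), (3.0, False), (7.0, True),
        (13.0, True), (18.0, True), (24.0, False), (24.0, False)]

s1_full = kaplan_meier(data, horizon=1.0)
s1_cens12 = kaplan_meier(administratively_censor(data, 12.0), horizon=1.0)
```

This invariance is a property of the nonparametric estimate. A regression model that shares coefficients or a parametric shape across time, such as a Cox or exponential model, can give different predictions at 1 month when later events are censored, so the comparison is worth checking empirically for each model.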
-
Hello, and thank you for your detailed response. I'm sorry for the confusion, but my survival models are trained using a continuous time variable (when the client was born + when the client churns). My main question is twofold: 1) How does varying the study period (which determines how many clients will be censored) influence the model's performance (e.g. evaluating lift for clients that churn in 1, 2, ..., n month windows)? 2) How can I fairly compare the performance of a classification model (which currently uses lift to evaluate 1, 2, ..., n month churn windows) and the survival model? – tomas_s Jun 08 '22 at 10:54