
I want to fit a logistic regression in which some explanatory variables have null values. The nulls are meaningful: for instance, a continuous explanatory variable capturing 'time since last treatment' is null for individuals who have had no prior treatment. I have a few explanatory variables like this.

I recall a previous CV question/answer suggesting that I can handle these meaningful missing values by adding a binary flag (has_prev_treatment) and an interaction between this flag and the continuous variable, with the missing values in the continuous variable replaced by 0. I can't find that question anymore, so I wanted to double-check with the community whether this is the correct approach for this type of missing value. Additionally, can this method be used when regularisation is applied?
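For concreteness, here is a minimal sketch of what I understand the flagging approach to be, in plain Python. Everything in it is an assumption for illustration: the helper names (`build_features`, `fit_logistic_l2`), the synthetic data, the "true" coefficients, and the choice of an L2 penalty fitted by gradient descent. One thing I noticed while writing it: once the missing values are filled with 0, the product of the flag and the filled variable is identical to the filled variable itself, so the sketch uses just two columns (the flag and the zero-filled value) rather than adding a separate interaction column.

```python
import math
import random


def build_features(time_since):
    """Map a possibly-missing value to [has_prev_treatment, zero-filled value]."""
    has_prev = 0.0 if time_since is None else 1.0
    filled = 0.0 if time_since is None else float(time_since)
    return [has_prev, filled]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_l2(X, y, lam=0.1, lr=0.1, n_iter=1000):
    """Gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    The intercept is left unpenalised (a common convention, assumed here).
    """
    n, p = len(X), len(X[0])
    b, w = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad_b, grad_w = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
    return b, w


# Synthetic data (hypothetical): about 40% of individuals have no prior treatment.
random.seed(0)
raw = [None if random.random() < 0.4 else random.uniform(0.0, 5.0)
       for _ in range(200)]
X = [build_features(t) for t in raw]

# Hypothetical data-generating model: logit = -1 + 1.5 * flag - 0.5 * time.
y = [1.0 if random.random() < sigmoid(-1.0 + 1.5 * f - 0.5 * x) else 0.0
     for f, x in X]

intercept, coefs = fit_logistic_l2(X, y)
print("intercept:", intercept)
print("has_prev_treatment coefficient:", coefs[0])
print("time_since_filled coefficient:", coefs[1])
```

In this parameterisation the flag coefficient would act as the offset between untreated individuals and treated individuals at time 0, and the second coefficient as the slope among the treated only. With a penalty, both coefficients are shrunk, and the amount of shrinkage depends on the scale of each column.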

Update: I found these related questions, but I'm uncertain whether they agree on the use of the flagging method:

Could I treat the missing in this case as variables that are 'non applicable' for the observation?

Handling NAs in a regression ?? Data Flags?

The flagging method is the accepted answer here: How to handle non existent (not missing) data?

Meep
  • 284
