I am conducting a logistic regression with a binary outcome (start vs. not start). My predictors are all either continuous or dichotomous variables.
According to the Box-Tidwell approach, one of my continuous predictors potentially violates the assumption of linearity of the logit. However, the goodness-of-fit statistics give no indication that the fit is problematic.
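For reference, the Box-Tidwell check amounts to adding an $x \ln x$ term for the continuous predictor and testing whether its coefficient differs from zero. Below is a minimal sketch in plain Python on simulated data. The data, the sample size, and the function names `solve`, `fit_logit`, and `box_tidwell_demo` are illustrative assumptions and are not taken from the actual analysis.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_logit(X, y, max_iter=50, tol=1e-8):
    """Newton-Raphson logistic regression; returns (coefficients, std. errors)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    info[a][c] += w * xi[a] * xi[c]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    # Standard errors from the inverse information matrix, which was
    # evaluated just before the final (negligible) Newton step.
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve(info, unit)[j]))
    return beta, se

def box_tidwell_demo(seed=0, n=500):
    """Simulate a truly linear logit and test the x*ln(x) term (needs x > 0)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.5, 5.0) for _ in range(n)]
    ys = []
    for x in xs:
        p = 1.0 / (1.0 + math.exp(-(-1.5 + 0.8 * x)))  # assumed true model
        ys.append(1 if rng.random() < p else 0)
    # Design matrix: intercept, x, and the Box-Tidwell term x * ln(x).
    X = [[1.0, x, x * math.log(x)] for x in xs]
    beta, se = fit_logit(X, ys)
    z = beta[2] / se[2]                                # Wald statistic
    p_value = math.erfc(abs(z) / math.sqrt(2.0))       # two-sided normal p-value
    return beta, se, z, p_value

beta, se, z, p_value = box_tidwell_demo()
print("x*ln(x) coefficient:", beta[2], "SE:", se[2], "z:", z, "p:", p_value)
```

A small p-value for the $x \ln x$ coefficient is what flags a possible departure from linearity in the logit. Note that the term is only defined for strictly positive values of the predictor.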
I then re-ran the regression model twice, replacing the original continuous variable first with its square root transformation and then with a dichotomized version of the variable.
On inspection of the output, it seems that goodness-of-fit improves marginally but residuals become problematic. Parameter estimates, standard errors, and $\exp(\beta)$ remain relatively similar. The interpretation of the data does not change in terms of my hypothesis, across the 3 models.
Therefore, in terms of usefulness of my results and sense of interpretation of data, it seems appropriate to report the regression model using the original continuous variable.
My questions are:
- When is logistic regression robust against the potential violation of the linearity of logit assumption?
- Given my above example, does it seem acceptable to include the original continuous variable in the model?
- Are there any references or guides out there for recommending when it is satisfactory to accept that the model is robust against the potential violation of linearity of the logit?