I am building a Logistic Regression model (in sklearn) and want to verify that the assumption of linearity between X and the logit (the log-odds of the outcome) holds.
I am using Python, so I am looking for an alternative to Box-Tidwell (because, as far as I am aware, coding this up is not as easy as in SPSS). I have devised what I think is an appropriate alternative, but I wanted to double-check that my reasoning is correct.
What I have done is:
1. Built my model and fitted it to some training data.
2. Sampled 100 evenly spaced points between the min and max of my independent variable X and calculated the probability my model predicts for each point (using the predict_proba function).
3. Plotted the sampled X points against the logit of the predicted probability and observed a linear relationship (note that predict_proba returns the probability of each sample belonging to each class, so I just picked one of the classes).

Doing this makes me believe that I have not violated the assumption, but am I unknowingly already assuming linearity in this method? Is this method valid?
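For reference, the steps above can be sketched roughly as follows. This is only a minimal sketch, assuming a single feature and a binary target: the synthetic data and the names (x_grid, proba, max_dev) are placeholders for illustration, not my real data or code. Instead of plotting, it fits a straight line to the logit values and reports the largest deviation from that line.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder synthetic data (assumption): one feature X, binary target y.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
p_true = 1 / (1 + np.exp(-(0.5 + 1.2 * X[:, 0])))
y = rng.binomial(1, p_true)

# Step 1: build the model and fit it to the training data.
model = LogisticRegression().fit(X, y)

# Step 2: 100 evenly spaced points between min(X) and max(X),
# and the model's predicted probability at each point.
x_grid = np.linspace(X.min(), X.max(), 100).reshape(-1, 1)
proba = model.predict_proba(x_grid)[:, 1]  # column 1 = P(y == 1)

# Step 3: logit (log-odds) of the predicted probabilities.
logit = np.log(proba / (1 - proba))

# Numeric stand-in for eyeballing the plot: fit a straight line to
# (x_grid, logit) and measure the largest deviation from it.
slope, intercept = np.polyfit(x_grid[:, 0], logit, 1)
max_dev = np.max(np.abs(logit - (slope * x_grid[:, 0] + intercept)))
print("max deviation from a straight line:", max_dev)
```

Plotting x_grid[:, 0] against logit (for example with matplotlib) gives the plot described in step 3.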
Thanks!
EDIT: I know this type of question has been asked a lot. However, the answers all go with the Box-Tidwell method (or, in a few cases, ANOVA), which is something I want to try to avoid. So I just wanted to check whether the method I used is valid :)