I would like to perform inference on a binary classification problem.
I have a logistic regression with a mix of binary and continuous inputs. I would like to perform a feature cross (i.e. add some non-linear interaction terms) in my regression and then infer how these variables increase or decrease the probability of observing a positive result. In other words, I would like to be able to make statements about my predictors while taking into account the non-linear interactions between them.
So if I have a binary variable x1 and a continuous variable x2 (ranging from 0 to 1) and perform a simple logistic regression using statsmodels, for example, I might get back something along the lines of...
| variable | coef | other columns such as std error, p-value, etc. |
|---|---|---|
| constant | -4 | ~ |
| x1 | -2 | ~ |
| x2 | -0.2 | ~ |
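A minimal sketch of this setup (the data, variable names, and coefficient values below are simulated/illustrative, not from any real dataset) might look like:

```python
# Illustrative sketch: simulate a binary x1 and a continuous x2 on [0, 1],
# generate outcomes from a logistic model with made-up coefficients that
# mirror the table above, and fit the main-effects model with statsmodels.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 20_000
df = pd.DataFrame({
    "x1": rng.integers(0, 2, n),   # binary predictor
    "x2": rng.uniform(0, 1, n),    # continuous predictor on [0, 1]
})
# True linear predictor (coefficients are made up for the sketch)
eta = -4 - 2 * df["x1"] - 0.2 * df["x2"]
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-eta)))

model = smf.logit("y ~ x1 + x2", data=df).fit(disp=0)
print(model.summary())   # coef, std err, z, p-value per term
```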
From this, assuming everything is significant, I might conclude that increases in x1 and x2 are both associated with a decreased probability of observing a positive result. But suppose I also believed there might be a non-linear effect and wanted to fit a logistic regression whose linear predictor includes an interaction term instead: a*x1 + b*x2 + c*x1*x2.
I might then get the following output from statsmodels:
| variable | coef | other columns such as std error, p-value, etc. |
|---|---|---|
| constant | -4 | ~ |
| x1 | -3 | ~ |
| x2 | -0.2 | ~ |
| x1*x2 | 6 | ~ |
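In statsmodels formula notation the interaction model above can be written with the `*` operator, which expands `x1 * x2` to `x1 + x2 + x1:x2` (where `x1:x2` is the product term). A sketch on simulated data whose true coefficients mirror the table above:

```python
# Illustrative sketch: fit the interaction model. Data and coefficient
# values are simulated to mirror the coefficient table above.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 20_000
df = pd.DataFrame({
    "x1": rng.integers(0, 2, n),   # binary predictor
    "x2": rng.uniform(0, 1, n),    # continuous predictor on [0, 1]
})
# True linear predictor with an interaction (made-up coefficients)
eta = -4 - 3 * df["x1"] - 0.2 * df["x2"] + 6 * df["x1"] * df["x2"]
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-eta)))

# "x1 * x2" expands to x1 + x2 + x1:x2
fit = smf.logit("y ~ x1 * x2", data=df).fit(disp=0)
print(fit.params)   # Intercept, x1, x2, x1:x2
```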
I can see that the non-linear term increases the probability of a positive result, but the coefficient for x2 has also changed. That is expected, but the inference is no longer simple. If there were no error on the coefficient estimates, it would be easy to say that when x1 = 1 and x2 is above 3/5.8, the probability of a positive result increases: -3*x1 - 0.2*x2 + 6*x1*x2 > 0 simplifies to -3 + 5.8*x2 > 0 when x1 = 1. But since my coefficients are estimates with standard errors rather than exact values, I can't just do this simple algebra.
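One way to do this algebra while accounting for estimation error is the delta method: the log-odds effect of switching x1 from 0 to 1 at a fixed x2 is the linear combination beta_x1 + beta_{x1:x2} * x2, and `t_test()` on the fitted model propagates the joint covariance of the estimates through that combination. A sketch, again on simulated data with made-up coefficients mirroring the tables above:

```python
# Illustrative sketch: test the x1 effect at fixed values of x2 using the
# joint covariance of the estimates (delta method via t_test).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 20_000
df = pd.DataFrame({"x1": rng.integers(0, 2, n), "x2": rng.uniform(0, 1, n)})
eta = -4 - 3 * df["x1"] - 0.2 * df["x2"] + 6 * df["x1"] * df["x2"]
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-eta)))
fit = smf.logit("y ~ x1 * x2", data=df).fit(disp=0)

# Parameter order is (Intercept, x1, x2, x1:x2); the contrast vector
# picks out beta_x1 + x2_val * beta_{x1:x2}.
for x2_val in (0.3, 0.5, 0.7):
    res = fit.t_test([0, 1, 0, x2_val])
    eff = np.asarray(res.effect).ravel()[0]
    pval = np.asarray(res.pvalue).ravel()[0]
    print(f"x2 = {x2_val:.2f}: effect of x1 = {eff:.3f}, p = {pval:.3g}")
```

With the simulated coefficients, the x1 effect flips sign around x2 = 0.5, and the test at each x2 tells you whether the effect is distinguishable from zero given the standard errors.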
Ultimately, is there some way to gauge whether or not variables like x1 and x2 in fact increase the probability of observing a positive result when factoring in non-linear interactions like the one I outlined? Is there perhaps a simpler way to perform such an analysis that I have overlooked (maybe logistic regression is not the way to go)? Finally, would such an approach change if x1 and x2 were both continuous, or both binary?
I am usually more interested in the construction of classifiers than in the inferences one can make from the fitted model, so I apologize if this question comes across as naïve.
> margins. However, the corresponding `margeff` in statsmodels only supports single-column terms, not multi-column terms like interaction effects. The marginal effect in nonlinear models depends on the values of the explanatory variables, and the effect might not be monotonic over the space of explanatory variables. – Josef Oct 10 '22 at 17:03
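Given that limitation, a common workaround is to compute the effect of x1 manually as an average predicted-probability contrast: predict everyone with x1 forced to 1, then forced to 0, and average the difference. Because prediction goes through the full formula, the interaction term is handled automatically. A hedged sketch on the same kind of simulated data as above:

```python
# Illustrative workaround sketch: average marginal effect of x1 computed as
# a predicted-probability contrast, since get_margeff() treats x1 and x1:x2
# as separate columns. Data and coefficients are simulated/made up.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 20_000
df = pd.DataFrame({"x1": rng.integers(0, 2, n), "x2": rng.uniform(0, 1, n)})
eta = -4 - 3 * df["x1"] - 0.2 * df["x2"] + 6 * df["x1"] * df["x2"]
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-eta)))
fit = smf.logit("y ~ x1 * x2", data=df).fit(disp=0)

# Predict with x1 set to 1 for everyone, then 0 for everyone; the
# interaction is included because predict() applies the whole formula.
diff = fit.predict(df.assign(x1=1)) - fit.predict(df.assign(x1=0))
ame_x1 = diff.mean()
print(f"Average marginal effect of x1: {ame_x1:.4f}")
```

As Josef's comment notes, the effect is not monotonic over the predictor space: with these simulated coefficients the per-observation difference is negative at low x2 and positive at high x2, so a single averaged number hides the interaction.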