
I see many questions on this topic, but I promise none seem to explain what I'm after.

I want to understand how to tie the coefficients I get from a logistic regression model back to the model equation that relates log odds to the sum of coefficients and variables.

Consider this example. Below are the predicted values for out-of-sample data.

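from sklearn.linear_model import LogisticRegression
log_model = LogisticRegression()
log_model.fit(X_train, y_train)  # fit step not shown originally; X_train, y_train are assumed names
y_pred = log_model.predict_proba(X_test)  # one column per class; column 1 is P(class 1)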

So the predicted values are probabilities (between 0 and 1) of being in class 1. I will call these $\hat{y}_{i}$.

The coefficients are the following:

 log_model.coef_     
 array([[-0.9495, 3.4599]])
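For reference, here is a minimal sketch of how these attributes are laid out. It assumes synthetic two-feature data from `make_classification` and a `StandardScaler` step, neither of which is the data or preprocessing used above. For a binary problem `coef_` holds one row with one entry per column of the input, in column order. By default (`fit_intercept=True`), scikit-learn also fits an intercept and stores it separately in `intercept_`, whether or not the features were scaled.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic two-feature data: an assumption for illustration only.
X, y = make_classification(n_samples=500, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# Standardize the features (zero mean, unit variance per column).
X_scaled = StandardScaler().fit_transform(X)

log_model = LogisticRegression().fit(X_scaled, y)

print(log_model.coef_.shape)       # one row, one column per feature
print(log_model.intercept_.shape)  # the intercept lives in its own attribute
print(log_model.get_params()["fit_intercept"])  # default setting of the estimator
```

If the intercept really should be left out of the equation, the model would have to be created with `fit_intercept=False`; otherwise `intercept_` belongs in the sum alongside the coefficients.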

I'm going to call the dependent variable (as output by y_pred) $y$ and the independent variables $x$ and $z$.

Now is this the correct model equation?

$$\log\left(\frac{y}{1-y}\right) = -0.9495\,x + 3.4599\,z$$

Is my understanding correct that the model (in Python) gives us predicted probabilities, but that under the hood the linear combination of coefficients and variables is actually modelling the log odds? (I understand that one can go from probabilities to log odds and vice versa.)
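One way to check this relationship numerically is sketched below. It assumes synthetic two-feature data from `make_classification` rather than the data above, and the variable names are illustrative. The idea is to rebuild the log odds by hand from `coef_` and `intercept_`, then compare them with the log odds recovered from the class 1 column of `predict_proba` and with `decision_function`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-feature data: an assumption for illustration only.
X, y = make_classification(n_samples=500, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

log_model = LogisticRegression().fit(X_train, y_train)

# Probability of class 1 is the second column of predict_proba.
p_hat = log_model.predict_proba(X_test)[:, 1]

# Log odds rebuilt by hand: intercept + coefficient-weighted sum of features.
log_odds_manual = X_test @ log_model.coef_.ravel() + log_model.intercept_[0]

# Log odds recovered from the predicted probabilities.
log_odds_from_proba = np.log(p_hat / (1 - p_hat))

# Both comparisons should agree up to floating-point tolerance.
print(np.allclose(log_odds_manual, log_odds_from_proba))
print(np.allclose(log_odds_manual, log_model.decision_function(X_test)))
```

Going the other way, applying $1 / (1 + e^{-t})$ to the hand-built log odds $t$ gives back the predicted probabilities.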

vvv
  • You are incorrect. The dependent variable is an indicator variable taking the value 1 if the event of interest occurs and 0 otherwise. The expectation of the dependent variable is the probability that the event of interest occurs. The coefficients are the marginal changes in log odds as a result of changes in relevant marginal variables. – Jesper for President Mar 19 '24 at 08:15
  • Why is there no intercept in the equation? – Happy Cretine Mar 19 '24 at 12:07
  • log_model.predict_proba(X_test) outputs probabilities, so it's not an indicator variable. And I'm not interpreting the coefficients; I am just trying to relate the coefficients to the output probabilities and make sure the equation is what I think it is. There is no intercept because I scaled the variables. – vvv Mar 19 '24 at 16:26
