3

I know what an odds is. It's the ratio of the probability of some event happening to the probability of it not happening. So, in the context of classification, if the probability that an input feature vector $X$ belongs to class 1 is $p(X)$, then the odds is:

$O = \frac{p(X)}{1-p(X)}$

This is what I don't understand: when we already have the probability, why do we need odds here at all?
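For concreteness, here is a tiny Python sketch of the odds computation (the probability 0.8 is just a made-up value for illustration):

```python
# Odds corresponding to a probability p: p / (1 - p)
p = 0.8             # assume the model says X belongs to class 1 with probability 0.8
odds = p / (1 - p)  # 0.8 / 0.2 = 4, i.e. "4 to 1" in favour of class 1
print(odds)         # ~4.0
```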

Dhiraj
  • 251
  • 1
    What you called an odds ratio is actually an odds. The odds ratio is the ratio of two odds. I edited the question to correct this. – Maarten Buis Apr 03 '17 at 08:43
  • 3
    You may be interested in the concept of link functions – Maarten Buis Apr 03 '17 at 08:47
  • I don't see why it's so hard to understand. What if you just move the log to the other side of the equation? – SmallChess Apr 03 '17 at 10:50
  • I got confused about what I wanted to ask -- but the main thing is that I was not able to get an intuitive feeling for why the concept of odds is required in the context of logistic regression. This is what I have explained in the answer. Maybe I didn't ask the right question to begin with, but this is what I wanted to understand. – Dhiraj Apr 03 '17 at 11:00
  • Rephrased the question to match the answer I have already posted. – Dhiraj Apr 05 '17 at 22:54

3 Answers

4

I think I figured out the answer myself after doing a bit of reading, so I thought I'd post it here. It looks like I got a little confused.

So, as per my post,

$$O = \frac{P(X)}{1-P(X)}.$$

So I forgot to take into account that $P(X)$ is itself the probability given by the logistic function:

$$P_\beta(X) = \frac{e^{\beta^TX}}{1 + e^{\beta^TX} }.$$

Substituting this into the equation for $O,$ we get

$$O = \frac{\frac{e^{\beta^TX}}{1 + e^{\beta^TX} }}{1-\frac{e^{\beta^TX}}{1 + e^{\beta^TX} }} = e^{\beta^TX}.$$
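One can sanity-check this identity numerically. Below is a minimal Python sketch (the particular values of $\beta$ and $X$ are arbitrary, chosen only for illustration):

```python
import numpy as np

# Arbitrary example coefficients and feature vector (made-up values)
beta = np.array([0.5, -1.2, 2.0])
X = np.array([1.0, 0.3, 0.7])    # leading 1.0 plays the role of the intercept

z = beta @ X                      # beta^T X
p = np.exp(z) / (1 + np.exp(z))   # logistic function P_beta(X)

odds = p / (1 - p)
print(odds, np.exp(z))            # both print the same value: e^{beta^T X}
```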

So $e^{\beta^TX}$ is nothing but the odds that the input feature vector $X$ belongs to the positive class. With further algebraic manipulation we can obtain a linear form, and the reason for doing this is to be able to interpret the coefficient vector $\beta$ in a precise manner. That manipulation is simply taking the natural log of the latest form of $O$, namely $e^{\beta^TX}$,

i.e.

$$\ln(O) = \ln \left(e^{\beta^TX}\right) =\beta^TX $$

So, expanding $\beta^TX$, this becomes:

$$\ln(O) = \beta_0+\beta_1x_1+\beta_2x_2+\cdots+\beta_nx_n$$

So the real use of this, as I have understood it, is to be able to interpret the coefficients easily while keeping a linear form, just as in multiple linear regression. Looking at the expanded form of $\ln(O)$, we can say that a unit increase in $x_i$ increases the log odds by $\beta_i$, holding the other features fixed.
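To see this interpretation in action, here is a small sketch (the coefficients are toy values, not fitted to any data) showing that increasing $x_1$ by one unit adds exactly $\beta_1$ to the log odds, which multiplies the odds themselves by $e^{\beta_1}$:

```python
import numpy as np

beta = np.array([0.5, -1.2, 2.0])  # beta_0 (intercept), beta_1, beta_2

def log_odds(x1, x2):
    # ln(O) = beta_0 + beta_1 * x1 + beta_2 * x2
    return beta[0] + beta[1] * x1 + beta[2] * x2

before = log_odds(0.3, 0.7)
after = log_odds(1.3, 0.7)              # x1 increased by one unit

print(after - before)                   # -1.2, exactly beta_1
print(np.exp(after) / np.exp(before))   # exp(-1.2): odds multiplied by e^{beta_1}
```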

User1865345
  • 8,202
Dhiraj
  • 251
  • Yup. The main reason is to try and obtain a linear combination that is interpretable in a similar way to linear regression. Not to say different ways can't be useful, but this one is very neat. – Diego Queiroz Sep 10 '22 at 13:31
0

In the equation $$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1X,$$ the range of the right-hand side is $(-\infty,+\infty)$, while without the log the range of the left-hand term, $\frac{p}{1-p}$, is only $(0,\infty)$.

Without the log, we would be using predictors whose range is $(-\infty,+\infty)$ to map onto values in $(0,\infty)$, which is not possible. The log transforms the range of $\frac{p}{1-p}$ from $(0,\infty)$ to $(-\infty,+\infty)$.
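A quick numerical illustration of this range argument (a sketch; the probabilities are arbitrary):

```python
import numpy as np

# As p sweeps from 0 to 1, the odds stay in (0, inf),
# but the log odds cover the whole real line (-inf, +inf).
for p in [0.001, 0.25, 0.5, 0.75, 0.999]:
    odds = p / (1 - p)
    print(p, odds, np.log(odds))  # e.g. p=0.5 -> odds=1 -> log odds=0
```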

0

Why do we need the natural log of the odds in logistic regression?

The log of the odds is logistic regression by definition. So the question is really: why do we need logistic regression? Several existing questions already deal with this:

Difference between logit and probit models

Why sigmoid function instead of anything else?

What is the difference between linear regression and logistic regression?