
I made a logistic regression model using glm in R, with two independent variables. How can I plot the decision boundary of my model on a scatter plot of the two variables? For example, how can I produce a figure like the one here?

2 Answers

set.seed(1234)

# simulate two predictors and a binary response that is a deterministic
# function of them (this is what causes the perfect separation noted below)
x1 <- rnorm(20, 1, 2)
x2 <- rnorm(20)

y <- sign(-1 - 2 * x1 + 4 * x2)
y[y == -1] <- 0

df <- cbind.data.frame(y, x1, x2)

# fit the logistic regression
mdl <- glm(y ~ ., data = df, family = binomial)

# decision boundary: the line where the linear predictor equals zero,
# i.e. x2 = -coef[1]/coef[3] - (coef[2]/coef[3]) * x1
slope <- coef(mdl)[2] / (-coef(mdl)[3])
intercept <- coef(mdl)[1] / (-coef(mdl)[3])

library(lattice)
xyplot(x2 ~ x1, data = df, groups = y,
       panel = function(...) {
           panel.xyplot(...)
           panel.abline(intercept, slope)
           panel.grid(...)
       })

[Figure: scatter plot of x2 against x1 with points grouped by y and the fitted decision boundary drawn as a line]

I must remark that perfect separation occurs here, so the glm function gives you a warning. That is not important here, as the purpose is to illustrate how to draw the linear boundary with the observations colored by class.
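If you prefer ggplot2, here is a minimal equivalent sketch (assuming the df, slope, and intercept objects created above):

library(ggplot2)

ggplot(df, aes(x1, x2, colour = factor(y))) +
    geom_point() +
    geom_abline(intercept = intercept, slope = slope)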

suncoolsu

I wanted to address the question Fernando asked in a comment on the accepted answer above: can someone explain the logic behind the slope and intercept?

The hypothesis for logistic regression takes the form:

$$h_{\theta} = g(z)$$

where $g(z)$ is the sigmoid function and $z$ is of the form:

$$z = \theta_{0} + \theta_{1}x_{1} + \theta_{2}x_{2}$$

Given we are classifying between 0 and 1, we predict $y = 1$ when $h_{\theta} \geq 0.5$, which, given the sigmoid function, is true when:

$$\theta_{0} + \theta_{1}x_{1} + \theta_{2}x_{2} \geq 0$$

The above is the decision boundary; dividing through by $\theta_{2}$ (note the inequality flips if $\theta_{2} < 0$, though the boundary line itself is unchanged), it can be rearranged as:

$$x_{2} \geq \frac{-\theta_{0}}{\theta_{2}} + \frac{-\theta_{1}}{\theta_{2}}x_{1}$$

This is an equation of the form $y = mx + b$, and you can then see why $m$ and $b$ are calculated the way they are in the accepted answer.
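As a quick sanity check of this in R, any point lying on that line should get a fitted probability of (numerically) 0.5. A small sketch, assuming the mdl, slope, and intercept objects from the accepted answer:

x1_grid <- c(-2, 0, 2)                                    # a few arbitrary x1 values
boundary <- data.frame(x1 = x1_grid,
                       x2 = intercept + slope * x1_grid)  # points on the fitted line
predict(mdl, newdata = boundary, type = "response")       # all approximately 0.5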

Andy
    Good explanation accompanying the answer above! – Augustin Dec 29 '15 at 11:04
  • If we classify $y=1$ based on $h_θ ≥ t$ for a general $t$ between 0 and 1, we have $θ_0 + θ_1 x_1 + θ_2 x_2 ≥ \text{log odds}(t)$. – husB Dec 06 '23 at 10:32
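To make that last comment concrete, a small sketch (assuming the mdl object from the accepted answer and an arbitrary example cutoff t = 0.7): the slope of the boundary stays the same, and only the intercept shifts by the log odds of t divided by $\theta_2$.

t <- 0.7                                                  # example cutoff other than 0.5
slope_t <- coef(mdl)[2] / (-coef(mdl)[3])                 # unchanged
intercept_t <- (qlogis(t) - coef(mdl)[1]) / coef(mdl)[3]  # qlogis(t) is the log odds of t
# with t = 0.5, qlogis(t) = 0 and this reduces to the intercept used above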