
I'm still a bit confused about this question: is the decision boundary of a logistic classifier linear? I followed Andrew Ng's machine learning course on Coursera, and he mentioned the following:

enter image description here

It seems to me there is no single answer: whether the decision boundary is linear or non-linear depends on the hypothesis function $H_\theta(X)$, where $X$ is the input and $\theta$ are the parameters of the model.

Could you please help me clear up this doubt?

2 Answers

5

Several different things can be meant by "non-linear" (cf. this great answer: How to tell the difference between linear and non-linear regression models?). Part of the confusion behind questions like yours often comes from ambiguity about the term non-linear. It will help to get clearer on that (see the linked answer).

That said, the decision boundary for the model you display is a 'straight' line (or rather a flat hyperplane) in the appropriate higher-dimensional feature space. That is hard to see, because the space is four-dimensional. However, seeing an analog of this issue in a different setting, one that can be completely represented in a three-dimensional space, might break through the conceptual logjam. You can see such an example in my answer here: Why is polynomial regression considered a special case of multiple linear regression?
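
To make the "flat in feature space" idea concrete, here is a minimal sketch. It assumes a four-dimensional feature vector $(x_1, x_2, x_1^2, x_2^2)$ and hypothetical parameters chosen so that the boundary is the unit circle in the original plane; neither is taken from the image in the question. The score is a plain dot product with the features, so the boundary is a hyperplane in the four-dimensional space even though it looks curved in $(x_1, x_2)$.

```python
import numpy as np

def features(x1, x2):
    # Map a 2-D input to the assumed 4-D feature vector (x1, x2, x1^2, x2^2).
    return np.array([x1, x2, x1**2, x2**2])

# Hypothetical parameters (an assumption for illustration only).
bias = -1.0
weights = np.array([0.0, 0.0, 1.0, 1.0])

def score(x1, x2):
    # Linear in the features: bias + weights . features(x1, x2).
    return bias + weights @ features(x1, x2)

def predict(x1, x2):
    # sigmoid(score) >= 0.5 exactly when score >= 0.
    return int(score(x1, x2) >= 0)

print(predict(0.0, 0.0))  # a point inside the unit circle
print(predict(2.0, 0.0))  # a point outside the unit circle
print(score(1.0, 0.0))    # a point on the circle, where the score is zero
```

With these assumed parameters the score is zero exactly where $x_1^2 + x_2^2 = 1$: a circle in the input plane, but a flat set in the feature coordinates.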

0

If you do no feature engineering, that is, the input to the sigmoid is simply ax+b, then you will get a linear boundary. Thresholding the sigmoid output at 0.5 is the same as thresholding its input at 0, so

h(x) = 1 for ax+b > 0 and 
h(x) = 0 for ax+b < 0
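
As a short worked step (writing $\sigma$ for the sigmoid, a notation assumed here), a probability threshold of 0.5 places the boundary where the sigmoid's argument is zero:

$$
\sigma(z) = \frac{1}{1 + e^{-z}} \ge \frac{1}{2}
\iff 1 + e^{-z} \le 2
\iff e^{-z} \le 1
\iff z \ge 0, \qquad z = ax + b .
$$

So the boundary is the set of points where $ax + b = 0$, which is linear in $x$.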

It is when you do feature engineering that the boundary starts to become nonlinear in the 2D plane. Let's say your input is now not just x but x, x^2, and x^3.

Now your decision boundary might look something like this,

h(x) = 1 for ax+bx^2+cx^3+d > 0 and 
h(x) = 0 for ax+bx^2+cx^3+d < 0, 

which is non-linear in x.
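
A rough sketch of the cubic case, with hypothetical coefficients picked so that the score factors as x(x-1)(x+1). The coefficient values and the helper names `score` and `predict` are assumptions for illustration. The boundary is wherever the score crosses zero, and here the predicted class flips three times along x, which a single linear threshold on x cannot do.

```python
import numpy as np

# Hypothetical coefficients for the score a*x + b*x^2 + c*x^3 + d.
a, b, c, d = -1.0, 0.0, 1.0, 0.0

def score(x):
    # Input to the sigmoid: a cubic polynomial in x.
    return a * x + b * x**2 + c * x**3 + d

def predict(x):
    # sigmoid(score) > 0.5 exactly when score > 0.
    return int(score(x) > 0)

# Boundary points are the real roots of the cubic.
# np.roots expects coefficients from the highest degree down.
roots = np.roots([c, b, a, d])
boundary = np.sort(roots[np.isreal(roots)].real)

print(boundary)
print([predict(x) for x in (-2.0, -0.5, 0.5, 2.0)])
```

The model is still linear in the engineered features (x, x^2, x^3); only its boundary in terms of the raw input x is non-linear.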