
In binary classification, we usually take our label $t$ as a number, that is either $\{-1, +1\}$ or $\{0, 1\}$.

But practically speaking, it can be any number: $\{1, 2\}$ for instance also makes intuitive sense, item $1$ or item $2$.

But does such a choice affect the learning algorithm, or learning theory itself?

    To get useful answers, I think you'll need to clarify if you're asking about how machine learning software works (where such codings definitely matter) or whether you're asking if you can re-write a learning algorithm using one choice of encoding with a different choice of encoding & how to do so. – Sycorax Feb 06 '20 at 04:56

1 Answer


Yes, it does -- for some binary classification procedures. For example, gradient boosting is formulated around the assumption that the labels are $-1$ or $+1$ (see the "Boosting and Additive Trees" chapter in *The Elements of Statistical Learning*). That's mainly because of the exponential loss that boosting uses, $e^{-yF(x)}$, where you can easily see that using $\{0, 1\}$ affects the negative samples differently: with $y = 0$, the loss is $e^0 = 1$ regardless of the model's prediction, so errors on negative samples are never penalized.
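A quick numerical sketch of this point (the margin values `F` here are made-up illustrative numbers, not output from any particular boosting library):

```python
import numpy as np

def exp_loss(y, F):
    # Exponential loss used by AdaBoost-style boosting: exp(-y * F(x))
    return np.exp(-y * F)

# Hypothetical margins F(x) from a model: positive means "predicted +1"
F = np.array([2.0, -1.5, 0.5])

# With {-1, +1} labels, correct confident predictions get small loss
# and wrong ones get large loss, as intended.
y_pm = np.array([1, -1, -1])
print(exp_loss(y_pm, F))

# With {0, 1} labels, every negative sample (y = 0) yields exp(0) = 1
# no matter what the model predicts -- the loss is blind to errors there.
y_01 = np.array([1, 0, 0])
print(exp_loss(y_01, F))
```

The first print shows losses that vary with how right or wrong each prediction is; the second shows the two negative samples stuck at a constant loss of 1, which is exactly why the $\{-1, +1\}$ convention matters for this loss.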

Since this can be fixed trivially (by remapping the labels internally), I think most publicly available implementations take care of it. However, if you're asking out of curiosity, or if you're implementing an algorithm yourself, you need to be aware of such details.

idnavid