
In classification problems (e.g., click-through-rate prediction, where the task is to predict whether a user will click an item or not), we usually use +1 for the positive label and 0 for the negative label. This is naturally convenient, but is it a necessity? What is the difference if we use -1 for the negative label, or, more generally, can we use +100 for the positive label and -100 for the negative label?

– dingx
  • It is not necessary, but it can be convenient (as with dummy variables and indicator variables). Meanwhile, if you use an intermediate number between the minimum and the maximum to indicate "possibly", then there is a natural use of the range $[0,1]$ to indicate probability – Henry Dec 10 '22 at 17:52
  • +1 to what @Dave said - the target encoding depends on the way in which the learning algorithm is formulated, and doesn't make any real difference (provided you use the appropriate encoding for your algorithm). One slight difference is that for e.g. logistic regression models you can use targets of $\epsilon$ and $1 - \epsilon$, which has a mild regularising effect (or equivalently could be viewed as expressing some uncertainty about the labels). This was sometimes used as a heuristic for neural networks back in the day (1980s/90s) so that the network could converge completely to a local minimum. – Dikran Marsupial Dec 24 '23 at 13:52
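A minimal numpy sketch of the $\epsilon$ / $1 - \epsilon$ target trick mentioned in the comment above; the value `eps=0.05` and the probabilities are purely illustrative:

```python
import numpy as np

def smoothed_targets(y, eps=0.05):
    """Map hard {0, 1} labels to {eps, 1 - eps} (label smoothing)."""
    return y * (1 - 2 * eps) + eps

def cross_entropy(targets, p):
    """Binary cross entropy against (possibly smoothed) targets."""
    return -np.mean(targets * np.log(p) + (1 - targets) * np.log(1 - p))

y = np.array([0, 1, 1, 0])
p = np.array([0.1, 0.9, 0.8, 0.2])
print(cross_entropy(y, p))                    # hard {0, 1} targets
print(cross_entropy(smoothed_targets(y), p))  # smoothed targets
```

With smoothed targets the per-example loss is minimized at $p = \epsilon$ or $p = 1 - \epsilon$ rather than exactly 0 or 1, so a sigmoid-output model can reach the optimum with finite weights, which is the convergence point the comment alludes to.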

1 Answer


As also stated in the comments, it is not a necessity; it is a convenience when formulating your problem. It's also useful to think about how you're going to penalize misclassifications. $\{0,1\}$ labelling naturally allows you to use cross entropy, which is typically a good choice of loss function for this problem. For $\{-1,1\}$, you need to transform the output a bit or use a different loss, but intuitively this is no better than the $\{0,1\}$ encoding. $\{-100,100\}$ may look equivalent to $\{-1,1\}$, but a couple of misclassified points can then produce huge loss values and gradients, making your learning much harder due to numerical issues.
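For instance, a $\{-1,1\}$ encoding pairs naturally with the logistic loss $\log(1 + e^{-yf})$, which is just the $\{0,1\}$ cross entropy reparametrized. A minimal numpy sketch (the scores and labels are purely illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy_01(y01, f):
    """Binary cross entropy with {0, 1} labels and raw scores (logits) f."""
    p = sigmoid(f)
    return -np.mean(y01 * np.log(p) + (1 - y01) * np.log(1 - p))

def logistic_loss_pm1(ypm, f):
    """Logistic loss with {-1, +1} labels: mean of log(1 + exp(-y * f))."""
    return np.mean(np.log1p(np.exp(-ypm * f)))

f = np.array([2.0, -1.0, 0.5, -3.0])  # raw model scores (logits)
y01 = np.array([1, 0, 1, 0])          # {0, 1} encoding
ypm = 2 * y01 - 1                     # the same labels as {-1, +1}

print(cross_entropy_01(y01, f))   # identical values: the encoding is just
print(logistic_loss_pm1(ypm, f))  # a reparametrization of the same loss
```

Both calls print the same number, which is the sense in which the choice of encoding "doesn't make any real difference" as long as the loss is written for the encoding you chose.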

– gunes