In linear regression, $y = w_0 + w_1x_1 + w_2x_2 + \cdots$. The intercept term is called 'bias'. Why is it called bias? And how is this different from the 'bias-variance' trade-off?
-
It's not called that, you may be thinking of neural networks where they use "bias" and "weights". In linear regression it's called intercept or constant. Bias-variance trade-off is a property of the model as a whole with all of its components, how good the model is at generalization, while the intercept is just one component of the model (unless your linear model has only the intercept term, in which case it is the whole model) and may have only a small effect on the model as a whole. – user2974951 Dec 20 '22 at 07:39
-
It is. Check "Machine Learning: A Probabilistic Perspective" by Murphy, page 20. – cgo Dec 20 '22 at 07:43
-
I think Murphy's is a fairly rare and odd use of "bias". You should not take this as a general terminology. – Richard Hardy Dec 20 '22 at 07:51
-
One (of many) issues with machine learning is the habit of using existing statistical words to mean something else. If you follow through the book, you will find bias first used in terms of unbalanced coins, and then in the "expected error of estimator" sense for example in (6.33) - this is where bias-variance trade off comes in - and only in the sense you quote in the later half of the book for the constant or intercept term. – Henry Dec 20 '22 at 09:32
-
@RichardHardy That's been pretty much standard use of "bias" in the Machine Learning community. See e.g. Bishop, p. 138, or "Notation" in Cristianini and Shawe-Taylor. It developed historically; see my answer here: https://stats.stackexchange.com/a/511862/169343 – Igor F. Feb 23 '23 at 07:43
-
@Henry One (of many) issues with humans is their habit of believing that the way they do it is the right way, and every other way wrong. Bloody, "holy" wars have been fought for that reason. The word bias was not invented by statisticians and is several centuries older. Statisticians have no copyright on it. Its usage in Machine Learning developed independently from statistics and derives its meaning from electronics. After all, Machine Learning is about machines. – Igor F. Feb 23 '23 at 07:46
-
@IgorF., that is a cool piece of history! – Richard Hardy Feb 23 '23 at 10:09
1 Answer
Bias is a general term and can have different meanings.
In statistics the term is used to describe a systematic error in an estimation method: the difference between the estimator's expected value and the true value of the quantity being estimated. It relates to splitting up error into accuracy and precision; bias is the lack of accuracy, while variance is the lack of precision.
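As an illustration of this statistical sense (not from the original answer; a sketch using NumPy), the maximum-likelihood variance estimator, which divides by $n$, is biased downward, while dividing by $n-1$ gives an unbiased estimator:

```python
import numpy as np

# Sketch: compare the biased (divide by n) and unbiased (divide by n-1)
# sample-variance estimators on data with known true variance.
rng = np.random.default_rng(0)
true_var = 4.0
n = 5

biased, unbiased = [], []
for _ in range(100_000):
    sample = rng.normal(0.0, np.sqrt(true_var), size=n)
    biased.append(np.var(sample, ddof=0))    # divides by n
    unbiased.append(np.var(sample, ddof=1))  # divides by n - 1

print(np.mean(biased))    # systematically below 4.0, near (n-1)/n * 4 = 3.2
print(np.mean(unbiased))  # close to the true variance, 4.0
```

The gap between the biased estimator's average and the true value is exactly the systematic error that "bias" refers to in the bias-variance trade-off.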
Bias is also used in relation to activation functions in neural networks. In that case it is the intercept, as in your question: a term that changes how easily a particular node 'activates' in response to the input values.
Why is it called bias?
You could say that the intercept is a term that makes the node biased toward giving a low or high value in response to certain inputs.
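A minimal sketch of this second sense (not from the original answer; the weight and input values are made up for illustration): with a sigmoid activation, the bias term shifts the node's output up or down for the same weighted input, i.e. it biases how readily the node fires.

```python
import math

def sigmoid(z):
    # Standard logistic activation, maps any real z into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Same input x and weight w; only the bias b changes.
x, w = 1.0, 2.0

for b in (-4.0, 0.0, 4.0):
    activation = sigmoid(w * x + b)
    print(f"bias={b:+.1f} -> activation={activation:.3f}")
```

A large negative bias makes the node reluctant to activate even for this positive input, while a large positive bias makes it activate strongly; the weights alone cannot produce this shift, which is why the intercept is said to "bias" the node.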
And how is this different from the 'bias-variance' trade-off?
Bias is simply a word used in different places. Of the two uses of 'bias' listed above, the second (the intercept term) has nothing to do with the bias in the bias-variance trade-off, which relates to the first use (systematic estimation error).
-
And, as a further application of the second meaning, bias is also used in Support Vector Machines, with the same purpose. – Igor F. Feb 23 '23 at 07:48