6

For some estimator $\hat{\theta}$, we have the "bias-variance trade-off": $\operatorname{MSE}(\hat{\theta}) = \operatorname{bias}^2(\hat{\theta}) + \operatorname{var}(\hat{\theta}).$

When I think of a trade-off, I would expect that as the variance goes down, the squared bias would go up, and vice versa. But I'm not sure this is the case here, since $\operatorname{MSE}(\hat{\theta})$ is not fixed as $\hat{\theta}$ varies.

Example: Consider two silly estimators for the mean of a single normally distributed data point $x \sim N(\theta, 1)$: $\delta_0(x) = 0$ and $\delta_1(x) = 1$. Both of these have the same variance of $0$. Furthermore, $\operatorname{bias}^2(\delta_0) = \theta^2$ and $\operatorname{bias}^2(\delta_1) = (1-\theta)^2$. Therefore, both estimators have the same variance but different squared biases (assuming $\theta \ne \frac{1}{2}$).
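As a quick sanity check of the decomposition for these two estimators (plus the natural estimator $\delta(x) = x$, included for comparison), here is a small simulation; the value $\theta = 0.3$ is an arbitrary choice for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.3  # true mean; arbitrary choice for illustration
x = rng.normal(theta, 1, size=100_000)  # many draws of the single data point

# For delta_0: bias^2 = theta^2 = 0.09 and var = 0, so MSE = 0.09.
# For delta_1: bias^2 = (1 - theta)^2 = 0.49 and var = 0, so MSE = 0.49.
# For delta(x) = x: bias^2 = 0 and var = 1, so MSE = 1.
for name, est in [("delta_0", np.zeros_like(x)),
                  ("delta_1", np.ones_like(x)),
                  ("x itself", x)]:
    bias = est.mean() - theta
    var = est.var()
    mse = ((est - theta) ** 2).mean()
    print(f"{name}: bias^2={bias**2:.3f}  var={var:.3f}  "
          f"bias^2+var={bias**2 + var:.3f}  MSE={mse:.3f}")
```

In every row, $\operatorname{bias}^2 + \operatorname{var}$ matches the empirical MSE, yet nothing forces one term up when the other goes down.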

So in what sense is there a trade-off here?

theQman
  • 4
    The bias-variance tradeoff is a "rule of thumb" and not a law. No one claimed it was a law. – Jake Westfall Jun 16 '17 at 20:28
  • 1
    @JakeWestfall could you elaborate? – Mark White Jun 16 '17 at 20:33
  • 4
    @Jake Isn't the concept of a "minimum variance unbiased estimator" tantamount to such a law? Its very existence shows that if one can find any estimator (in the same family) with lower variance, it must increase the bias above zero, right? Qman: Your $\delta_i$ are not written like estimators. Estimators are functions of the data, not of the parameter. It isn't at all clear how you are computing the "bias" or even what you might mean by it. – whuber Jun 16 '17 at 21:07
  • Right. I think the $\delta$'s should be functions of $x$, not $\theta$. – theQman Jun 16 '17 at 21:30
  • 4
    It's possible for an estimator to be dominated by another in both bias and variance simultaneously. This doesn't contradict the fact that there is a trade-off between those that are on the "boundary" of the feasible set in bias-variance space (the ones that achieve the lowest variance for a fixed level of bias, like MVUE). – Chris Haug Jun 16 '17 at 23:09
  • "both estimators have the same bias" ... they don't. Do you mean to say "variance" there? – Glen_b Jun 17 '17 at 05:51

1 Answer

3

I share your skepticism that there is a true trade-off. A typical way to think about the bias-variance decomposition of MSE, such as in regularized regression, is that we accept a little bias in our estimator in exchange for a large reduction in variance. However, we do this to achieve a lower MSE, not to hold the MSE constant. So while I understand "trade" as describing the swap of an unbiased, high-variance estimator for a slightly biased, low-variance one, "trade-off" suggests to me that the MSE stays fixed while bias and variance move against each other. For that reason, I try not to call it a "trade-off" and prefer to speak of a bias-variance "decomposition".
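To make this concrete, here is a minimal numeric sketch (my own example, assuming $\theta = 1$ for illustration) using the family of shrinkage estimators $\delta_c(x) = cx$ for a single observation $x \sim N(\theta, 1)$, which behave like regularization in miniature: a little bias buys a large variance reduction and a strictly lower MSE.

```python
# Shrinkage estimators delta_c(x) = c * x for a single x ~ N(theta, 1):
# bias = (c - 1) * theta and variance = c^2, so
# MSE(c) = (1 - c)^2 * theta^2 + c^2.
theta = 1.0  # illustrative true value (an assumption, not from the thread)

def mse(c, theta=theta):
    bias_sq = (1 - c) ** 2 * theta ** 2
    var = c ** 2
    return bias_sq, var, bias_sq + var

for c in [0.0, 0.25, 0.5, 0.75, 1.0]:
    b2, v, m = mse(c)
    print(f"c={c:.2f}  bias^2={b2:.3f}  var={v:.3f}  MSE={m:.3f}")
# As c increases, bias^2 falls while var rises, but the MSE is minimized
# at c = theta^2 / (1 + theta^2) = 0.5, not at the unbiased choice c = 1.
```

The unbiased estimator ($c = 1$) has MSE $1$, while the biased $c = 0.5$ achieves MSE $0.5$: the "trade" lowers the total rather than keeping it constant, which is exactly the point above.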

Dave