
This was a homework problem where I was asked to find an explicit expression that minimises the cost function.

I found the solution to be:

$\hat{\theta} = (X^TX + \lambda I)^{-1}X^Ty$

The problem then asks:

If $\lambda \le 0$, is this solution still equivalent to minimising the cost function?

I can't figure out how to answer this. My understanding is that it would still minimise the cost function but could potentially make the covariates unstable. Is that right?

aroma
    Necessarily $\lambda$ must exceed the negative of the smallest eigenvalue of $X^\prime X,$ for otherwise the objective function is unbounded. Exploiting the SVD of $X$ provides an easy way to see this. – whuber Mar 07 '24 at 13:36

1 Answer


When $\lambda < 0$ the cost function may not have a minimum.

As an example, consider the univariate case where $X$ has one column $x$. Then the cost function is $$f(\theta) = \sum_{i=1}^n (y_i - \theta x_i)^2 + \lambda \theta^2.$$ This is a quadratic function in $\theta$ which has leading coefficient $\lambda + \sum_{i=1}^n x_i^2$.
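Expanding the square makes that coefficient explicit (a short worked step, using only the symbols already defined above):

$$f(\theta) = \Big(\lambda + \sum_{i=1}^n x_i^2\Big)\theta^2 - 2\Big(\sum_{i=1}^n x_i y_i\Big)\theta + \sum_{i=1}^n y_i^2.$$

The sign of the coefficient of $\theta^2$ therefore decides whether $f$ is convex, linear, or concave.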

  • If $\lambda = -\sum_{i=1}^n x_i^2$, then $f$ is actually linear in $\theta$ and has no minimum (provided $\sum_{i=1}^n x_i y_i \ne 0$).
  • If $\lambda < -\sum_{i=1}^n x_i^2$, then $f$ is a concave quadratic, and the critical point $(x^\top x + \lambda)^{-1} x^\top y$ is actually a maximizer of $f$.

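On the other hand, when $\lambda > 0$, $f$ is a strictly convex quadratic and always has a unique minimizer given by your $\hat{\theta}$. More generally, the same holds whenever $\lambda > -\sum_{i=1}^n x_i^2$, because the leading coefficient is then still positive.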
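If it helps, the univariate cases can be checked numerically. The following is only a minimal sketch with NumPy: the data `x`, `y`, the helper names (`ridge_cost`, `critical_point`, `classify`) and the step size are made up for illustration and are not part of the original problem. It evaluates $f$ at the critical point and at one point on either side of it.

```python
import numpy as np

def ridge_cost(theta, x, y, lam):
    """Univariate cost f(theta) = sum_i (y_i - theta * x_i)^2 + lam * theta^2."""
    return np.sum((y - theta * x) ** 2) + lam * theta ** 2

def critical_point(x, y, lam):
    """Stationary point (x'x + lam)^{-1} x'y; undefined when lam == -x'x."""
    return (x @ y) / (x @ x + lam)

def classify(x, y, lam, step=0.5):
    """Compare f at the critical point with f at one point on each side."""
    t = critical_point(x, y, lam)
    f0 = ridge_cost(t, x, y, lam)
    f_left = ridge_cost(t - step, x, y, lam)
    f_right = ridge_cost(t + step, x, y, lam)
    if f_left > f0 and f_right > f0:
        return "minimum"
    if f_left < f0 and f_right < f0:
        return "maximum"
    return "neither"

# Hypothetical toy data: here x'x = 14 and x'y = 29.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 3.0, 7.0])

for lam in (1.0, -5.0, -20.0):
    print(lam, critical_point(x, y, lam), classify(x, y, lam))
```

With this toy data, $\lambda = 1$ and $\lambda = -5$ both exceed $-x^\top x = -14$, so the critical point is a minimum, whereas $\lambda = -20$ falls below it and the same formula returns a maximum.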

angryavian