
Basically I have a case where under-predictions are worse than over-predictions. Is there a way to penalize the linear regression model during training according to some predefined ratio?

E.g. I want to define that, for an actual value 10, predicting 9 and predicting 12 has equivalent penalty. (And not 9 and 11 as per default).

I guess my question is, is this something that is acceptable to do in the first place, and how would one best go about it?

PS. Maybe there is a reasonable approximation to solving this without meddling with the least squares function. E.g. I've tried increasing the y (output) values by 1-2%, which moves me in the right direction, though it requires incrementally testing for the best % increase (not the worst thing..).

Martin
  • If you do training as in training and validation, you have to specify the loss function. Just specify one that suits you. Is that what you mean by to penalize the linear regression model? – Richard Hardy Mar 20 '15 at 12:21
  • Sorry my terminology is so dodgy. Yea so basically I would like to add a factor to the loss function during least squares data fitting. Unfortunately in scikit-learn doing this means having to meddle with the underlying scipy and then C code.. so I'm wondering if there's some alternative solution, but I guess not. – Martin Mar 20 '15 at 12:27
  • OK, that's not what I was thinking of. That means you would not choose an OLS-fitted model based on a validation set using a skewed loss function, but would rather want a model that uses a skewed loss function for the fitting itself. I have no idea how to implement that. – Richard Hardy Mar 20 '15 at 12:38
  • Expectiles might be a relevant keyword. – Richard Hardy Mar 06 '17 at 14:30

1 Answer


My first thought is to do something like the quantile regression loss.

$$ l_{\tau}(y_i, \hat y_i) = \begin{cases} \tau\vert y_i - \hat y_i\vert, & y_i - \hat y_i \ge 0 \\ (1 - \tau)\vert y_i - \hat y_i\vert, & y_i - \hat y_i < 0 \end{cases} $$

Then we add up each individual $l$ to get a loss $L$ for the whole model. $$ L_{\tau}(y, \hat y) = \sum_{i=1}^n l_{\tau}(y_i, \hat y_i) $$

You can solve for the $\tau$ that balances the costs of high misses and low misses. I will demonstrate using your example, where missing high by $2$ and missing low by $1$ should give equivalent penalties.

$$ (1-\tau)\vert 10 - 12\vert = \tau\vert 10 - 9\vert\\ (1-\tau)\vert -2\vert = \tau\vert1\vert\\ (1-\tau)\times 2 = \tau\times 1\\ 2-2\tau = \tau\\ 2 = 3\tau\\ \implies\\ \tau = 2/3 $$

Note that $\tau = 2/3 > 1/2$, which makes sense: under-predictions get the larger weight, matching your requirement that they are worse than over-predictions.
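Since a custom loss rules out scikit-learn's built-in `LinearRegression`, one option is to minimize the pinball loss directly with `scipy.optimize.minimize`. Below is a minimal sketch on synthetic data (the function name `pinball_loss` and the data-generating setup are my own, not from any library); it fits an intercept and slope with $\tau = 2/3$:

```python
import numpy as np
from scipy.optimize import minimize

def pinball_loss(params, X, y, tau):
    """Quantile (pinball) loss for a linear model: intercept + X @ coefs."""
    intercept, coefs = params[0], params[1:]
    resid = y - (intercept + X @ coefs)
    # Under-predictions (resid >= 0) weighted by tau,
    # over-predictions (resid < 0) weighted by (1 - tau).
    return np.sum(np.where(resid >= 0, tau * resid, (tau - 1) * resid))

# Synthetic data: y = 3 + 2x + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 3.0 + 2.0 * X[:, 0] + rng.normal(size=200)

# tau = 2/3 penalizes under-predictions twice as heavily as over-predictions
res = minimize(pinball_loss, x0=np.zeros(2), args=(X, y, 2 / 3),
               method="Nelder-Mead")
print(res.x)  # [intercept, slope]
```

With $\tau = 2/3$ the fitted intercept lands above the OLS value of $3$ (near the $2/3$ quantile of the noise), while the slope stays close to $2$ — exactly the upward shift you were approximating by inflating $y$. For a more turnkey route, `statsmodels.regression.quantile_regression.QuantReg` fits the same objective.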

Dave