
Below are my questions about R-squared for a real-valued label under a nonlinear regression learner. This may be a large problem; if there is no easy answer, could you give me some references?

Firstly, for a real-valued label, is there any good metric besides R-squared to evaluate the quality of a fit? I know that a small MSE, MAE, etc. usually means a good fit, but these values are not as intuitive as a ratio like R-squared (how small is small enough?).

Secondly, does R-squared really make sense for a nonlinear regression learner? For linear regression, the sum-of-squares decomposition holds, so R-squared always lies between 0 and 1, and the closer it gets to 1, the better the fit. Other learners cannot guarantee this decomposition, so R-squared can even become arbitrarily negative. What we really want is a small residual error, but does it make sense to compare it with the total error? One way to understand this is that the total error (the sample variance) is a model-independent quantity, so we can use it as a benchmark?
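To make the second question concrete, here is a minimal sketch (using NumPy, with made-up data) of $R^2$ computed as $1 - SS_{res}/SS_{tot}$. Nothing in the formula confines it to $[0,1]$ for an arbitrary learner: it drops below zero as soon as a model's residual error exceeds the total error around the sample mean.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=5.0, scale=1.0, size=100)  # real-valued labels (illustrative)

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

# Predicting the sample mean everywhere gives exactly 0:
# the residual error equals the total error.
mean_score = r_squared(y, np.full_like(y, y.mean()))

# A model whose predictions are far from the data scores negative,
# i.e. it is worse than just predicting the mean.
bad_score = r_squared(y, np.full_like(y, -5.0))
```

Here the "benchmark" role of the total error is explicit: $R^2 > 0$ simply means the model's squared error is smaller than that of the constant mean predictor.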

user6703592
  • I address your second question here. Regarding the first question, what constitutes a good $R^2?$ – Dave Jun 04 '22 at 11:51
  • @Dave Here we really want a small residual error, but does it make sense to compare it with the total error? Is there any value that intuitively describes how small a residual error should be? Or could I understand the total error (the sample variance) as a model-independent quantity, so that we can use it as a benchmark? – user6703592 Jun 04 '22 at 11:53
  • What I meant was more along the lines of judging the $R^2$ value. There are fields where $R^2=0.5$ is unbelievable, and I can imagine problems where $R^2=0.9$ is rather pedestrian. – Dave Jun 04 '22 at 13:56
  • Have you figured out what constitutes a good $R^2$ value? (I have yet to do so, and for this reason, I’m not a huge fan of $R^2$. I fear it can mislead people into thinking in terms of grades in school.) – Dave Jun 13 '22 at 06:48

1 Answer


$R^2$ can get us thinking in terms of letter grades in school, where $R^2=0.9$ is the kind of score that seems like an $\text{A}$-grade that makes us happy, while $R^2=0.4$ is the kind of score that seems like an $\text{F}$-grade that makes us sad. Some people see this as an upside to $R^2$. I do not agree. There are situations where $R^2=0.4$ is outstanding (I have seen papers in top journals where $R^2 < 0.10$), and situations where $R^2=0.9$ is rather pedestrian. As mkt remarks here:

There is no context-free way to decide whether model metrics such as $R^2$ are good or not.

What $R^2$ does do, at least depending on how you perform the calculation (I have a strong opinion), is compare the performance of your model to the performance of a "must-beat" model. Failing to beat that performance manifests in $R^2\le 0$, with $R^2<0$ indicating performance even worse than the "must-beat" level. In that sense, $R^2$ can be useful.
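A small sketch of this "must-beat" framing, under the assumption that the benchmark is the training-set mean and using a cubic polynomial merely as a stand-in nonlinear learner:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

# Alternate points into train and test sets.
x_tr, x_te = x[::2], x[1::2]
y_tr, y_te = y[::2], y[1::2]

def r2(y_true, y_pred, y_ref):
    """R^2 relative to a fixed reference prediction y_ref (the 'must-beat' level)."""
    return 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_ref) ** 2)

# The "must-beat" baseline: predict the training mean everywhere.
baseline = np.full_like(y_te, y_tr.mean())
baseline_r2 = r2(y_te, baseline, y_tr.mean())  # exactly 0 by construction

# A stand-in nonlinear learner: a cubic polynomial fit on the training split.
coefs = np.polyfit(x_tr, y_tr, deg=3)
model_r2 = r2(y_te, np.polyval(coefs, x_te), y_tr.mean())  # positive iff the model beats the mean
```

With this convention, $R^2\le 0$ on held-out data is an unambiguous signal: the fitted model is no better than ignoring the features entirely.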

Metrics like $(R)MSE$ and $MAE$ are helpful, too, since they give a sense of how large typical error magnitudes are. If you look at those values and see that about half of your errors have magnitudes of over a centimeter, you might know that you need to be within a millimeter almost every time in order to have a model worth anything (maybe a competitor has such performance, or maybe you just know the science and requirements of your problem). Likewise, you might be elated with that centimeter if you know that the tolerance is three centimeters.
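For instance, a tiny sketch (with purely illustrative residuals, taken to be in centimeters) of how RMSE and MAE summarize typical error magnitudes in the label's own units:

```python
import numpy as np

# Hypothetical residuals (prediction - truth) in centimeters.
errors = np.array([0.2, -1.4, 0.7, 2.1, -0.3])

mse = np.mean(errors ** 2)          # squared units: hard to read off directly
rmse = np.sqrt(mse)                 # back in centimeters
mae = np.mean(np.abs(errors))       # typical absolute error, also in centimeters
med_ae = np.median(np.abs(errors))  # half the errors exceed this magnitude
```

RMSE is never smaller than MAE, and the gap between them hints at how heavy-tailed the errors are; the median absolute error is the figure behind "about half of your errors have magnitudes over a centimeter."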

As is discussed in the linked answer, however, no performance metric tells the whole story unless you give it context.

Dave