I applied two different machine learning models to my data, a Multiple Linear Regression and a Random Forest. The results are below:

[Image: results for the two models; not reproduced here]

Why are the MAE and RMSE higher for the model with the higher R-squared? Both models were evaluated on the same test set but with different input variables.

1 Answer

What you’ve described can’t happen in math, so there’s either a missing detail, a bug in your code causing you to input something other than what you intend to input, or something wrong with the Python function (the last of which I find unlikely).

I disagree with this Python implementation of $R^2$, but for the same data set, $R^2_{1,sklearn}>R^2_{2,sklearn}\iff MSE_2>MSE_1\iff R^2_{1,Dave}>R^2_{2,Dave}$. This is because $R^2$, either in the implementation you use or the way I prefer, is a strictly decreasing function of MSE.

$$R^2=1-\dfrac{MSE}{denominator}$$

(This denominator is some kind of sum of squares that is related to a model that predicts the same value every time. Your function and I disagree on what the one value should be, but you could pick $5$ or $17$ or $\pi$ as the denominator, and MSE and that definition of $R^2$ should move in opposite directions.)

If you evaluate the $R^2$ of two different models but on the same data, the denominator stays the same. Thus, increasing/decreasing $R^2$ corresponds to decreasing/increasing MSE.
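As a minimal sketch of this point, the snippet below computes MSE (with an $n$ denominator) and $R^2$ by hand for two sets of predictions on the same test set. The data and the names `y_test`, `pred_1`, and `pred_2` are made up for illustration. The `r2` helper uses the mean of the true test values in the denominator, which is the definition `sklearn.metrics.r2_score` uses.

```python
def mse(y_true, y_pred):
    """Mean squared error with an n denominator."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return 1 - ss_res / ss_tot


# Hypothetical test set and two hypothetical sets of predictions.
y_test = [3.0, 5.0, 7.0, 9.0, 11.0]
pred_1 = [3.5, 4.5, 7.5, 8.5, 11.5]  # every error has magnitude 0.5
pred_2 = [4.0, 4.0, 8.0, 8.0, 12.0]  # every error has magnitude 1.0

mse_1, mse_2 = mse(y_test, pred_1), mse(y_test, pred_2)
r2_1, r2_2 = r2(y_test, pred_1), r2(y_test, pred_2)

print("model 1: MSE =", mse_1, " R^2 =", r2_1)
print("model 2: MSE =", mse_2, " R^2 =", r2_2)

# Same test set, same denominator: lower MSE goes with higher R^2.
assert (mse_1 < mse_2) == (r2_1 > r2_2)
```

Running the same two helpers on your own test targets and both models' predictions is a quick way to check whether the reported numbers come from the same data.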

(If your implementation of an MSE calculation involves an $n-p$ denominator instead of $n$ or $n-1$, then the above does not apply, because two models with different numbers of parameters $p$ have their sums of squared errors divided by different denominators. This could be the kind of missing detail I mentioned in my first paragraph.)
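To illustrate how an $n-p$ denominator could break the ordering, here is a small sketch with made-up numbers. The sums of squared errors, the sample size `n`, the parameter counts `p_a` and `p_b`, and the total sum of squares `ss_tot` are all hypothetical, chosen only to show that the effect is possible.

```python
def mse_n(sse, n):
    """MSE with an n denominator."""
    return sse / n


def mse_n_minus_p(sse, n, p):
    """'MSE' with an n - p denominator (degrees-of-freedom adjusted)."""
    return sse / (n - p)


def r2(sse, ss_tot):
    """R^2 = 1 - SS_res / SS_tot."""
    return 1 - sse / ss_tot


# Hypothetical values: same test set of 10 points for both models.
n = 10
ss_tot = 100.0
sse_a, p_a = 20.0, 2  # model A: few parameters, slightly larger error
sse_b, p_b = 18.0, 6  # model B: many parameters, slightly smaller error

r2_a, r2_b = r2(sse_a, ss_tot), r2(sse_b, ss_tot)
plain_a, plain_b = mse_n(sse_a, n), mse_n(sse_b, n)
adj_a, adj_b = mse_n_minus_p(sse_a, n, p_a), mse_n_minus_p(sse_b, n, p_b)

print("R^2:         A =", r2_a, " B =", r2_b)
print("MSE (n):     A =", plain_a, " B =", plain_b)
print("MSE (n - p): A =", adj_a, " B =", adj_b)

# B has the higher R^2 and the lower n-denominator MSE,
# yet the higher n - p "MSE".
assert r2_b > r2_a and plain_b < plain_a and adj_b > adj_a
```

With the $n$ denominator, model B wins on both MSE and $R^2$; with the $n-p$ denominator, model B has the higher $R^2$ and the higher "MSE" at the same time.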

MAE is a totally different metric that need not increase/decrease with an increase/decrease in MSE or a decrease/increase in $R^2$. In this linked answer of mine, I give examples where an MSE increase/decrease is accompanied by an MAE decrease/increase.
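A quick sketch with made-up residuals (these are not the examples from the linked answer) shows how MAE and MSE can rank two models in opposite orders. Model A makes moderate errors everywhere; model B is perfect on most points but has one large miss.

```python
def mae(errors):
    """Mean absolute error from a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


def mse(errors):
    """Mean squared error from a list of residuals."""
    return sum(e ** 2 for e in errors) / len(errors)


# Hypothetical residuals on the same four test points.
errors_a = [2.0, 2.0, 2.0, 2.0]  # consistently off by 2
errors_b = [0.0, 0.0, 0.0, 5.0]  # exact three times, off by 5 once

mae_a, mse_a = mae(errors_a), mse(errors_a)
mae_b, mse_b = mae(errors_b), mse(errors_b)

print("model A: MAE =", mae_a, " MSE =", mse_a)
print("model B: MAE =", mae_b, " MSE =", mse_b)

# B is better by MAE but worse by MSE (and therefore by R^2 on the same data).
assert mae_b < mae_a and mse_b > mse_a
```

Squaring weights the single large miss heavily, so MSE penalizes model B while MAE favors it.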

Dave
  • Thank you so much for the answer. Can you explain a little more about the MSE calculation involving $n-p$ instead of $n$ or $n-1$? What does the $p$ mean in this case, and can you show that formula? – Alice Silva Oct 15 '22 at 21:55
  • @AliceSilva the $p$ refers to the number of parameters in a linear regression model, and it can be thought of as a way to penalize using a large number of parameters. For further details, you might consider posting a new question so others can benefit from the answer, rather than burying important material in the comments of a fairly unrelated question. – Dave Oct 18 '22 at 01:02