I have a dataset with approximately 2500 observations and 50 variables. The response variable is numerical, so my objective is to build a regression model. I have built one penalized linear regression model and one xgboost regressor model.
The linear model obtained an MSE of 7.31 and an R2 score of 0.62. The xgboost model obtained an MSE of 8.19 and an R2 of 0.66.
So one model has the smaller MSE but the other has the larger R2. Which one is better? I have read that some metrics are called "proper", meaning that it is mathematically proven that the better the metric, the better the model. I was wondering whether either the MSE or R2 is proper.
[...] r2_score, which receives a vector of true response values and a vector of predictions and, from what I have seen, computes the value as 1 - (residual sum of squares / total sum of squares). – Alberto Perez Martinez Jun 12 '23 at 16:50
[...] r2_score function for each model? – Dave Mar 28 '24 at 11:07
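To make the formula from the comment concrete, here is a minimal pure-Python sketch of MSE and of R2 computed as 1 - (residual sum of squares / total sum of squares). The function names `mse` and `r2` and the small arrays `y_true`, `pred_a`, and `pred_b` are made up for illustration; they are not the data or the code from the question.

```python
def mse(y_true, y_pred):
    """Mean squared error: residual sum of squares divided by n."""
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return rss / len(y_true)


def r2(y_true, y_pred):
    """R2 as 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    tss = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - rss / tss


# Hypothetical responses and predictions from two hypothetical models,
# both scored against the SAME true values.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
pred_a = [2.5, 5.5, 7.0, 8.0, 11.5]
pred_b = [4.0, 4.0, 8.0, 10.0, 9.0]

for name, pred in [("model A", pred_a), ("model B", pred_b)]:
    print(name, "MSE:", mse(y_true, pred), "R2:", r2(y_true, pred))
```

Under this definition, when both models are scored against the same true values, the total sum of squares is identical for both, so R2 equals 1 - n * MSE / TSS and the model with the smaller MSE necessarily has the larger R2. If the two rankings disagree, one possible explanation is that the metrics were computed on different observations (for example, different splits), though that is only a guess without seeing the evaluation code.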