
I understand that, in general, MSE, RMSE, and MAE measure the average distance between actual and predicted values, and that the lower the MSE, RMSE, and MAE, the better the model fits the dataset.
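To make those definitions concrete, here is a minimal sketch of the three metrics using only the standard library (the example actual/predicted values are made up for illustration):

```python
import math

def mse(actual, predicted):
    # Mean of the squared differences
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    # Square root of MSE, expressed back in the units of the target
    return math.sqrt(mse(actual, predicted))

def mae(actual, predicted):
    # Mean of the absolute differences
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [70, 80, 90]
predicted = [72, 78, 95]
print(mse(actual, predicted))   # (4 + 4 + 25) / 3 = 11.0
print(rmse(actual, predicted))  # sqrt(11.0) ≈ 3.32
print(mae(actual, predicted))   # (2 + 2 + 5) / 3 = 3.0
```

Note that RMSE and MAE are in the same units as the exam score, while MSE is in squared units.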

I am trying to understand these concepts more thoroughly by interpreting the following result. I used model 1 and model 2 to predict each student group's exam score based on several independent variables. Note that student1, student2, student3, and so on are groups of students, not single students.

Based on the result, model 2 in general has higher MSE, RMSE, and MAE (meaning model 1 fits the dataset better than model 2). However, how do I precisely interpret the result for each of these metrics, and the overall fitting performance of model 1 versus model 2?

[Image: table of MSE, RMSE, and MAE values for each student group under model 1 and model 2]

1 Answer


I think you already understand what there is to understand. These statistics measure the average precision of prediction models and are used to compare the accuracy of different models. You already noted that:

Based on the result, model 2, in general, have higher MSE, RMSE and MAE(meaning model1 fits the model better than model 2).

You noted that the accuracy of model 1 is, on average, higher than the accuracy of model 2. I do not think you can interpret anything further.

The only additional observation I can think of is the following: for the groups "student5" and "student9", you have $MAE_1<MAE_2$ but $RMSE_1>RMSE_2$. This suggests that in these groups, model 1 produces some "significant" (do not interpret this word in a statistical sense) outlier errors, which inflate the MSE because the errors are squared.

FP0
  • Thanks. This observation is extremely helpful. That means the absolute difference, MAE, in model 1 is smaller than in model 2; however, there are outliers in the scores of these 2 groups, so when we calculate the square root of the average squared differences, model 1 has a larger RMSE. Can I understand this as relating to a potential data quality issue, and that this observation has nothing to do with concluding that model 1 fits the dataset better? – user032020 Jul 24 '22 at 01:37
  • You're welcome. Yes, this is different from the general conclusion that model 1 has a better fit than model 2. An outlier observation or prediction error could be related to some data quality issue, but it could also simply be an outlier in the predictions. – FP0 Jul 24 '22 at 10:11