When the error term is assumed to be Gaussian (note that this is an assumption about the errors, not about the marginal distribution of all $y$ values; it is a common misconception that the latter matters, and it pretty much never does), minimizing mean squared error is equivalent to maximum likelihood estimation. However, mean squared error remains a useful evaluation criterion even when we are not willing to make such an assumption. For instance, minimizing mean squared error elicits conditional means. Mean squared error also penalizes badly wrong predictions quite harshly: missing by $1$ incurs a penalty of $1$, but doubling the miss to $2$ incurs a penalty of $4$ rather than $2$. That may or may not be desirable behavior, depending on your problem.
In summary, no, considering mean squared error should not be contingent on a Gaussian assumption.
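To make both points concrete, here is a minimal Python sketch (the data are made up purely for illustration):

```python
import numpy as np

# Quadratic penalty: a miss of 2 costs four times a miss of 1,
# while an absolute-error penalty would only cost twice as much.
misses = np.array([1.0, 2.0])
print("squared-error penalties: ", misses ** 2)      # [1. 4.]
print("absolute-error penalties:", np.abs(misses))   # [1. 2.]

# No Gaussian assumption anywhere: for deliberately skewed data,
# the constant prediction that minimizes MSE is still the sample mean.
rng = np.random.default_rng(0)
y = rng.exponential(scale=2.0, size=5_000)           # non-Gaussian outcomes
candidates = np.linspace(y.min(), y.max(), 501)      # candidate constant predictions
mse = np.array([np.mean((y - c) ** 2) for c in candidates])
print("MSE-minimizing constant:", candidates[np.argmin(mse)])
print("sample mean:            ", y.mean())
```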
The $R^2$, depending on how you do the calculation, is just a function of the mean squared error.
$$
R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat y_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar y \right)^2}
= 1 - \frac{N \times \text{MSE}}{\sum_{i=1}^{N} \left( y_i - \bar y \right)^2}
$$
With that in mind, such an $R^2$ calculation does not require a Gaussian assumption, either. I discuss here how other $R^2$ calculations present problems.
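As a quick numerical check of the identity above, here is a small sketch (the data and "predictions" are invented for illustration; sklearn's `r2_score` is used only as an independent reference computation):

```python
import numpy as np
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
y = rng.uniform(0, 10, size=200)            # outcomes; nothing Gaussian required
y_hat = y + rng.uniform(-2, 2, size=200)    # stand-in model predictions

mse = np.mean((y - y_hat) ** 2)
r2_from_mse = 1 - (len(y) * mse) / np.sum((y - np.mean(y)) ** 2)

print(r2_from_mse)          # R^2 computed from the MSE, as in the formula above
print(r2_score(y, y_hat))   # the usual calculation; the two agree
```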
Finally, to determine how good your model's performance is, you have to give it some context. The $R^2$ calculation above means that $R^2$ can be seen as a comparison of how your model performs against how a reasonable benchmark or baseline model performs. However, there is more to the story than just a comparison to a naïve model that always predicts $\bar y$ regardless of the feature values; I like that linked answer by mkt and consider it worthwhile reading.
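To make the benchmark framing explicit, the sketch below (same spirit, made-up data) writes $R^2$ as one minus the ratio of your model's MSE to the MSE of the naïve model that always predicts $\bar y$, so a positive $R^2$ simply means beating that baseline:

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.uniform(0, 10, size=200)
y_hat = y + rng.uniform(-2, 2, size=200)    # stand-in model predictions

mse_model = np.mean((y - y_hat) ** 2)
mse_naive = np.mean((y - np.mean(y)) ** 2)  # naive model: always predict the mean of y

r2 = 1 - mse_model / mse_naive              # algebraically the same R^2 as the formula above
print(r2)
```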