
I am working on a time series regression problem applied to finance. I am interested in predicting the price change of a stock, i.e., by how much the price will move.

I have framed this as a regression problem using MSE as my loss function, and the resulting MSE is good. However, I would like to improve the directional accuracy (will the price move up or down) by feeding this information into the loss function, since the sign of the move is not explicitly accounted for in MSE. I understand one way to do this is to switch to a classification problem, but classification would not give me any information about the amount of change.

Is there any way to customise the loss function so that it accounts for both MSE and direction? Or is there a loss function recommended for this type of problem?

Thanks
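
To make the idea concrete, below is a minimal numpy sketch of the kind of loss I have in mind: ordinary squared error, inflated by a fixed factor whenever the predicted sign disagrees with the realised sign. The function name and the penalty_weight parameter are purely illustrative, not from any library.

    import numpy as np

    def directional_mse(y_true, y_pred, penalty_weight=2.0):
        # Ordinary squared error, multiplied by (1 + penalty_weight)
        # whenever the predicted sign disagrees with the realised sign.
        squared_error = (y_true - y_pred) ** 2
        wrong_sign = np.sign(y_true) != np.sign(y_pred)
        weights = 1.0 + penalty_weight * wrong_sign
        return np.mean(weights * squared_error)

    # Toy check with y_true = 0.5: both predictions are off by 1,
    # but only the second one gets the direction wrong.
    print(directional_mse(np.array([0.5]), np.array([1.5])))   # 1.0
    print(directional_mse(np.array([0.5]), np.array([-0.5])))  # 3.0

One caveat with such a reweighted loss: it is no longer the plain MSE, so the fitted predictions are no longer simple conditional-mean estimates.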

  • Why does the sign matter to you? How exactly do you want the loss function to react to the sign? Could you describe in greater detail the underlying prediction problem you are trying to solve? Why does squared error not work here? – Tim Jun 27 '22 at 13:10
  • I am evaluating my models by looking at RMSE and also by looking at how many times I get the correct sign. Ideally I would like the loss function to penalise a prediction more heavily if the sign is wrong.

    Assume y_true=0.5, y_pred_1=1.5 and y_pred_2=-0.5. MSE would penalise y_pred_1 and y_pred_2 equally. I would like a loss function penalising y_pred_2 more since the sign is wrong.

    – prax1telis Jun 27 '22 at 13:24
  • Why does it matter? – Tim Jun 27 '22 at 13:49
  • @Tim: if I can predict that a stock will go up, I will buy it, and if I can predict it will go down, I will sell it... I do understand why that may well be more important than knowing by how much it will go up or (or!) down. – Stephan Kolassa Jun 27 '22 at 14:03
  • Sure, I wanted to make sure we're not missing any important context. – Tim Jun 27 '22 at 14:15
  • @StephanKolassa Until you start paying capital gains tax on your sale or have to pay a fee to do the trade and have that fee wipe out your small gain – Dave Jun 27 '22 at 14:38
  • Also, some of those up/down predictions are going to be wrong, and those wrong predictions don't have to be balanced out by right predictions. I am content to be a squillionaire who gets it right once out of ten times, if the time I get it right makes up for the misses. – Dave Jun 27 '22 at 15:01

1 Answer


One possibility would be to run two models on your time series:

  • Model 1 gives a numerical prediction of the target variable, e.g., the price change. You would evaluate this using MSE.
  • Model 2 is a classification algorithm and outputs the predicted probability of a positive price change. You would evaluate this predicted probability using the Brier score or the log loss. (The Brier score is of course nothing else than the MSE applied to a 0-1 realization, 0 if the price change is negative and 1 if it is positive.) You don't want to evaluate Model 2 predictions using accuracy, especially not in an inflationary environment - there will likely be more positive than negative price changes, and accuracy will bias your "positivity" predictions upwards.
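
A minimal sketch of this two-model setup, with simulated stand-ins for the feature matrix X and the price changes y (evaluated in-sample purely for brevity; in practice you would use a proper time-series split):

    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression
    from sklearn.metrics import mean_squared_error, brier_score_loss, log_loss

    # Simulated stand-ins: X holds lagged features, y the realised price changes.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))
    y = X @ rng.normal(size=5) * 0.01 + rng.normal(scale=0.02, size=500)

    # Model 1: numerical prediction of the price change, judged by MSE.
    reg = LinearRegression().fit(X, y)
    mse = mean_squared_error(y, reg.predict(X))

    # Model 2: predicted probability of a positive change, judged by a
    # proper scoring rule (Brier score or log loss), not by accuracy.
    up = (y > 0).astype(int)
    clf = LogisticRegression().fit(X, up)
    p_up = clf.predict_proba(X)[:, 1]
    print(mse, brier_score_loss(up, p_up), log_loss(up, p_up))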

If you then want a single KPI, you can combine the two error metrics. You would need to decide which aspect is more important to you and how much, and you would need to scale everything, since the probabilistic predictions are naturally constrained and scaled between 0 and 1, while the numerical predictions are unconstrained and scaled by the underlying asset.
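
Purely as an illustration, one such combination could divide the MSE by the variance of the target, so that it becomes a unit-free relative error on roughly the same scale as the Brier score, and then take a weighted average. The scaling and the weight below are arbitrary choices, not a standard recipe:

    def combined_score(mse, brier, target_variance, w_direction=0.5):
        # Divide the MSE by the variance of the target so it becomes a
        # unit-free relative error, comparable in scale to the Brier score,
        # then take a weighted average of the two.
        scaled_mse = mse / target_variance
        return (1 - w_direction) * scaled_mse + w_direction * brier

    # E.g. with the quantities from the sketch above (illustrative numbers):
    print(combined_score(mse=4e-4, brier=0.24, target_variance=5e-4))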

The alternative I would go for would be to use probabilistic predictions for the price change, i.e., density forecasts. You can evaluate density forecasts using proper scoring rules, and given a density forecast, you can easily derive an expectation point forecast as well as a prediction of the probability the outcome is positive.
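
As a minimal sketch, assume for illustration that the density forecast is Gaussian with mean mu and standard deviation sigma produced by whatever probabilistic model you use (the numbers below are placeholders):

    from scipy.stats import norm

    # Placeholder Gaussian density forecast for tomorrow's price change;
    # mu and sigma would come from the fitted probabilistic model.
    mu, sigma = 0.002, 0.015

    point_forecast = mu                               # expectation point forecast
    prob_up = 1 - norm.cdf(0.0, loc=mu, scale=sigma)  # P(price change > 0)

    # Once the outcome is observed, score the density with a proper scoring
    # rule, e.g. the log score (negative log predictive density).
    y_observed = -0.004
    log_score = -norm.logpdf(y_observed, loc=mu, scale=sigma)

    print(point_forecast, prob_up, log_score)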

In any case, in finance you are of course always going up against the Efficient Market Hypothesis. There are many highly competent people trying to do the exact thing you are looking at, with enormous resources - and to the degree they are successful, they are destroying the signal itself. Thus, be aware that if there is a solution to your problem, somebody will likely already have found it, applied it, and thus obliterated its usefulness.

Stephan Kolassa
  • Thanks, this is very informative. Would you say creating a loss function that optimises for Pearson correlation would make sense in this problem setting? – prax1telis Jun 27 '22 at 16:14
  • No, I don't think so. Your predictions could be systematically biased high and be perfectly correlated with the true values, but give a very expensively optimistic view of the future. (I have never understood why people would evaluate predictions using correlation measures.) – Stephan Kolassa Jun 27 '22 at 16:16
  • That's a good point. The highest-rated answer in the attached link explains why one might optimise for Pearson correlation. I have come across the problem mentioned - predicting a zero difference gives the best (lowest) MSE. This is another reason I considered moving away from MSE and using Pearson correlation instead. https://stats.stackexchange.com/questions/228373/use-pearsons-correlation-coefficient-as-optimization-objective-in-machine-learn – prax1telis Jun 28 '22 at 08:45
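
To make the bias point above concrete, here is a tiny illustration with made-up numbers: adding a constant bias leaves the Pearson correlation at exactly 1, while the squared error and the directional hit rate can both be poor.

    import numpy as np

    y_true = np.array([0.01, -0.02, 0.005, 0.03, -0.01])
    y_pred = y_true + 0.05   # systematically biased high

    print(np.corrcoef(y_true, y_pred)[0, 1])            # 1.0: perfect correlation
    print(np.mean((y_true - y_pred) ** 2))              # 0.0025: sizeable squared error
    print(np.mean(np.sign(y_true) == np.sign(y_pred)))  # 0.6: direction often wrong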