I have been thinking about this recently.
I want to predict tomorrow's price for a certain stock, let's say Apple. For this I can use many different models: regression, random forests, RNNs, etc. Since we are dealing with a time-dependent variable, our test set must be the last n observations.
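For concreteness, here is a minimal sketch of that time-ordered split; the prices, dates, and column names are made up purely for illustration:

```python
import pandas as pd

# Hypothetical daily closing prices, indexed by business day (illustrative only).
df = pd.DataFrame(
    {"close": [170.1, 171.3, 169.8, 172.5, 173.0, 171.9, 174.2, 175.1]},
    index=pd.date_range("2024-01-02", periods=8, freq="B"),
)

# For a time-dependent target, hold out the LAST n observations:
# train on the past, test on the future, never shuffled.
n = 2
train, test = df.iloc[:-n], df.iloc[-n:]
print(train.index.max(), "<", test.index.min())  # train strictly precedes test
```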
As expected, this prediction is probably going to be very bad and difficult to use in practice.
But what happens if, instead of predicting the value of the Apple share tomorrow, we transform the problem into a classification problem and try to predict whether the stock goes up or down? This approach would allow us to treat our observations as independent, so we could shuffle them and draw various train and test sets without overfitting the data too much. In some academic papers this approach has been shown to be more accurate than continuous prediction.
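A minimal sketch of that transformation, assuming pandas and scikit-learn; the simulated prices and the lagged-return features are toy choices of mine, not anything from the papers:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

# Simulated price path, purely for illustration.
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.standard_normal(300).cumsum())
rets = close.pct_change()

# Binary target: 1 if tomorrow's close is higher than today's, else 0.
y = (rets.shift(-1) > 0).astype(int)

# Toy features: today's and the two previous days' returns.
X = pd.concat([rets.shift(k).rename(f"ret_lag{k}") for k in range(3)], axis=1)

# Drop the rows made invalid by the shifts (first rows and the last row).
valid = X.notna().all(axis=1) & rets.shift(-1).notna()
X, y = X[valid], y[valid]

# Shuffled K-fold, treating rows as independent, as proposed above.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)
print(f"mean accuracy: {scores.mean():.3f}")
```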
My questions are the following:
1) What makes this behave better than continuous prediction? Intuitively, it seems we are removing the trend component, making the target's mean independent of time. We are also reducing the variance and the set of possible values of the dependent variable.
2) If we continue with the second approach, how should we transform the explanatory variables so that the prediction makes sense? For example, if one of these variables were the S&P 500 index, should we take first differences (see the sketch after this list)? If we don't, I don't see how something with a trend would help us predict a binary variable.
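Here is a minimal sketch of what I mean by transforming the features in question 2; the numbers and column names are invented for illustration:

```python
import pandas as pd

# Hypothetical raw feature levels (values and names illustrative only).
features = pd.DataFrame({
    "sp500": [4700.0, 4725.5, 4710.2, 4740.8, 4755.1],
    "aapl_volume": [51e6, 48e6, 55e6, 50e6, 53e6],
})

# A trending level like the S&P 500 index carries its time trend with it;
# its first difference (or percent return) is the detrended signal that
# could plausibly co-move with an up/down label.
stationary = pd.DataFrame({
    "sp500_diff": features["sp500"].diff(),        # absolute day-over-day change
    "sp500_ret": features["sp500"].pct_change(),   # relative change
    "volume_ret": features["aapl_volume"].pct_change(),
}).dropna()
print(stationary)
```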