I have been thinking about this recently.
I want to predict tomorrow's price for a certain stock, let's say Apple. For this I can use many different models: regression, random forests, RNNs, etc. Since we are dealing with a time-dependent variable, our test set must be the last n observations.
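For concreteness, here is a minimal sketch of that time-ordered split; the prices, dates, and column names are made up purely for illustration:

```python
import pandas as pd

# Hypothetical daily closing prices, indexed by business day (illustrative only).
df = pd.DataFrame(
    {"close": [170.1, 171.3, 169.8, 172.5, 173.0, 171.9, 174.2, 175.1]},
    index=pd.date_range("2024-01-02", periods=8, freq="B"),
)

# For a time-dependent target, hold out the LAST n observations:
# train on the past, test on the future, never shuffled.
n = 2
train, test = df.iloc[:-n], df.iloc[-n:]
print(train.index.max(), "<", test.index.min())  # train strictly precedes test
```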
As expected, this prediction is probably going to be very bad and difficult to use in practice.
But what happens if, instead of predicting the value of the Apple share tomorrow, we transform the problem into a classification problem and try to predict whether the stock goes up or down? This approach would allow us to treat our observations as independent, so we could shuffle them and draw various train and test sets without overfitting the data too much. In some academic papers this approach has been shown to be more accurate than continuous prediction.
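A minimal sketch of that transformation, assuming pandas and scikit-learn; the simulated prices and the lagged-return features are toy choices of mine, not anything from the papers:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

# Simulated price path, purely for illustration.
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.standard_normal(300).cumsum())
rets = close.pct_change()

# Binary target: 1 if tomorrow's close is higher than today's, else 0.
y = (rets.shift(-1) > 0).astype(int)

# Toy features: today's and the two previous days' returns.
X = pd.concat([rets.shift(k).rename(f"ret_lag{k}") for k in range(3)], axis=1)

# Drop the rows made invalid by the shifts (first rows and the last row).
valid = X.notna().all(axis=1) & rets.shift(-1).notna()
X, y = X[valid], y[valid]

# Shuffled K-fold, treating rows as independent, as proposed above.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)
print(f"mean accuracy: {scores.mean():.3f}")
```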
My questions are the following:
1) What makes this behave better than continuous prediction? Intuitively, it seems we are removing the trend component, making the target's mean independent of time. We are also reducing the variance and the set of possible values of the dependent variable.
2) If we continue with the second approach, how should we transform the explanatory variables so that the prediction makes sense? For example, if one of these variables were the S&P 500 index, should we take first differences (see the sketch after this list)? If we don't, I don't see how something with a trend would help us predict a binary variable.
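Here is a minimal sketch of what I mean by transforming the features in question 2; the numbers and column names are invented for illustration:

```python
import pandas as pd

# Hypothetical raw feature levels (values and names illustrative only).
features = pd.DataFrame({
    "sp500": [4700.0, 4725.5, 4710.2, 4740.8, 4755.1],
    "aapl_volume": [51e6, 48e6, 55e6, 50e6, 53e6],
})

# A trending level like the S&P 500 index carries its time trend with it;
# its first difference (or percent return) is the detrended signal that
# could plausibly co-move with an up/down label.
stationary = pd.DataFrame({
    "sp500_diff": features["sp500"].diff(),        # absolute day-over-day change
    "sp500_ret": features["sp500"].pct_change(),   # relative change
    "volume_ret": features["aapl_volume"].pct_change(),
}).dropna()
print(stationary)
```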