
I am working on an LSTM model to predict time series data (stock prices), and I would like an opinion on whether I should denoise my data before feeding it into the model.

According to Investopedia, denoising time series data before feeding it to your model can allow important patterns to stand out, but it may also lead to certain data points being ignored by emphasizing others. Hence, there is no single correct answer, as both approaches have pros and cons.

Some questions to shine some light on my doubts:

What is the difference between the wavelet transform and the differencing method for denoising data? Which method is the more rational choice? Can both be used together? To clarify what I want to achieve: an example of differencing is applied in this post.


1 Answer


The differencing method makes a time series stationary; it acts as a form of detrending (though not always). White noise is the canonical example of stationary data. The Dickey-Fuller test can help to confirm or reject stationarity. Stationarity (zero-mean, homoscedastic noise) is the main assumption of time-series analysis; it is achieved by extracting trend and seasonality from the data series (also weighted, e.g. for VaR), leaving the residuals as noise.
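
For illustration, here is a minimal sketch of first-order differencing followed by an augmented Dickey-Fuller check, assuming pandas and statsmodels; the synthetic `prices` random walk is purely illustrative:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
prices = pd.Series(100 + rng.normal(0, 1, 500).cumsum())  # synthetic random walk

diffed = prices.diff().dropna()  # first-order differencing removes the trend

for name, series in [("raw", prices), ("differenced", diffed)]:
    stat, pvalue = adfuller(series)[:2]
    # p-value < 0.05: reject the unit-root null, i.e. treat the series as stationary
    print(f"{name}: ADF statistic = {stat:.2f}, p-value = {pvalue:.4f}")
```

A small p-value on the differenced series but not the raw one is the usual evidence that one round of differencing was enough.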

As for smoothing/FILTERING (10.6.2 Wavelets): windowed Fourier transforms are used for the analysis of stationary data, while wavelet analysis can also be applied to non-stationary data (it is more difficult to analyse non-stationary data). Either way, both (FT and WT) give you the opportunity to consider the time domain and the frequency domain together (as a function of time and frequency).
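
A common wavelet-denoising recipe is to soft-threshold the detail coefficients and reconstruct. The sketch below assumes the PyWavelets package; the `db4` wavelet and the universal (VisuShrink) threshold are conventional defaults, not the only choice:

```python
import numpy as np
import pywt

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 512)
signal = np.sin(2 * np.pi * 5 * t) + 0.3 * rng.normal(size=t.size)  # noisy sine

coeffs = pywt.wavedec(signal, "db4", level=4)          # multilevel decomposition
sigma = np.median(np.abs(coeffs[-1])) / 0.6745         # noise estimate from finest level
threshold = sigma * np.sqrt(2 * np.log(signal.size))   # universal threshold

# keep the approximation coefficients, soft-threshold the detail coefficients
coeffs[1:] = [pywt.threshold(c, threshold, mode="soft") for c in coeffs[1:]]
denoised = pywt.waverec(coeffs, "db4")
```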

Whether to denoise or not depends on the goals of your time-series analysis. If the goal is to explore the speed and direction of the approximated data (for approximation, smoothing, or forecasting), then you should denoise. If the goal is to find outliers in the noise (some extreme values), then you don't need to denoise your data.

The main difficulty with all AR-family methods of time-series analysis is that they are heavily parametric (as are ARCH and GARCH). You can try machine-learning solutions (e.g. an LSTM-based autoencoder) or build a neural network (with LSTM or convolutional layers) as a non-parametric alternative, as with Bayesian Structural Time Series. (These need a lot of data - far more than 1000 points - and are used when you need to tackle the "curse of dimensionality".)
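
As a rough sketch of that autoencoder idea: an LSTM autoencoder trained to reconstruct windows of the series can act as a learned denoiser, since the bottleneck tends to keep structure and drop noise. All sizes below (window of 30, 32 units, the synthetic series) are illustrative assumptions, not a prescription:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

rng = np.random.default_rng(2)
t = np.arange(1030)
series = np.sin(0.1 * t) + 0.3 * rng.normal(size=t.size)  # noisy sine

window = 30
# overlapping windows, shape (1000, 30, 1)
x = np.stack([series[i:i + window] for i in range(1000)])[..., None].astype("float32")

model = models.Sequential([
    layers.Input(shape=(window, 1)),
    layers.LSTM(32),                          # encoder: compress the window
    layers.RepeatVector(window),              # repeat the code for each timestep
    layers.LSTM(32, return_sequences=True),   # decoder: unroll back in time
    layers.TimeDistributed(layers.Dense(1)),
])
model.compile(optimizer="adam", loss="mse")
model.fit(x, x, epochs=5, batch_size=32, verbose=0)  # learn to reconstruct the input

denoised_windows = model.predict(x, verbose=0)
```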

But the main idea is to increase the signal-to-noise ratio to get a pure signal, and the signal-to-noise problem is classically addressed by PCA. So, taking the first PC, you can then work with data that is largely free of noise (as an orthogonal projection), I suppose... removing noise again after that (with FT or WT) seems unsuitable to me.

(If I'm mistaken, I hope somebody will correct me.) As noted here: "It does not eliminate noise, but it can reduce noise".
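
One way to read the PCA suggestion (my interpretation, not necessarily the author's exact method) is SSA-style: embed the series in overlapping windows, keep only the first principal component, and project back. The window size below is an arbitrary illustrative choice:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
series = np.sin(np.linspace(0, 8 * np.pi, 600)) + 0.4 * rng.normal(size=600)

window = 20
# trajectory (Hankel) matrix of overlapping windows
X = np.stack([series[i:i + window] for i in range(series.size - window + 1)])

pca = PCA(n_components=1)
X_denoised = pca.inverse_transform(pca.fit_transform(X))  # rank-1 reconstruction

# average the overlapping reconstructed windows back into one series
denoised = np.zeros(series.size)
counts = np.zeros(series.size)
for i, row in enumerate(X_denoised):
    denoised[i:i + window] += row
    counts[i:i + window] += 1
denoised /= counts
```

Consistent with the quote above, this reduces rather than eliminates noise: anything correlated with the first component survives the projection.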

P.S. The simpler, the better... Often simple exponential smoothing gives a better forecast than ARIMA.
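
A quick sketch of that comparison, assuming statsmodels; the smoothing level and the ARIMA order are illustrative choices:

```python
import numpy as np
from statsmodels.tsa.holtwinters import SimpleExpSmoothing
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(4)
series = 100 + rng.normal(0, 1, 300).cumsum()  # synthetic random walk

ses = SimpleExpSmoothing(series).fit(smoothing_level=0.2, optimized=False)
arima = ARIMA(series, order=(1, 1, 1)).fit()

print("SES forecast:  ", ses.forecast(5))
print("ARIMA forecast:", arima.forecast(5))
```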
