This is a very basic question about RNNs (LSTMs). Here is the basic structure of an RNN:
For a sample $(X^i,Y^i)$:
input: $X^i = (x_1,\cdots,x_T)$; output: $Y^i = (y_1,\cdots,y_T)$.
$$z_t = Uh_{t-1} + W x_t + b,$$
$$h_t = f(z_t) = f(Uh_{t-1} + W x_t + b),$$
$$y_t = g(h_t).$$
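To make the recurrence concrete, here is a minimal NumPy sketch of the forward pass for a single sequence. Several things in it are assumptions and not part of the definition above: $f$ is taken to be `tanh`, $g$ is taken to be a linear readout with hypothetical parameters `V` and `c`, the initial state $h_0$ is set to zero, and the weights are random placeholders rather than trained values.

```python
import numpy as np

def rnn_forward(xs, U, W, b, V, c, f=np.tanh):
    """Run the recurrence over one sequence xs of shape (T, input_dim)."""
    h = np.zeros(U.shape[0])           # assumption: h_0 = 0
    hs, ys = [], []
    for x_t in xs:
        z_t = U @ h + W @ x_t + b      # z_t = U h_{t-1} + W x_t + b
        h = f(z_t)                     # h_t = f(z_t)
        hs.append(h)
        ys.append(V @ h + c)           # y_t = g(h_t), with g assumed linear
    return np.array(hs), np.array(ys)

# Illustrative sizes and random (untrained) weights.
rng = np.random.default_rng(0)
T, input_dim, hidden_dim, output_dim = 5, 1, 4, 1
U = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
W = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
b = np.zeros(hidden_dim)
V = rng.normal(scale=0.5, size=(output_dim, hidden_dim))
c = np.zeros(output_dim)

xs = rng.normal(size=(T, input_dim))
hs, ys = rnn_forward(xs, U, W, b, V, c)
print(hs.shape, ys.shape)  # one hidden state and one output per time step
```

The same parameters $U$, $W$, $b$ are reused at every step; only the hidden state changes.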
Suppose we want to use an RNN to predict a time series $(x_1,\cdots,x_n)$ in the same way as ARMA, i.e. with
input: $X^i = (x_i,\cdots,x_{i+T-1})$; output: $Y^i = x_{i+T}$.
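As an illustration of this input/output setup, the series can be cut into overlapping windows of length $T$, each paired with the value that follows it. The helper name `make_windows` is hypothetical, and the code uses 0-based indexing, so row `i` here corresponds to sample $i+1$ in the notation above.

```python
import numpy as np

def make_windows(series, T):
    """Build (X, Y) with X[i] = series[i : i+T] and Y[i] = series[i+T]."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    if n <= T:
        raise ValueError("series must be longer than the window length T")
    # n - T windows have a following value available as a target.
    X = np.stack([series[i:i + T] for i in range(n - T)])
    Y = series[T:]
    return X, Y

series = np.arange(10.0)           # toy series 0, 1, ..., 9
X, Y = make_windows(series, T=3)
print(X.shape, Y.shape)            # number of samples is n - T
print(X[0], Y[0])                  # first window and its target
```

Each row of `X` would then be fed to the network one element per time step, with the matching entry of `Y` as the single target for that window.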
How, then, should an RNN (LSTM) be applied to time series prediction? Is it just a matter of changing the output to $$y_T = g(h_1,\cdots,h_T)?$$
And is the difference between an RNN and ARMA for time series prediction just the difference between a nonlinear model and a linear model?
