As you've reasoned already, the errors themselves are unobservable: whilst we do observe the regressand and the regressors, we can only recover the residuals, not the errors. I think your question amounts to asking how we should estimate a model in which one of the regressors is inherently unobservable. There are a couple of ways to estimate the $\theta$ parameter. One method, mentioned by @EB3112 in the comments, is maximum likelihood (ML) estimation, but this requires assuming a distribution for the errors, usually Gaussian, or t-distributed if the tails are fatter. Alternatively, we can sidestep the distributional assumption by estimating $\theta$ with nonlinear least squares (NLS).
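For concreteness, here is a minimal sketch of the ML route, assuming Gaussian errors and conditioning on a zero pre-sample error. The function name, the simulated series, and the use of `scipy.optimize.minimize_scalar` are my own choices for illustration, not part of your model:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ma1_neg_loglik(theta, y):
    """Conditional Gaussian negative log-likelihood for y_t = eps_t + theta*eps_{t-1},
    recovering the errors recursively with the pre-sample error set to zero."""
    eps = np.empty_like(y, dtype=float)
    eps[0] = y[0]                          # assumes eps_0 = 0 before the sample
    for t in range(1, len(y)):
        eps[t] = y[t] - theta * eps[t - 1]
    sigma2 = np.mean(eps ** 2)             # profile out the error variance
    return 0.5 * len(y) * np.log(2 * np.pi * sigma2) + np.sum(eps ** 2) / (2 * sigma2)

# simulate an MA(1) with a hypothetical theta = 0.5, then maximise the likelihood
rng = np.random.default_rng(0)
eps_true = rng.normal(size=400)
y = eps_true + 0.5 * np.concatenate(([0.0], eps_true[:-1]))

res = minimize_scalar(ma1_neg_loglik, bounds=(-0.99, 0.99), args=(y,), method="bounded")
print(res.x)                               # should land near 0.5
```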
To do this, we first turn the MA(1) process into its AR($\infty$) representation. Consider the MA(1) model $$y_t=(1+\theta L)\epsilon_t$$ where $L$ is the lag (backshift) operator, so that $L^k y_t = y_{t-k}$. Under the invertibility condition $|\theta|<1$, the operator $(1+\theta L)$ has an inverse, defined by the geometric series $$(1+\theta L)^{-1} := \sum^{\infty}_{k=0} (-\theta L)^k$$ so our process is $$(1+\theta L)^{-1}y_t = \epsilon_t$$ Now, using the definition of the inverse, $$\sum^{\infty}_{k=0} (-\theta L)^k y_t = \epsilon_t$$ Thus, our AR($\infty$) representation is given by $$y_t=-\sum^{\infty}_{k=1}(-\theta L)^k y_t +\epsilon_t = -\sum^{\infty}_{k=1}(-\theta)^k y_{t-k} +\epsilon_t$$ This representation is useful since the sequence $\{y_t\}$ is entirely observable. However, in reality it is impossible to have infinitely many observations before the current period. Therefore, we can truncate the representation: because the process is invertible ($|\theta|<1$), the coefficients $(-\theta)^k$ decay geometrically as the number of lags increases, so beyond some number of lags $j$ they are negligible.
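To see the decay concretely, here is a short numerical check with a hypothetical $\theta = 0.6$ (chosen only for illustration): it prints the first few AR coefficients and verifies that the truncated representation recovers $\epsilon_t$ on a simulated series:

```python
import numpy as np

theta = 0.6                                              # hypothetical value, |theta| < 1
print(np.round([-(-theta) ** k for k in range(1, 11)], 4))   # coefficients on y_{t-k}: geometric decay

rng = np.random.default_rng(0)
eps = rng.normal(size=500)
y = eps + theta * np.concatenate(([0.0], eps[:-1]))      # simulate the MA(1)

# truncated AR representation: eps_t ~ y_t + sum_{k>=1} (-theta)^k y_{t-k}
t = 200
print(y[t] + sum((-theta) ** k * y[t - k] for k in range(1, t + 1)), eps[t])  # the two are close
```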
Therefore, using the available observations (say $y_1$ is the first observation and $y_T$ the last), we can minimise the residual sum of squares implied by the truncated AR representation. Our NLS estimator for $\theta$ is $$\hat{\theta}_{NLS} = \arg \min_{\theta} \left[\sum_{t=2}^T \left(y_t+\sum^{t-1}_{k=1}(-\theta)^k y_{t-k}\right)^2\right]$$ which we can solve with a numerical optimiser. Note that, in contrast with MLE, we never assumed any distribution for the errors.
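A minimal sketch of this NLS estimator, again on a simulated series with a hypothetical $\theta = 0.6$ and again relying on a bounded scalar optimiser (the function and variable names are mine, not from any standard package):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def truncated_rss(theta, y):
    """Residual sum of squares from the truncated AR representation,
    e_t(theta) = y_t + sum_{k=1}^{t-1} (-theta)^k y_{t-k} (1-indexed, as in the formula above)."""
    rss = 0.0
    for t in range(1, len(y)):                # 0-indexed: t = 1 is the second observation
        e_t = y[t] + sum((-theta) ** k * y[t - k] for k in range(1, t + 1))
        rss += e_t ** 2
    return rss

# recover the hypothetical theta = 0.6 from a simulated MA(1)
rng = np.random.default_rng(1)
eps = rng.normal(size=300)
y = eps + 0.6 * np.concatenate(([0.0], eps[:-1]))

res = minimize_scalar(truncated_rss, bounds=(-0.99, 0.99), args=(y,), method="bounded")
print(res.x)                                  # should land close to 0.6
```

Restricting the search to $(-0.99, 0.99)$ simply enforces the invertibility condition $|\theta|<1$ that the derivation relies on.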