I am studying ARMA processes. At the end of the course the professor told us that estimating the next sample of an ARMA process from its past of length $p$ (i.e., projecting $X_t$ onto $\text{span}(X_{t-1}, \dots, X_{t-p})$) is the same as finding the least-squares solution to $X_t = a_1 X_{t-1} + a_2 X_{t-2} + \dots + a_p X_{t-p}$. This can be done simply by collecting the "samples" with a moving window of length $p$ (the prediction length) over the whole signal and then solving the resulting linear regression problem by any standard method.
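To make the sliding-window idea concrete, here is a minimal sketch of what I mean (my own illustration, not from the course); the AR(2) coefficients 0.6 and -0.3 and the simulation length are made up, and the fit is done by plain least squares in NumPy:

```python
import numpy as np

# Simulate a toy AR(2) process: X_t = 0.6 X_{t-1} - 0.3 X_{t-2} + noise
rng = np.random.default_rng(0)
n = 5000
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.standard_normal()

# Slide a window of length p over the signal: the row for time t
# holds the regressors (X_{t-1}, ..., X_{t-p}); the target is X_t.
p = 2
X = np.column_stack([x[p - 1 - k : n - 1 - k] for k in range(p)])
y = x[p:]

# Ordinary least squares recovers coefficients close to (0.6, -0.3)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)
```

So the whole "prediction" problem reduces to building a lagged design matrix and calling a least-squares solver, which is exactly what prompts my question below.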
My question then is: why do we need the complicated theory of ARMA processes, and solutions involving covariance matrices and so on, which are much harder to compute, when we can just perform a simple linear regression? What's more, a linear regression can easily be extended to a polynomial (or kernel) regression, essentially for free, yielding a model more powerful than a plain ARMA process.
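A sketch of the polynomial extension I have in mind (all numbers here are hypothetical, chosen only for illustration): simulate a nonlinear autoregression, then recover it by simply augmenting the lagged design matrix with a squared term; least squares still does all the work:

```python
import numpy as np

# Simulate a toy nonlinear AR(1): the quadratic term makes a purely
# linear model misspecified.
rng = np.random.default_rng(1)
n = 20000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] - 0.1 * x[t - 1] ** 2 + 0.1 * rng.standard_normal()

# Polynomial "features" of the single lag: [X_{t-1}, X_{t-1}^2].
X = np.column_stack([x[:-1], x[:-1] ** 2])
y = x[1:]

# The same least-squares machinery now fits a nonlinear model;
# the estimates land near the true values (0.8, -0.1).
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)
```

Nothing about the fitting procedure changed; only the feature columns did, which is why the extension feels "free" to me.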
It feels like, from a practical perspective, an ARMA model is just a complicated way of saying "linear regression on previous samples". Am I missing something? There must surely be an explanation for why we do all of this computation.