
I was studying moving average (MA) processes and wanted to ask why adding a bunch of weighted noise terms is not itself just a noise term. I understand the operations involving the mean and variance from a mathematical point of view, but I would like to address this doubt more intuitively. Could someone help, please?

  • Could you please tell us what your concept of a "noise term" might be? From some perspectives the answer to "is not just a..." is yes and from other perspectives it is no. We need to understand your perspective. – whuber Jan 09 '24 at 15:30
  • @whuber I really apologize for the late reply; I was caught up in college submissions so could not get onto this site. Onto your question, apart from the obvious mathematical definition which has been given to me, I understand noise as just a random value. – insipidintegrator Jan 17 '24 at 16:15
  • "Just a random value" is not sufficiently specific. "Noise" usually refers to certain models of error or uncertainty. The Wikipedia article on "white noise" might be a good place to begin reading about the distinction. – whuber Jan 17 '24 at 18:46
  • @whuber I recommend this question be reopened. It made enough sense that it got two upvotes and two answers. – Huy Pham Jan 18 '24 at 02:16

2 Answers


You know that ordinary regression fails when there are no data on an $X$ that explains $Y$. Well, an MA model is a kind of regression that uses the noise terms as independent variables, under the assumption that we do not know exactly what the noise is, but that there are some correlations.
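For concreteness, here is a minimal sketch of this "regression on noise" idea (my illustration, not part of the original answer, and it assumes `statsmodels` is installed): simulate an MA(1) series and fit it with `ARIMA`, which estimates the unobserved noise series and the coefficient jointly. The value $\theta_0 = 0.7$ is an arbitrary choice.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
theta0 = 0.7
eps = rng.standard_normal(1000)     # the unobserved "independent variables"
y = theta0 * eps[:-1] + eps[1:]     # MA(1): y_t = theta0*eps_{t-1} + eps_t

# Fit an MA(1) model: in effect, a regression of y_t on an estimated noise series.
fit = ARIMA(y, order=(0, 0, 1)).fit()
print(fit.params)                   # includes an MA coefficient estimate near 0.7
```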


To elaborate on another answer, "noise" it is, but once there are correlations, we can use it to predict. Consider $$y_t = \theta_0\,\varepsilon_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \text{WN}\left(\sigma_\varepsilon^2 = 1\right), \qquad \theta_0 \in \mathbb{R}.$$

The autocovariance function is $$E\left(y_t y_{t-k}\right) = \begin{cases} 1 + \theta_0^2 & k = 0 \\ \theta_0 & k = 1 \\ 0 & k > 1 \end{cases} \quad\Longrightarrow\quad \begin{cases} E\left(y_t^2\right) = 1 + \theta_0^2 \\ E\left(y_t y_{t-1}\right) = \theta_0 \end{cases}$$
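To see these moments emerge numerically, here is a small simulation (my addition; $\theta_0 = 0.7$ and the sample size are arbitrary choices) checking the sample autocovariances against $1 + \theta_0^2$, $\theta_0$, and $0$:

```python
import numpy as np

rng = np.random.default_rng(1)
theta0 = 0.7                        # arbitrary illustrative value
n = 200_000
eps = rng.standard_normal(n + 1)    # white noise, sigma^2 = 1
y = theta0 * eps[:-1] + eps[1:]     # y_t = theta0*eps_{t-1} + eps_t

def acov(x, k):
    """Sample autocovariance E(y_t y_{t-k}) for a zero-mean series."""
    return float(np.mean(x[k:] * x[: len(x) - k]))

print(acov(y, 0), "vs", 1 + theta0**2)  # ~1.49
print(acov(y, 1), "vs", theta0)         # ~0.70
print(acov(y, 2), "vs", 0.0)            # ~0.00
```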

You can use either moment condition, or both in an overidentified Generalized Method of Moments (GMM) approach, to estimate $\theta_0$; then you can predict $y_t$ after observing $y_{t-1}$ through, for example,

$$\begin{cases} y_t = \theta_0 \varepsilon_{t-1} + \varepsilon_t \\ y_{t-1} = \theta_0 \varepsilon_{t-2} + \varepsilon_{t-1} \end{cases} \quad\Longrightarrow\quad y_t = \theta_0 y_{t-1} - \theta_0^2 \varepsilon_{t-2} + \varepsilon_t.$$
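A minimal sketch of the resulting estimate-and-predict recipe (my addition, reusing the simulated series from above; the root choice assumes the invertible solution $|\theta_0| < 1$, and the recursion starts from $\hat\varepsilon_0 = 0$ as a standard approximation):

```python
import numpy as np

rng = np.random.default_rng(1)
theta0, n = 0.7, 200_000
eps = rng.standard_normal(n + 1)
y = theta0 * eps[:-1] + eps[1:]                  # simulated MA(1), as above

# Method of moments: rho(1) = theta0/(1 + theta0^2), i.e. rho*t^2 - t + rho = 0.
g0 = np.mean(y * y)                              # sample E(y_t^2)
g1 = np.mean(y[1:] * y[:-1])                     # sample E(y_t y_{t-1})
rho1 = g1 / g0
theta_hat = (1 - np.sqrt(1 - 4 * rho1**2)) / (2 * rho1)  # invertible root
print(theta_hat)                                 # ~0.70

# One-step prediction: recover eps recursively (eps_t = y_t - theta*eps_{t-1}),
# then forecast the next observation as theta*eps_t.
eps_hat = np.zeros(len(y))
for t in range(1, len(y)):
    eps_hat[t] = y[t] - theta_hat * eps_hat[t - 1]
print(theta_hat * eps_hat[-1])                   # predicted next y
```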

OP's comment:

"What I was asking was not the mathematical framework which explains the correlations, but more specifically, WHY those correlations exist when we are effectively adding a bunch of noise terms."

Response to OP's comment:

What we model here is (a) the assumption of per-period white noise ($\varepsilon_t$), but (b) that there is some other phenomenon ($y_t$) that is determined as a linear function of these white noises. So the "correlations" do not exist a priori; they arise because each $y_t$ is determined by a portion of the same history of the white-noise process. This is what creates the correlation, e.g. between $y_{t+1}$ and $y_t$: both are partly determined by $\varepsilon_t$, with different strengths. They are not correlations between the white noises but between the $y$'s, because, by construction, the $y$'s partly depend on the same $\varepsilon$'s. This is what the last equation above shows, and it is what allows prediction from the data we have (the $y$'s alone).
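Here is a short numerical illustration of exactly this point (my addition; parameter values are arbitrary): the lag-1 covariance of the $y$'s equals the contribution of the single shared $\varepsilon$ term, while at lag 2 no $\varepsilon$ is shared and the covariance vanishes.

```python
import numpy as np

rng = np.random.default_rng(2)
theta0 = 0.7
eps = rng.standard_normal(100_001)
y = theta0 * eps[:-1] + eps[1:]          # consecutive y's share exactly one eps

# Lag-1: y_t*y_{t-1} = (theta0*eps_{t-1} + eps_t)(theta0*eps_{t-2} + eps_{t-1});
# only the shared eps_{t-1} yields a nonzero-mean cross term, theta0*eps_{t-1}^2.
print(np.mean(y[1:] * y[:-1]))           # ~ theta0
print(theta0 * np.mean(eps[1:-1] ** 2))  # ~ theta0: the shared-term contribution

# Lag-2: y_t and y_{t-2} share no eps, so the covariance is ~0.
print(np.mean(y[2:] * y[:-2]))           # ~ 0
```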

Maybe what you are asking is: "is it reasonable to assume that such a $y_t$-process exists in the real world?"

But that is a question that haunts any model.