
There is a simple textbook problem: the state-space model $$ z_{t} = x_{t} + v_{t}, \qquad x_{t} = \phi x_{t-1} + w_{t}, $$ where $v_{t}\sim \mathcal{N}(0,\sigma^{2}_{v})$ and $w_{t}\sim \mathcal{N}(0,\sigma^{2}_{w})$ are independent, is equivalent to the ARMA(1,1) model $$ z_{t} = \phi z_{t-1} + \theta \varepsilon_{t-1} + \varepsilon_{t}, $$ where $\theta = - \phi \frac{\sigma_{v}}{\sqrt{\sigma^{2}_{v} + \sigma^{2}_{w}}}$ and the $\varepsilon_{t}\sim \mathcal{N}(0,\sigma^{2}_{v} + \sigma^{2}_{w})$ are i.i.d.

The proof can be found, for example, here: http://www.stats.ox.ac.uk/~reinert/time/notesht10short.pdf

Next, let us generate 5000 data points from the state-space model with parameters, say, $\phi = 0.95$, $\sigma_{v} = 0.08$, $\sigma_{w} = 0.04$, and then, based on these data, estimate the parameters of the equivalent ARMA(1,1), i.e. $\phi$ and $\theta$.

Based on 5000 points, the estimates are $\hat{\phi} = 0.952$ and $\hat{\theta} = -0.571$, while the true value of $\theta$ is $$ \theta = - \phi \frac{\sigma_{v}}{\sqrt{\sigma^{2}_{v} + \sigma^{2}_{w}}} = -0.849. $$ Why doesn't it work? The "equivalence" of similar but slightly more complicated models was discussed in Superposition of random walk and autoregressive process.

The R code is:

phi = 0.95      # AR coefficient
sigma_v = 0.08  # standard deviation of observation noise
nSample = 5000  # sample size
fVal = 0        # first value of the simulated process
sigma_w = 0.04  # standard deviation of transition noise

simulate <- function(nSample, phi, sigma_v, sigma_w, fVal) { 
  noise_v = sigma_v*rnorm(nSample)
  noise_w = sigma_w*rnorm(nSample)
  z = rep(0, nSample)
  x = rep(0, nSample)
  x[1] = fVal
  z[1] = fVal + noise_v[1]
  # State-space 
  for (i in 1:(nSample-1)) {
      x[i + 1] = phi *x[i] + noise_w[i]
      z[i + 1] = x[i + 1] + noise_v[i + 1]
  }
  return(z)
}
dt = simulate(nSample, phi, sigma_v, sigma_w, fVal)

forecast::Arima(dt, order=c(1,0,1), include.mean = FALSE)

The Python code is the following:

import numpy as np
import pandas as pd
import statsmodels.api as sm
def simulate_z(nSample, phi, sigma_v, sigma_w, x_f):
    noise_v = np.random.normal(0, sigma_v, nSample)  # observation noise
    noise_w = np.random.normal(0, sigma_w, nSample)  # transition noise
    z = np.zeros(nSample)
    x = np.zeros(nSample)
    x[0] = x_f
    z[0] = x[0] + noise_v[0]
    # state-space recursion, matching the R code above
    for period in range(1, nSample):
        x[period] = phi*x[period - 1] + noise_w[period]
        z[period] = x[period] + noise_v[period]
    return z
"""
values of the parameters for simulation
"""
phi = 0.95         # slope
nSample = 5000     # sample size
x_f = 0            # first value of the simulated process
sigma_v = 0.08     # standard deviation of observation noise
sigma_w = 0.04     # sd of transition noise
"""
generate some data
"""
dt = simulate_z(nSample, phi, sigma_v, sigma_w, x_f)
dt = pd.DataFrame(data=dt)
dt.columns = ['data']
"""
estimation
"""
model = sm.tsa.ARMA(dt['data'].values, (1, 1)).fit(trend='nc', disp=0)
print("estimated parameters [phi, theta] ", model.params)
print("true values [phi, theta] ", [phi, -phi*sigma_v/np.sqrt(sigma_v**2 + sigma_w**2)])
ABK
  • I don't have time and only looked quickly at the other link, but the thing you derived in the other link implies that it's an ARIMA(1,1,1), not an ARMA(1,1). That might be causing the issues above. A piece of advice: in general, the equivalences are lying around somewhere in the time series literature, so it's best to check them before going on to do something like the above. Hopefully what I said is your problem? – mlofton Nov 11 '19 at 13:45
  • Dear mlofton, wait please. The problem here has nothing to do with the one in the other link! In that link we discussed THE EQUIVALENCE between ARMA and state space. Here I tried the simplest case possible and did some simulations. @Richard Hardy did not totally agree with the "equivalence" and suggested doing some simulations. – ABK Nov 11 '19 at 14:02
  • No, look at the link http://www.stats.ox.ac.uk/~reinert/time/notesht10short.pdf; it is correct. Forget about https://stats.stackexchange.com/questions/433968/ar1-model-with-autoregressive-intercept. As I have written above, that was a totally different problem. – ABK Nov 11 '19 at 14:44
  • @ABK O.K. My apologies. I was looking at the wrong link. I will make two suggestions. 1) If $\phi = 1$, that model becomes an ARIMA(0,1,1). So you could set $\phi = 1$ and check if you get a match in that case. 2) Based on your code, you seem like a decent programmer, so I STRONGLY suggest the use of the DLM package in R. It's mature, tested, and has all the state space facilities, prediction error decomposition, and a function for converting from ARIMA to state space (and I think the other way around). If you used the R DLM package and got different answers, I'd be truly shocked. With Python, I can't be helpful. – mlofton Nov 12 '19 at 01:18
  • This is the book but, if you don't want to buy it, there are enough articles and vignettes to get yourself familiarized. IMHO, it's the best state space package in R, but I'm biased because I studied state space using the same notation and framework. https://link.springer.com/book/10.1007%2Fb135794 – mlofton Nov 12 '19 at 01:26
  • Dear @mlofton, I have added the R code. The result is exactly the same... There is something strange about the equivalence... – ABK Nov 12 '19 at 10:12
  • That's puzzling. When I have time, I'll play with it. There is something weird going on, but the weirdness should be able to be figured out. I don't have time at the moment (I'll try to play with it in the next couple of days), but maybe the constraint needs to be "told" to the ARIMA estimation approach. If ARIMA doesn't "know" about the constraint, it's possible that it finds a "better" solution. That's just a thought, of course. For now, if you want, put in $\phi = 1$ and check if the same thing happens. ARIMA(0,1,1) is an easier model to deal with. – mlofton Nov 12 '19 at 15:33
  • I think that I know what's going on. Can we go to chat? – mlofton Nov 12 '19 at 15:40
  • Dear @mlofton, it is strange, indeed... I am not sure which constraint you mean... all parameters are OK... – ABK Nov 12 '19 at 15:42

1 Answer


Answer:

  1. There is a mistake in the formula for $\theta$.
  2. The correct computation must match the autocovariances of the MA components of the two representations.
  3. The correct formula is

$$ \theta = \frac{\sqrt{\xi^2-4} -\xi}{2}$$

where $\xi:= \phi + \frac{\sigma^2_v+\sigma^2_w}{\phi \sigma^2_v}$. Substituting the chosen values for $\phi,\sigma_v,\sigma_w$ gives $\theta = -0.6004940561846299$.
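As a quick numerical check, here is a minimal Python sketch that evaluates this formula for the parameter values chosen in the question:

import numpy as np

phi, sigma_v, sigma_w = 0.95, 0.08, 0.04

xi = phi + (sigma_v**2 + sigma_w**2) / (phi * sigma_v**2)  # xi as defined above
theta = (np.sqrt(xi**2 - 4) - xi) / 2                      # invertible root

print(xi, theta)  # approximately 2.2658 and -0.6005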

Details:

There is a mistake in the lecture notes you are referencing.

Both these lecture notes and this post refer to Introduction to Time Series and Forecasting by Brockwell and Davis, where this subject is treated correctly.

In fact, subtracting $\phi z_{t-1}$ from $z_t$ gives $z_t - \phi z_{t-1} = (x_t + v_t) - \phi(x_{t-1} + v_{t-1}) = w_t + v_t - \phi v_{t-1}$, so to obtain the new ARMA representation you have to choose the MA weights and the variance of the white noise process entering this MA so that the autocovariances of the new process are the same as the autocovariances of $\eta_t = v_t + w_t - \phi v_{t-1}$.

We have

\begin{equation} Cov(\eta_t,\eta_t) = (1+\phi^2)\sigma_v^2 + \sigma_w^2, \quad Cov(\eta_t,\eta_{t-1}) = -\phi \sigma_v^2, \quad Cov(\eta_t,\eta_{t-s}) = 0 \quad \forall s \geq 2. \end{equation}
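For the parameter values chosen in the question, these covariances can be sanity-checked by simulating $\eta_t$ directly; a small Python sketch:

import numpy as np

rng = np.random.default_rng(0)
phi, sigma_v, sigma_w = 0.95, 0.08, 0.04
n = 1_000_000

v = rng.normal(0, sigma_v, n)
w = rng.normal(0, sigma_w, n)
eta = v[1:] + w[1:] - phi * v[:-1]   # eta_t = v_t + w_t - phi*v_{t-1}

print(np.var(eta))                   # approx (1 + phi^2)*sigma_v^2 + sigma_w^2 = 0.013776
print(np.mean(eta[1:] * eta[:-1]))   # approx -phi*sigma_v^2 = -0.00608
print(np.mean(eta[2:] * eta[:-2]))   # approx 0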

This means that we are seeking to construct an MA(1) process, and therefore we need to select parameters $\theta, \sigma^2$ so that for $\epsilon_t \sim N(0,\sigma^2)$ the combination $\nu_t = \epsilon_t + \theta \epsilon_{t-1}$ has the same autocovariances, i.e. we have to solve the following system:

\begin{equation} \begin{cases} Cov(\nu_t,\nu_t) &= Cov(\eta_t,\eta_t) \\ Cov(\nu_t,\nu_{t-1})&= Cov(\eta_t,\eta_{t-1}) \end{cases} \iff \begin{cases} (1+\theta^2)\sigma^2 &= (1+\phi^2)\sigma_v^2 + \sigma_w^2\\ \theta \sigma^2 &= -\phi \sigma_v^2 \end{cases} \end{equation}

Dividing the first equation by the second and multiplying both sides by $\theta$ we get the following quadratic equation in $\theta$:

$$ 1 + \theta^2 = -\xi \theta,$$

where $\xi:= \phi + \frac{\sigma^2_v+\sigma^2_w}{\phi \sigma^2_v}$.

This equation has two real solutions,

$$ \theta = \frac{-\xi \pm \sqrt{\xi^2-4}}{2}$$

of which only one produces an invertible MA (since $|\xi|>2$, one of the solutions has modulus greater than 1).

Substituting the calibration you chose into the formula obtained above gives a result consistent with the simulations:

$$ \xi = 2.265789473684211, \theta = -0.6004940561846299 $$
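As a final check, a small Python sketch that recovers $\sigma^2$ from the second equation of the system and verifies that both autocovariance conditions hold:

import numpy as np

phi, sigma_v, sigma_w = 0.95, 0.08, 0.04
xi = phi + (sigma_v**2 + sigma_w**2) / (phi * sigma_v**2)
theta = (np.sqrt(xi**2 - 4) - xi) / 2        # about -0.6005, as above

# white-noise variance implied by theta*sigma^2 = -phi*sigma_v^2
sigma2 = -phi * sigma_v**2 / theta

# autocovariances of eta_t = v_t + w_t - phi*v_{t-1}
gamma0_eta = (1 + phi**2) * sigma_v**2 + sigma_w**2
gamma1_eta = -phi * sigma_v**2

# autocovariances of the MA(1) part eps_t + theta*eps_{t-1}
gamma0_ma = (1 + theta**2) * sigma2
gamma1_ma = theta * sigma2

print(np.isclose(gamma0_ma, gamma0_eta), np.isclose(gamma1_ma, gamma1_eta))  # True True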

Konstantin
  • Thanks. That's an AMAZING find and correction. I haven't gone over what you did, but it will really help me get unconfused. Who would think that lecture notes would have mistakes!!!! I tend not to think that, but maybe I should start. I will go over your corrections and let you know if I have any questions. Also, could you tell me what section in Brockwell and Davis discusses this. Thanks again. – mlofton Nov 18 '19 at 15:32
  • Hey @mlofton, thanks, I can't remember now, but you can go over this post or the post mentioned or search on stackexchange and/or Google. Cheers – Konstantin Nov 18 '19 at 15:54
  • Dear @Konstantin, thank you! I was also very confused about the result. Now the situation is clearer for me. – ABK Nov 18 '19 at 16:02
  • @Konstantin: I'm just looking at this now, and I'm confident that it doesn't make a difference in the result, but where did you get the negative sign in the covariance of $(\eta_{t},\eta_{t-1})$? Thanks. – mlofton Nov 20 '19 at 13:14
  • @mlofton: Good point! There was a typo in the definition of $\eta_t$: $v_{t-1}$ enters with a minus sign. I fixed it now. Thanks! – Konstantin Nov 20 '19 at 13:26
  • Konstantin: This is really for another thread, but it's interesting: suppose one sets $\phi = 1$ in the state-space representation in order to obtain the standard random walk + noise model. Then the equivalence between the ARIMA(0,1,1) and the random walk + noise model is restricted in the sense that the parameter space of the ARIMA(0,1,1) is larger than that of the random walk + noise model. On the other hand, there is no such restriction in the equivalence between the same state-space model (with $\phi$ not equal to 1) and the ARMA(1,1). – mlofton Nov 20 '19 at 21:28
  • @mlofton, it is not a short discussion, but what if we look at the dependence of the restrictions on $\phi$ and let it go to $0$? If it is interesting, why don't you start a new thread? – ABK Nov 22 '19 at 13:15
  • @Konstantin: I will try to make a new thread this weekend. Thanks for your interest. Essentially, what happens is that, for the KF formulation, $\phi$ becomes restricted to half of the range that it takes on in the ARIMA(0,1,1) case. – mlofton Nov 23 '19 at 00:36
  • @mlofton, yeah please do, it would be much more comfortable, 'cause it is difficult to ask additional questions without blowing up the comment thread. The question does look interesting! – Konstantin Nov 23 '19 at 06:33
  • @Konstantin: I wrote the new one yesterday. I don't know the link off the top of my head, but if you go to users and then go to mlofton, it should be there. Thanks for your interest and any comments or insights, and no rush. – mlofton Nov 24 '19 at 14:18