I am new to econometrics and am currently trying out statistical arbitrage, specifically cointegration. I have time series data for two variables, A and B, and I am using the Engle-Granger test for the project. What I have done so far:

  1. Performed a stationarity test.
  2. Performed differencing.
  3. Performed an OLS regression.
  4. Found the trading pair formula by calculating the spread, which is spread = dependent (A) - regressor (B).

Where I am stuck is how to calculate the mean and standard deviation of the spread. I would really appreciate any help with this.

Please see the attached .csv file and simply enter your response once the file is opened in Excel.

Scan the QR code to see the CSV file, or go directly to it: https://fileport.io/gp9SsSBcaSwq

ken4ward
  • If you use Engle-Granger and find that A and B are cointegrated, why do you apply differencing? – Richard Hardy Jan 31 '23 at 18:21
  • Engle-Granger dictates that the variables must be stationary at first difference. – ken4ward Jan 31 '23 at 22:33
  • It might be what a unit root test shows, but not what Engle-Granger dictates. Engle-Granger is about cointegration (a relationship between multiple integrated series), not about integration (a property of a single series). To find out about cointegration using the Engle-Granger procedure, you should not be differencing. Instead, you should run a regression of A on B as they are. If there is cointegration, the residual will be stationary. It will be the estimate of S. – Richard Hardy Feb 01 '23 at 07:36
  • @Richard Hardy, you're right. The differencing was done to check whether the raw variables are stationary at first difference. The Engle-Granger test was done on the raw data. I have attached a CSV file to the main post. Could you add the formulas to calculate the mean and standard deviation in Excel and share it back with me? Thank you. – ken4ward Feb 01 '23 at 18:19

1 Answer

The stationary combination $S$ of a pair of cointegrating time series $(A,B)$ can be treated as any stationary time series. If the cointegrating vector is $(1,-1)$, then $S:=A-B$ is stationary.

  • Its (unconditional) mean can be estimated as the sample mean, just as you would do with i.i.d. data.
  • The (unconditional) standard deviation can again be estimated as the sample standard deviation; see e.g. this thread.
  • The standard error of the mean can be estimated using autocorrelation-robust standard errors such as Newey-West.
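
For reference, the three estimates above can be written out as follows. This is the textbook form with a Bartlett kernel; the sample size $T$ and the truncation lag $L$ are notation introduced here for illustration, and $L$ has to be chosen by the user or by a bandwidth rule. Software implementations may pick the bandwidth automatically or apply prewhitening, so their output need not match these formulas exactly.

$$\bar S = \frac{1}{T}\sum_{t=1}^{T} S_t, \qquad \hat\sigma = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(S_t-\bar S\right)^2}$$

$$\widehat{\mathrm{se}}(\bar S) = \sqrt{\frac{1}{T}\left[\hat\gamma_0 + 2\sum_{j=1}^{L}\left(1-\frac{j}{L+1}\right)\hat\gamma_j\right]}, \qquad \hat\gamma_j = \frac{1}{T}\sum_{t=j+1}^{T}\left(S_t-\bar S\right)\left(S_{t-j}-\bar S\right)$$

With $L=0$ the last expression collapses to $\sqrt{\hat\gamma_0/T}$, which is essentially the usual i.i.d. standard error of the mean (up to the $T$ versus $T-1$ divisor).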

Here is some R code:

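n <- 1e3  # sample size (named n because T is shorthand for TRUE in R)
set.seed(1)
# simulate a stationary ARMA(1,1) series standing in for the spread S
x <- arima.sim(model = list(ar = 0.9, ma = -0.3), n = n)
m  <- mean(x); print(m)   # sample mean
s  <- sd(x);   print(s)   # sample standard deviation
se <- sqrt(sandwich::NeweyWest(lm(x ~ 1))); print(se)  # Newey-West standard error of the mean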
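
For readers not using R, here is a rough pure-Python sketch of the same three estimates. It is an illustration under stated assumptions rather than a replacement for a tested library: the function names, the `max_lag` parameter and the example `spread` values are all made up here, and the Newey-West part uses a plain Bartlett kernel with a user-chosen lag, so it will generally not reproduce the `sandwich::NeweyWest` number exactly.

```python
import math


def sample_mean(s):
    """Sample mean of the spread."""
    return sum(s) / len(s)


def sample_sd(s):
    """Sample standard deviation with the n - 1 divisor."""
    m = sample_mean(s)
    return math.sqrt(sum((v - m) ** 2 for v in s) / (len(s) - 1))


def newey_west_se_of_mean(s, max_lag):
    """Autocorrelation-robust standard error of the sample mean.

    Uses Bartlett weights 1 - j / (max_lag + 1). max_lag is a
    user-chosen truncation lag (an assumption of this sketch).
    """
    n = len(s)
    m = sample_mean(s)
    d = [v - m for v in s]  # demeaned series

    def gamma(j):
        # lag-j sample autocovariance with divisor n
        return sum(d[t] * d[t - j] for t in range(j, n)) / n

    long_run_var = gamma(0)
    for j in range(1, max_lag + 1):
        weight = 1 - j / (max_lag + 1)
        long_run_var += 2 * weight * gamma(j)
    return math.sqrt(long_run_var / n)


if __name__ == "__main__":
    # hypothetical spread values, for illustration only
    spread = [0.5, -0.2, 0.1, 0.4, -0.3, 0.0, 0.2, -0.1]
    print("mean:", sample_mean(spread))
    print("sd:  ", sample_sd(spread))
    print("se:  ", newey_west_se_of_mean(spread, max_lag=2))
```

The first two functions compute the ordinary sample mean and the sample standard deviation with the $n-1$ divisor. Only the standard error of the mean needs the autocorrelation correction.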
Richard Hardy
  • I'm not using R for now. Could you add the formulas to calculate the mean and standard deviation in Excel and share it back with me? Thank you. – ken4ward Feb 01 '23 at 18:20
  • @ken4ward, Excel formulas are =AVERAGE(...) and =STDEV.S(...) where ... is the range of the cells the time series occupies. – Richard Hardy Feb 01 '23 at 20:08