
Can anybody explain the difference between the functions VAR() and dynlm()? I thought I could also estimate my VAR by OLS with the lm() function, but found that I cannot include lags directly. dynlm() solves this problem. I tried VAR() and dynlm() on the same data and also varied the type argument of VAR() ("const" / "trend" / "none" / "both"), but in none of the cases were the results the same. Does anybody know the difference?

Data Example (only 100 rows of my time series with over 1000 rows):
                  Price Open_Interest_All
1992-10-06 -0.781222672        1.63119379
1992-10-13 -0.226928010        4.05849973
1992-10-20 -0.314059754        0.74296670
1992-10-27  0.427535236        4.69469707
1992-11-03 -0.066975848       -8.97727509
1992-11-10 -1.105733849       13.60848986
1992-11-17 -0.033504156       -0.86606938
1992-11-24  0.132090521        3.09942608
1992-12-01  0.451677760       -6.96875467
1992-12-08  0.059010297        1.45547915
1992-12-15  0.018143072        2.47922886
1992-12-22  0.415489498        0.11104701
1992-12-29  0.281748150       -0.01741099
1993-01-05  1.264093684        0.75242942
1993-01-12 -1.765301997       -0.16864140
1993-01-19  0.352909608        1.23869888
1993-01-26  0.152561942        5.16469475
1993-02-02 -0.306023610       -1.88288581
1993-02-09  0.371154802       -0.22569400
1993-02-16 -0.096699431        0.72290467
1993-02-23 -0.216939288       -1.03640733
1993-03-02 -0.692525272        7.32922042
1993-03-09  0.148512407       -0.83248916
1993-03-16  0.053258877       -9.34412387
1993-03-23  0.282899028       -2.27865356
1993-03-30 -0.348375291       -0.68304626
1993-04-06 -0.701583175        5.13726283
1993-04-13 -0.643717452        7.49539668
1993-04-20 -2.315215806        6.78579388
1993-04-27  0.728123084       -7.89513692
1993-05-04 -0.401917826        1.93334817
1993-05-11 -0.820260275       -0.41889123
1993-05-18 -0.136744342        0.17780875
1993-05-25  0.851391229        5.73901054
1993-06-01 -1.030441814        2.02329734
1993-06-08  0.622336509       -0.47102737
1993-06-15  0.188830476        1.12995553
1993-06-22  0.425932387        3.25510238
1993-06-29  0.094860323       -1.15090784
1993-07-06  0.690949433       -3.84819577
1993-07-13 -0.555562379       -0.79792963
1993-07-20 -0.003165141        0.26002615
1993-07-27  0.303670755       -2.04747741
1993-08-03  0.173833160        0.14447390
1993-08-10 -0.361775029        4.78484100
1993-08-17  0.384427420       -3.66029272
1993-08-24  0.427606006        0.68034298
1993-08-31 -0.138889246       -7.61964095
1993-09-07  0.336480002       -2.54707323
1993-09-14 -1.121502023        5.46051062
1993-09-21 -1.763129531       13.11181789
1993-09-28  0.039080472        0.86031698
1993-10-05 -0.966905912       -4.20461016
1993-10-12  0.218149112       -2.12196119
1993-10-19 -0.817582360       -1.80785218
1993-10-26 -0.077143308        4.02653854
1993-11-02  0.079626855        0.64359810
1993-11-09  0.451802406        6.08728086
1993-11-16 -0.305816356        4.74542449
1993-11-23  0.217000320        4.48037134
1993-11-30 -0.188669087       -1.17994159
1993-12-07  0.747784520        5.22816662
1993-12-14  0.636011468       -2.22521990
1993-12-21  0.373247064       -0.52307968
1993-12-28  0.046941753       -0.95758768
1994-01-04 -0.810928287       -1.05574558
1994-01-11  0.490505014       -2.23947004
1994-01-18  0.699126287        1.47016108
1994-01-25  0.420702924       -5.33040334
1994-02-01 -0.034364270        1.03293264
1994-02-08  0.178692037        7.82838866
1994-02-15 -0.763248370       -4.77697912
1994-02-22  0.834492829       -4.39362728
1994-03-01 -0.319921226       -5.77740405
1994-03-08  0.428953703        6.08511360
1994-03-15  0.293429418        3.33746465
1994-03-22  0.212134960        8.28017661
1994-03-29  0.015434084       -3.84063215
1994-04-05 -0.819101724      -12.37463070
1994-04-12 -0.241797314       -6.35085378
1994-04-19 -0.110199011       -0.52875672
1994-04-26  0.292877781        8.25096506
1994-05-03  0.769091229        8.93992009
1994-05-10  1.335381796       -3.14410505
1994-05-17  1.218414811       -6.21162283
1994-05-24  0.915342249       -1.06778949
1994-05-31 -0.899495962        3.65817336
1994-06-07 -0.033983195       -4.82433827
1994-06-14  1.561915814        4.23026763
1994-06-21 -0.066335053       -1.33347133
1994-06-28 -0.385756193       -5.93684313
1994-07-05  0.991119380       -5.67886504
1994-07-12 -0.398792460       -8.50476149
1994-07-19  0.232917128        6.45658144
1994-07-26  0.382266943       -9.35866297
1994-08-02 -1.190583357       -0.50190110
1994-08-09  0.364675303       -5.42680501
1994-08-16 -0.070737617        1.57683454
1994-08-23 -0.167435051       -3.93086516
1994-08-30  0.928324476        5.51263721

VARexample <- VAR(example, p = 5, type = "const")
lmexample <- lm(example$Price[6:100] ~ example$Price[5:99] + example$Price[4:98] + example$Price[3:97] + example$Price[2:96] + example$Price[1:95] + example$Open_Interest_All[5:99] + example$Open_Interest_All[4:98] + example$Open_Interest_All[3:97] + example$Open_Interest_All[2:96] + example$Open_Interest_All[1:95])
dynlmexample <- dynlm(example$Price ~ L(example$Price, 1:5) + L(example$Open_Interest_All, 1:5))

VARcoeffsexample <- as.data.frame(coef(VARexample)$Price[c(11,1,3,5,7,9,2,4,6,8,10),1])
lmcoeffsexample <- as.data.frame(coef(lmexample))
dynlmcoeffsexample <- as.data.frame(coef(dynlmexample))

compare <- data.frame(VARcoeffsexample[,1], lmcoeffsexample[,1], dynlmcoeffsexample[,1])
rownames(compare) <- rownames(VARcoeffsexample)

Output:

> compare
                     VARcoeffsexample...1. lmcoeffsexample...1. dynlmcoeffsexample...1.
const                         0.0162815782         0.0162815782             0.016408246
Price.l1                     -0.0233716524        -0.0233716524            -0.062759716
Price.l2                      0.1262282115         0.1262282115             0.050038501
Price.l3                      0.1444267851         0.1444267851             0.103449318
Price.l4                     -0.0806846153        -0.0806846153             0.005414185
Price.l5                     -0.0197842734        -0.0197842734            -0.048802520
Open_Interest_All.l1         -0.0176241595        -0.0176241595            -0.022186654
Open_Interest_All.l2          0.0166671079         0.0166671079            -0.008759684
Open_Interest_All.l3          0.0205802095         0.0205802095            -0.007958906
Open_Interest_All.l4         -0.0384040280        -0.0384040280            -0.003796186
Open_Interest_All.l5          0.0006342804         0.0006342804            -0.029456337

So lm() and VAR() give the same coefficients, but dynlm() gives different ones...

Anna
    If you make lags manually (e.g. using the function embed), you can use them in lm. (This does not answer your question, so I am only posting this as a comment.) – Richard Hardy Mar 01 '21 at 08:51
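The suggestion in the comment above can be sketched as follows. This is only a minimal illustration on simulated data: the names `lags`, `fit_embed` and `fit_manual`, the seed, and the lag order of 3 are assumptions made for the example and are not taken from the question.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 100
x <- rnorm(n)                    # simulated stand-ins for the two series
y <- rnorm(n)
p <- 3                           # lag order chosen for illustration

# embed(x, p + 1) has n - p rows; column 1 is x_t, column k + 1 is x_{t-k}
ex <- embed(x, p + 1)
ey <- embed(y, p + 1)

# Response in the first column, then the lags of x, then the lags of y
lags <- data.frame(x = ex[, 1], ex[, -1], ey[, -1])
names(lags) <- c("x", paste0("x.l", 1:p), paste0("y.l", 1:p))

fit_embed <- lm(x ~ ., data = lags)

# The same regression with hand-written index ranges
fit_manual <- lm(x[4:100] ~ x[3:99] + x[2:98] + x[1:97] +
                   y[3:99] + y[2:98] + y[1:97])

# Compare the estimates only; the coefficient names differ
all.equal(unname(coef(fit_embed)), unname(coef(fit_manual)))
```

Building the lag matrix once with embed() avoids retyping the index ranges for every lag, which is where off-by-one mistakes tend to creep in.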

1 Answer


Without seeing your code, it is hard to pin down why the results differ. But it is certainly possible to get the same results from all three functions, since, as you correctly point out, all three commands ultimately just run OLS regressions.

They do so with different degrees of convenience, though, reflecting the purpose of each package. lm is, of course, for all sorts of regressions, while the other two are designed for time series regressions, and vars for multivariate ones in particular.

Here is an example.

library(dynlm)
library(vars)

x <- ts(rnorm(100)) # ts is relevant for dynlm, see discussion in comments below!
y <- ts(rnorm(100))

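# at a glance: stack the coefficients of both equations for each approach
b_dynlm <- c(coef(dynlm(x ~ L(x, 1:3) + L(y, 1:3))), coef(dynlm(y ~ L(x, 1:3) + L(y, 1:3))))
var_coefs <- coef(VAR(cbind(x,y), p = 3, type = "const"))
b_var <- c(var_coefs$x[c(7,1,3,5,2,4,6),1], var_coefs$y[c(7,1,3,5,2,4,6),1])
b_lm <- c(coef(lm(x[4:100]~x[3:99]+x[2:98]+x[1:97]+y[3:99]+y[2:98]+y[1:97])), coef(lm(y[4:100]~x[3:99]+x[2:98]+x[1:97]+y[3:99]+y[2:98]+y[1:97])))
# all.equal() compares one pair of objects per call, so check two pairs
all.equal(b_dynlm, b_var, check.attributes = FALSE)
all.equal(b_dynlm, b_lm, check.attributes = FALSE)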

dynlm(x ~ L(x, 1:3) + L(y, 1:3))
dynlm(y ~ L(x, 1:3) + L(y, 1:3))

VAR(cbind(x,y), p = 3, type = "const")

lm(x[4:100]~x[3:99]+x[2:98]+x[1:97]+y[3:99]+y[2:98]+y[1:97])
lm(y[4:100]~x[3:99]+x[2:98]+x[1:97]+y[3:99]+y[2:98]+y[1:97])

Output:

> dynlm(x ~ L(x, 1:3) + L(y, 1:3))

Time series regression with "ts" data:
Start = 4, End = 100

Call:
dynlm(formula = x ~ L(x, 1:3) + L(y, 1:3))

Coefficients:
(Intercept)   L(x, 1:3)1   L(x, 1:3)2   L(x, 1:3)3   L(y, 1:3)1   L(y, 1:3)2   L(y, 1:3)3
   -0.14797     -0.13608      0.04310     -0.14119      0.03736     -0.20556     -0.07980

> dynlm(y ~ L(x, 1:3) + L(y, 1:3))

Time series regression with "ts" data:
Start = 4, End = 100

Call:
dynlm(formula = y ~ L(x, 1:3) + L(y, 1:3))

Coefficients:
(Intercept)   L(x, 1:3)1   L(x, 1:3)2   L(x, 1:3)3   L(y, 1:3)1   L(y, 1:3)2   L(y, 1:3)3
   0.001093     0.008268     0.101429    -0.122984     0.039118     0.060185    -0.194614

> VAR(cbind(x,y), p = 3, type = "const")

VAR Estimation Results:

Estimated coefficients for equation x:

Call:
x = x.l1 + y.l1 + x.l2 + y.l2 + x.l3 + y.l3 + const

       x.l1        y.l1        x.l2        y.l2        x.l3        y.l3       const
-0.13608446  0.03735653  0.04310129 -0.20555950 -0.14119156 -0.07980048 -0.14797419

Estimated coefficients for equation y:

Call:
y = x.l1 + y.l1 + x.l2 + y.l2 + x.l3 + y.l3 + const

        x.l1         y.l1         x.l2         y.l2         x.l3         y.l3        const
 0.008267836  0.039117666  0.101428691  0.060184617 -0.122984226 -0.194613595  0.001093310

> lm(x[4:100]~x[3:99]+x[2:98]+x[1:97]+y[3:99]+y[2:98]+y[1:97])

Call:
lm(formula = x[4:100] ~ x[3:99] + x[2:98] + x[1:97] + y[3:99] + y[2:98] + y[1:97])

Coefficients:
(Intercept)    x[3:99]    x[2:98]    x[1:97]    y[3:99]    y[2:98]    y[1:97]
   -0.14797   -0.13608    0.04310   -0.14119    0.03736   -0.20556   -0.07980

> lm(y[4:100]~x[3:99]+x[2:98]+x[1:97]+y[3:99]+y[2:98]+y[1:97])

Call:
lm(formula = y[4:100] ~ x[3:99] + x[2:98] + x[1:97] + y[3:99] + y[2:98] + y[1:97])

Coefficients:
(Intercept)    x[3:99]    x[2:98]    x[1:97]    y[3:99]    y[2:98]    y[1:97]
   0.001093   0.008268   0.101429  -0.122984   0.039118   0.060185  -0.194614
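The index vector `c(7,1,3,5,2,4,6)` used above is needed because, as the printed output shows, VAR() lists its terms lag by lag (x.l1, y.l1, x.l2, ...) with const last, while the dynlm formula groups all lags of x before all lags of y. A possible alternative is to reorder by name rather than by position. The sketch below assumes the same setup as the example; the seed and the names `var_x`, `dyn_x` and `wanted` are made up for illustration.

```r
library(dynlm)
library(vars)

set.seed(42)                     # arbitrary seed, not part of the answer
x <- ts(rnorm(100))
y <- ts(rnorm(100))

# coef() on a VAR fit returns one coefficient matrix per equation;
# the "Estimate" column holds the OLS point estimates
var_x <- coef(VAR(cbind(x, y), p = 3, type = "const"))$x[, "Estimate"]

# Order wanted: intercept, then all lags of x, then all lags of y,
# which is the order produced by the dynlm formula below
wanted <- c("const", paste0("x.l", 1:3), paste0("y.l", 1:3))
var_x <- var_x[wanted]

dyn_x <- coef(dynlm(x ~ L(x, 1:3) + L(y, 1:3)))

# Compare the values only; the two packages label the terms differently
all.equal(unname(var_x), unname(dyn_x))
```

Selecting by name keeps working when the lag order or the number of variables changes, whereas a positional index has to be rebuilt by hand each time.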

  • I added the data. You helped me with the lm() function and I now get the same results for VAR() and lm(), but not for dynlm()... – Anna Mar 02 '21 at 10:07
  • This is probably because you did not declare the variables as ts objects. Something like this works for me:

    library(vars)
    library(dynlm)

    my_data <- read.table(file = "clipboard", sep = "\t", header = FALSE, dec = ",")
    example <- data.frame(my_data)
    example$Price <- ts(example$V1)
    example$Open_Interest_All <- ts(example$V2)
    example$V1 <- NULL
    example$V2 <- NULL

    VARexample <- VAR(example, p = 5, type = "const")
    dynlmexample <- dynlm(example$Price ~ L(example$Price, 1:5) + L(example$Open_Interest_All, 1:5))

    – Christoph Hanck Mar 02 '21 at 10:26
  • I needed to create two separate variables (`x <- xts_to_ts(example$Price)` and `y <- xts_to_ts(example$Open_Interest_All)`); then it worked and all three functions produced the same coefficients. Interesting that the function lm() doesn't know we have a ts when str(example) is an xts object. Thanks for the help! – Anna Mar 02 '21 at 11:03
  • @ChristophHanck, this is off topic here, but I have been struggling with some time series questions that you might be able to answer. So if you find the time and interest, here is the last one, and some other ones are linked in it. Thank you! – Richard Hardy Mar 04 '21 at 13:59