Trend stationary data series

Question

I have a trend stationary data series that does not have a unit root.

The data are hourly with about five years of data.

I have controlled for the apparent trend in the data using a series of binary variables for each year exclusive of one year to avoid singularity.

A reviewer has essentially suggested that I replace the binary variables with a yearly trend term. Question: Won't this give rise to a unit root issue given that a linear trend term has a unit root?

To be honest, I would be more concerned with the fact that dummy indicators per year model that your response changes drastically every January 1st, then stays on a constant level before changing again drastically the next January 1st, with successive step changes completely unrelated to each other. This makes sense in a regulatory framework (e.g., the tax code changing at the beginning of the year), but most other things change much more smoothly. Compare this. So I would agree with your reviewer. Perhaps look at a local smoother, too. — Stephan Kolassa, Feb 08 '23 at 13:44

score 6 · Accepted Answer · answered Feb 08 '23 at 15:26

A linear trend term has no unit root.

A unit root arises in models like $y_t=y_{t-1}+\epsilon_t$, where the characteristic equation $1-z=0$ has the solution $z=1$. A linear trend model corresponds to something like $$ y_t=\delta t+\epsilon_t, $$ which has no autoregressive part and hence no such root. In any case, whether there is a unit root is a property of the underlying process, not of your fitting procedure. If your process had a unit root, neither fitting a linear trend nor using dummies would be the recommended way to go; you would difference the series instead.
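To make the distinction concrete, here is a minimal sketch in base R that simulates one series of each kind and looks at the residuals after removing a fitted linear trend. The sample size, seed, and the helper name `lag1_after_detrending` are my own choices for illustration. For the trend stationary series the detrended residuals should be close to white noise, whereas for the random walk they should stay highly persistent, since a deterministic trend does not remove a unit root.

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
n     <- 2000                        # illustrative sample size (assumption)
idx   <- 1:n                         # time index
delta <- 0.002                       # slope of the deterministic trend

y_trend <- delta * idx + rnorm(n)    # trend stationary: y_t = delta*t + eps_t
y_rw    <- cumsum(rnorm(n))          # unit root:        y_t = y_{t-1} + eps_t

# Hypothetical helper: lag-1 autocorrelation of the residuals
# left over after regressing the series on a linear time trend.
lag1_after_detrending <- function(y, idx) {
  e <- resid(lm(y ~ idx))
  cor(e[-1], e[-length(e)])
}

rho_trend <- lag1_after_detrending(y_trend, idx)  # expected near 0
rho_rw    <- lag1_after_detrending(y_rw, idx)     # expected near 1
c(trend_stationary = rho_trend, random_walk = rho_rw)
```

This is only an informal check, not a substitute for a proper unit root test.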

I also agree with @Stephan's comment, as what you do seems to amount to what I sketch below: given a positive trend, the yearwise mean would overstate the level at the beginning of each year and understate it towards the end.

n <- 24*365*5                       # hourly data, five years
delta <- .002
y <- delta*(1:n) + rnorm(n, sd=2)   # linear trend plus noise
plot(1:n, y, type="l", lwd=.01)
abline(v=(1:4)*24*365, lty=2)       # year boundaries
year <- rep(1:5, each=24*365)
doyend <- (1:5)*24*365              # last hour of each year
doystart <- c(1, doyend[1:4]+1)     # first hour of each year
means <- sapply(1:5, function(i) mean(y[year==i]))  # yearwise means
segments(doystart, means, doyend, means, lty=1, lwd=4, col="red")
> means
[1]  8.783945 26.244264 43.800528 61.259110 78.873887
> summary(lm(y~factor(year))) # regression-based
Call:
lm(formula = y ~ factor(year))
Residuals:
     Min       1Q   Median       3Q      Max 
-15.9170  -4.4296   0.0005   4.3930  14.8603
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    8.78395    0.05804   151.3   <2e-16 ***
factor(year)2 17.46032    0.08209   212.7   <2e-16 ***
factor(year)3 35.01658    0.08209   426.6   <2e-16 ***
factor(year)4 52.47516    0.08209   639.3   <2e-16 ***
factor(year)5 70.08994    0.08209   853.9   <2e-16 ***
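For comparison, the reviewer's suggestion can be sketched on the same kind of simulated data. This is a self-contained illustration under the assumptions of the simulation in this answer (slope 0.002 per hour, noise standard deviation 2); the seed and the variable names `hour`, `fit_dummy`, and `fit_trend` are mine. The trend fit should recover a residual standard error close to the true noise level, while the yearly dummies leave the within-year drift in the residuals.

```r
set.seed(1)                          # arbitrary seed (assumption)
n     <- 24 * 365 * 5                # hourly data, five years
delta <- 0.002
hour  <- 1:n                         # time index in hours
y     <- delta * hour + rnorm(n, sd = 2)
year  <- rep(1:5, each = 24 * 365)

fit_dummy <- lm(y ~ factor(year))    # step function: one level per year
fit_trend <- lm(y ~ hour)            # a single slope, as the reviewer suggests

sigma_dummy <- summary(fit_dummy)$sigma          # residual standard error
sigma_trend <- summary(fit_trend)$sigma
slope_hat   <- unname(coef(fit_trend)["hour"])   # estimate of delta

c(sigma_dummy = sigma_dummy, sigma_trend = sigma_trend, slope = slope_hat)
```

With real data the trend need not be linear, so a smoother (as suggested in the comment) may be the better choice; the point here is only that the step function wastes information when the level drifts within the year.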
