getting Significant seasonality on a straight line

Question

I'm running a Significant seasonality test on my data with isSeasonal {seastests} function. I'm not sure why I'm getting a TRUE answer for the following data.

library(ggplot2)
library(seastests)
pop_ts = ts(economics$pop,frequency = 12, start = c(1967,7))
isSeasonal(pop_ts)

[1] TRUE

It looks like the population of the United States (in thousands) which is increasing overall except that there are seasonal rhythms to births, deaths and migration. — Nick Cox, May 03 '20 at 15:17
For people without the data from ggplot2, the data comes from https://fred.stlouisfed.org/series/POP — Sextus Empiricus, May 04 '20 at 10:35

score 3 · Answer 1 · answered May 03 '20 at 13:04

The data is seasonal by the looks of things.

I just read the data in and plotted the residuals of a linear model and there is definitely some bump every year (seen most clearly between years 30 and 40). I'm guessing there's some very small seasonal effect on top of the predominantly linear trajectory that you're just not seeing by eyeballing the data. Not sure how small a seasonal effect has to be in order to be detected by that command though.

IrishStat · Answer 2 · 2020-05-04T06:46:07.257

after receiving the data from the dropbox , I have some interesting things to report using AUTOBOX , a time series analusis package that I have helped to develop.

To some it would appear that differencing is the remedy for the non-stationarity BUT not too all ...Is a time trend a substitute for first differencing? discusses the need for incorporating deterministic time trends ...as needed in this example.

It definitely has arima seasonal structure ..... and some short term arima structure AND 5 different trend point changes . AND non-constant error variance requiring weighted least squares following http://docplayer.net/12080848-Outliers-level-shifts-and-variance-changes-in-time-series.html

My 81 year old eye failed to identify that there were 5 trends .... In addition there were seasonal parameters and an error variance change breakpoint . following http://docplayer.net/12080848-Outliers-level-shifts-and-variance-changes-in-time-series.html

The forecast plot is here

The residual ACF suggesting sufficiency is here

Sextus Empiricus · Answer 3 · 2020-05-04T11:24:43.857

The answer from IrishStat shows all the points in the outcomes from the tests. But to my young eye (still an apprentice in time series), these tables with results from tests are not so easy to interpret.

So I made simply a plot of the monthly difference

# plot
plot(diff(economics$pop), type = "l",
     xaxt="n", yaxt = "n", xlab = "", ylim = c(100,350),
     main = "monthly population growth/change", ylab = "net change/month \n per thousand", lwd = 2)
points(diff(economics$pop), pch = 21, col = 1, bg = 1 ,cex = 0.4)

# custom axes
axis(2, at = seq(0,350,50),las = 2)
axis(1,  at = c(0:(2015-1968))*12+7,
     labels = rep("",length(c(0:(2015-1968)))), las = 1)
axis(1,  at = c(1:(length(economics$pop)-1)), tck = -0.02,
     labels = rep("",length(economics$pop)-1), las = 2)
axis(1,  at = c(-1:(2015-1968))*12+7+6, tck = 0,
     labels = c(1967:2015), las = 1,cex.axis=0.8, line = -0.5, lwd=0,las =2)

# lines for the yearly periods and the changes every 10 years 
# (when the estimation method changes base on a new census)
for (i in 0:4) {
  lines(rep(i*120+9,2)+12*2,c(-100,+500),lty=2)
}
for (i in 0:(2015-1968)) {
  lines(rep(i*12+7,2),c(-100,+500),lty=2, col = 8)
}

The plot of the monthly growth clearly shows the periodicity every 12 months and every 10 years. We can also have a reasonable guess why these periods occur. The data are estimates and not raw observations. Every 10 years the estimation method is adjusted based on a new census, and this causes those jumps every 10 years. The 12 monthly pattern is likely due to the death rate which is higher in the winter (which makes the net growth smaller).

When I took a look into the description of the isSeasonal function it seems to me like it is basically fitting an ARIMA model with a low order (to subtract the linear trend) and then looking at the autocorrelation function of the residuals to see if there is a seasonal component.

### ARIMA model (with an order smaller than the freqeuncy to be tested)
mod <- forecast::auto.arima(economics$pop,max.order = 3)

#plotting ARIMA model with data
layout(c(1:2), heights = c(2,1))
window <- c(1,7)
plot(economics$pop, xlim = window*12,ylim = range(economics$pop[window*12]*c(0.99,1.01)),
     xaxt = "n", ylab = "population \n per thousand", main = "observation/estimates + ARIMA fit")
axis(1,  at = c(0:(2015-1968))*12+7,
     labels = rep("",length(c(0:(2015-1968)))), las = 1)
axis(1,  at = c(1:(length(economics$pop)-1)), tck = -0.01,
     labels = rep("",length(economics$pop)-1), las = 2)
axis(1,  at = c(-1:(2015-1968))*12+7+6, tck = 0,
     labels = c(1967:2015), las = 1,cex.axis=1, line = -0.5, lwd=0,las =1)
lines(mod$fitted)

#plotting residuals
plot(mod$residuals,  xlim = window*12,ylim = c(-50,50),
     xaxt = "n", ylab = "difference \n per thousand", main = "residuals observation-fit")
axis(1,  at = c(0:(2015-1968))*12+7,
     labels = rep("",length(c(0:(2015-1968)))), las = 1)
axis(1,  at = c(1:(length(economics$pop)-1)), tck = -0.01,
     labels = rep("",length(economics$pop)-1), las = 2)
axis(1,  at = c(-1:(2015-1968))*12+7+6, tck = 0,
     labels = c(1967:2015), las = 1,cex.axis=1, line = -0.5, lwd=0,las =1)
for (i in 0:(2015-1968)) {
  lines(rep(i*12+7,2),c(-100,+500),lty=2, col = 8)
}

The difference between the ARIMA fit and the time series is very difficult to observe with the naked eye, but a plot of the residuals shows it well.

The autocorrelation function of the residuals acf(mod$residuals) shows the seasonality in the residuals, indicating that there is a yearly pattern. This is obviously an anomaly to the naked eye. The isSeasonal function looks at it by performing all kinds of test to find out whether the anomaly is significant or not.

getting Significant seasonality on a straight line

3 Answers3