1

I have a dataset of temperatures spanning a few decades. The data look much like the image, with clear seasonality.

[Image: temperature data]

The question is whether it is possible to draw conclusions about how the temperature is changing from simple linear regression on such data. If not, what models are preferable to figure out if (and how) the temperature is changing over time?

muffin

2 Answers

1

You could try linear regression with three explanatory variables: time, $\sin(c \times \text{time})$ and $\cos(c \times \text{time})$, where $c$ is a constant chosen to match the period of the seasonality (for example, $c = 2\pi/12$ if time is measured in months and the cycle is annual). The regression coefficients for the sine and cosine terms should then account for the seasonality, and the coefficient for the linear term would show the long-term trend.
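As a rough illustration of the regression described above, here is a minimal sketch in R. The data are simulated, and every name in it (`temp`, `time`, `c_const`, `true_trend`) is an assumption made for this example rather than anything from the question.

```r
# Simulated monthly temperatures (hypothetical data, for illustration only)
set.seed(42)
n_months   <- 360                 # 30 years of monthly observations
time       <- seq_len(n_months)   # time index in months
true_trend <- 0.002               # assumed warming in degrees per month

temp <- 10 + true_trend * time +
  8 * sin(2 * pi * time / 12 + 0.5) +  # annual cycle with an arbitrary phase
  rnorm(n_months, sd = 1)              # observation noise

# c is chosen so that one full cycle spans 12 months
c_const <- 2 * pi / 12

# Trend plus one sine/cosine pair for the seasonal cycle
fit <- lm(temp ~ time + sin(c_const * time) + cos(c_const * time))

# The coefficient on `time` estimates the long-term trend per month
trend_per_month  <- unname(coef(fit)["time"])
trend_per_decade <- trend_per_month * 120

print(summary(fit)$coefficients)
```

Using both a sine and a cosine term lets the fit pick up a seasonal cycle with any phase. With real temperature data the residuals are usually autocorrelated, so the standard errors reported by `lm` are likely to be too optimistic.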

Alternatively (and this is what climatologists would probably do) you could just look at the baselined dataset. Pick a baseline period of (say) 30 years and compute the average for each calendar month over that period. Then subtract that average from the values for the corresponding months in the data and plot/analyse the resulting anomalies. That will remove much of the seasonality in the data.
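The baselining steps could be sketched in R roughly as follows. The series is simulated, and the choice of the first 30 years as the baseline, along with the variable names, is an assumption for illustration only.

```r
# Hypothetical monthly series: 40 years, starting in January of year 1
set.seed(1)
n_years <- 40
month <- rep(1:12, times = n_years)
year  <- rep(seq_len(n_years), each = 12)

temp <- 10 + 0.02 * year +            # assumed slow warming
  8 * sin(2 * pi * month / 12) +      # seasonal cycle
  rnorm(length(month), sd = 1)        # observation noise

# Baseline period: the first 30 years (an arbitrary choice for this sketch)
in_baseline <- year <= 30

# Average temperature for each calendar month over the baseline period
monthly_mean <- tapply(temp[in_baseline], month[in_baseline], mean)

# Subtract the matching monthly average from every observation
anomaly <- temp - as.numeric(monthly_mean[month])

# With the seasonal cycle largely removed, regress the anomalies on time
time <- seq_along(temp)
fit  <- lm(anomaly ~ time)
print(coef(fit))
```

By construction the anomalies average to zero for each calendar month within the baseline period, so what remains is mostly the trend plus noise.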

Dikran Marsupial
0

Since you want to draw conclusions about how the temperature is changing and you don't have any other predictor variables, one option is to use the embed function in R. For example:

vec = seq(1,90,1)

plot(vec,sin(vec),type = "l")

df = data.frame(vec, sin(vec))

The code above generates a similar line plot, with a cycle that repeats roughly every 6 data points (the period of sin is 2π ≈ 6.28).

sin_embed = embed(df[,2],8)

The embed function creates lagged copies of a vector. It takes the vector and a number (the embedding dimension) that specifies how many columns to generate. In the resulting matrix the first column holds the most recent value x[t] and column k holds x[t-k+1], so the last column is the series starting from its first observation. Since the seasonality repeats about every 6 data points, I select columns 2 and 8 of sin_embed, which are six steps apart, and use column 2 as the predictor and column 8 as the response.

You can do something similar with your dataset. On this noise-free toy example it predicts extremely well.
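To make the layout of the embedded matrix concrete, here is a small sketch using the same sine series. The variable names are hypothetical. Note that it predicts the later value from the earlier one, which swaps the column roles described above; that direction is an assumption about what is wanted for forecasting.

```r
# Same toy series as above: a noise-free sine wave
x <- sin(seq(1, 90, 1))

# embed(x, 8): each row is x[t], x[t-1], ..., x[t-7]
m <- embed(x, 8)
print(dim(m))        # number of rows is length(x) - 8 + 1

# Columns 2 and 8 are six steps apart, close to the period of about 6.28
later   <- m[, 2]    # x[t-1]
earlier <- m[, 8]    # x[t-7]

# Regress the later value on the value six steps before it
fit <- lm(later ~ earlier)
r2  <- summary(fit)$r.squared
print(r2)
```

The fit is strong here because the series is a pure sine wave with no noise and no trend; real temperature data would not be expected to behave this cleanly.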