What performance indices are best for comparing two time series of different lengths? Can you suggest a method to do the comparison in R?

Question

I have two data sets (observed and simulated). The observed data set is the snow depth measured at a location. The simulated data set is the snow depth produced by a model. The two data sets have different lengths. What is the best method to compare them in R? I have hourly data for 30 years, and the series has several peaks. A sample of the data is shown here

Does this answer your question? What performance indices are best to compare two time series with different data length . Can you suggest a method to do the comparison in R/Origin — Ryan Volpi, Sep 01 '21 at 16:07

Comte · Accepted Answer · 2021-09-01T16:31:19.057

The first thing I would say is that if you are comparing the two series, it is probably best to put them on the same time scale, and preferably to make them the same length. In your example, one time series seems to have decimals and the other does not; I would personally round these up or down, as it shouldn't make a huge difference (the following link shows how to wrangle time series in R).
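As a rough sketch of that alignment step in base R: the data frames `obs` and `sim`, their `time` and `depth` columns, and the helper `round_to_hour()` are all made up for illustration, and I am assuming the decimals are in the timestamps and that the times are stored as POSIXct. Adapt the names to your own data.

```r
# Hypothetical data: observations on the hour, simulation at fractional hours.
start <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")
obs <- data.frame(
  time  = start + 3600 * 0:5,
  depth = c(10, 12, 15, 14, 13, 11)
)
sim <- data.frame(
  time  = start + 3600 * c(0.1, 1.2, 1.9, 3.05, 4.4),
  depth = c(9.5, 11.8, 14.2, 13.9, 12.5)
)

# Hypothetical helper: round a POSIXct vector to the nearest whole hour.
round_to_hour <- function(t) {
  as.POSIXct(round(as.numeric(t) / 3600) * 3600,
             origin = "1970-01-01", tz = "UTC")
}

obs$time <- round_to_hour(obs$time)
sim$time <- round_to_hour(sim$time)

# Average any values that fall into the same hour after rounding.
obs_h <- aggregate(depth ~ time, data = obs, FUN = mean)
sim_h <- aggregate(depth ~ time, data = sim, FUN = mean)

# Keep only the hours that are present in both series.
both <- merge(obs_h, sim_h, by = "time", suffixes = c("_obs", "_sim"))
print(both)
```

After the merge, `both` has one row per shared hour, so the two depth columns have the same length and can be compared directly.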

Secondly, depending on how much data is missing, you could pad the missing times and use imputation to fill the gaps. Imputation is not straightforward, and it is worth taking the time to read up on it before you start: techniques vary depending on the amount of missing data and the structure of the data.
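To make the padding idea concrete, here is a minimal base R sketch. The data frame `obs` and its columns are hypothetical, and linear interpolation via `approx()` is only one simple choice of imputation that I picked for illustration; it may not suit long gaps or sharp snow-depth peaks.

```r
# Hypothetical hourly series in which hours 2 and 3 are missing.
start <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")
obs <- data.frame(
  time  = start + 3600 * c(0, 1, 4, 5),
  depth = c(10, 12, 18, 17)
)

# Pad: build the complete hourly grid and left-join the data onto it,
# so that missing hours appear as rows with NA depth.
grid   <- data.frame(time = seq(min(obs$time), max(obs$time), by = "hour"))
padded <- merge(grid, obs, by = "time", all.x = TRUE)

# Impute: linear interpolation between the nearest observed neighbours.
known <- !is.na(padded$depth)
padded$depth_filled <- approx(
  x    = as.numeric(padded$time[known]),
  y    = padded$depth[known],
  xout = as.numeric(padded$time)
)$y

print(padded)
```

Keeping the original `depth` column next to `depth_filled` makes it easy to see afterwards which values were observed and which were imputed.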

Once you have done all this, you can run a Granger causality test, which tells you whether past values of one time series help to predict another; I think that serves the purpose here. In R, the lmtest package provides the function grangertest(); you can follow this tutorial if you use R.
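A minimal sketch of calling grangertest() from lmtest is below. The series are synthetic and the names `obs`, `sim` and `df` are placeholders, not your data; the lag order of 2 is an arbitrary choice for illustration. With the formula `obs ~ sim`, the test asks whether lagged values of `sim` help to predict `obs`.

```r
library(lmtest)

set.seed(1)
n <- 200

# Synthetic "simulated" series: a stationary AR(1) process.
sim <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))

# Synthetic "observed" series that follows sim with a one-step lag plus noise.
obs <- c(0, 0.8 * sim[-n]) + rnorm(n, sd = 0.5)

df <- data.frame(obs = obs, sim = sim)

# Does sim Granger-cause obs, using up to 2 lags?
gt <- grangertest(obs ~ sim, order = 2, data = df)
print(gt)

# The p-value of the test sits in the second row of the returned table.
p_value <- gt[["Pr(>F)"]][2]
```

The Granger test is usually applied to stationary series, so strongly seasonal data such as snow depth may need differencing or deseasonalising first; treat that as something to check rather than a fixed recipe.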

Note: your data are unlikely to be independent over time, so I would recommend avoiding a vanilla t-test.
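One quick way to see the dependence is to look at the autocorrelation. This is a small sketch on a synthetic series; `depth` is a stand-in for your hourly snow depth, not real data.

```r
set.seed(42)

# Synthetic, strongly autocorrelated series standing in for hourly snow depth.
depth <- as.numeric(arima.sim(model = list(ar = 0.9), n = 500))

# Sample autocorrelation; element 1 of $acf is lag 0 (always 1),
# so element 2 is the lag-1 autocorrelation.
lag1 <- acf(depth, lag.max = 1, plot = FALSE)$acf[2]
print(lag1)
```

A lag-1 autocorrelation far from zero means that consecutive hours are not independent, which is exactly the assumption a plain t-test relies on.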
