
I don't understand how time series objects are created in R. I have the data `data = c(101,99,97,95,93,91,89,87,85,83,81)` (a smaller dataset for the sake of brevity), recorded once a day for 11 days, from 2016-07-05 to 2016-07-15. According to the docs, the frequency for data sampled daily should be 7, but I do not understand the values for the `start` and `end` parameters. For `start`, the docs say: "the time of the first observation. Either a single number or a vector of two integers, which specify a natural time unit and a (1-based) number of samples into the time unit". I do not understand what "1-based number of samples" means. I tried to google it, but that didn't help.

If I use 2016,7 as both the start and the end date, I get:

Time Series:
Start = c(2016, 7) 
End = c(2016, 7) 
Frequency = 7 
[1] 101

If I use 2016,7,1 and 2016,7,11 as the start and end date, I still get the same output.

What am I doing wrong?

Rahul Sharma

2 Answers


I think the best option is to switch to xts or zoo. According to another question here, ts() struggles with daily observations because the number of days varies between years.

hannes101
  • I am using the time series for forecasting. I tried `xts`, and it keeps the data in the format I expected (a timestamp and the value for that timestamp). But the output of calling `forecast` on the xts object is a `ts` object that no longer contains those timestamps; I just see the values. – Rahul Sharma Aug 02 '16 at 08:32
  • The only way to fix this is to add the dates back to the ts object manually (see http://stackoverflow.com/a/10347205/5795592). I don't know whether it would be easier to just use a data.frame with a Date column. – hannes101 Aug 03 '16 at 07:28
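
The data.frame idea from the comment above can be sketched in base R. This is only an illustration: it assumes the forecast values are available as a plain numeric vector, and `fc_values` is made-up placeholder data, not output from `forecast`.

```r
data  <- c(101, 99, 97, 95, 93, 91, 89, 87, 85, 83, 81)
dates <- seq(as.Date("2016-07-05"), by = "day", length.out = length(data))

# Observations kept alongside their calendar dates
obs <- data.frame(date = dates, value = data)

# Hypothetical forecast values (placeholders for illustration only)
fc_values <- c(79, 77, 75)

# Forecast dates continue one day after the last observed date
fc_dates <- seq(max(dates) + 1, by = "day", length.out = length(fc_values))
fc <- data.frame(date = fc_dates, value = fc_values)
```

Because the dates live in an ordinary column, they survive any step that would otherwise drop the time index, at the cost of rebuilding them by hand for the forecast horizon.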

As I understand it, the natural time unit in ts() is the year. With daily data, the frequency should therefore be set to 365 (days per year), and start and end should be given in days as well. I believe that, to get the timing right, start and end should be the number of days between the beginning of the year and the first and last observation (in your specific case, 186 and 196 respectively). These numbers can be checked with:

as.numeric(as.Date("2016-07-05") - as.Date("2016-01-01"))
[1] 186
as.numeric(as.Date("2016-07-15") - as.Date("2016-01-01"))
[1] 196
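
One caveat may be worth checking here: the subtraction above counts days elapsed since 1 January, which is zero-based, whereas the docs describe the second element of `start` as 1-based. A minimal sketch, assuming the 1-based day of the year is what is wanted, uses the `%j` format code (the variable names are only illustrative):

```r
first_day <- as.Date("2016-07-05")
last_day  <- as.Date("2016-07-15")

# Zero-based offset: days elapsed since 1 January
offset_first <- as.numeric(first_day - as.Date("2016-01-01"))

# One-based day of the year ("%j" gives 001-366 as text)
doy_first <- as.integer(format(first_day, "%j"))
doy_last  <- as.integer(format(last_day, "%j"))

# The two conventions differ by exactly one
stopifnot(doy_first == offset_first + 1)
```

Under that assumption the positions would be 187 and 197 rather than 186 and 196. The series has 11 observations either way; only the implied calendar alignment shifts by one day.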

Embedding this information in your code, the call to ts() becomes:

data = c(101,99,97,95,93,91,89,87,85,83,81)
ts(data, start = c(2016, 186), end = c(2016, 196), frequency = 365)
# which yielded
Time Series:
Start = c(2016, 186) 
End = c(2016, 196) 
Frequency = 365 
 [1] 101  99  97  95  93  91  89  87  85  83  81
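
To see what `start`, `end`, and `frequency` encode, it can help to inspect the time index that `ts()` builds. The sketch below is illustrative only (the object names `one_obs` and `daily` are made up). It also shows why passing the same value for `start` and `end`, as in the question, leaves a single observation: `end` cuts the series off.

```r
data <- c(101, 99, 97, 95, 93, 91, 89, 87, 85, 83, 81)

# Same start and end: the series is cut to one observation
one_obs <- ts(data, start = c(2016, 7), end = c(2016, 7), frequency = 7)
length(one_obs)   # 1

# Omitting end lets ts() derive it from the length of the data
daily <- ts(data, start = c(2016, 186), frequency = 365)
length(daily)     # 11
end(daily)        # 2016 196

# time() gives each observation's position in units of a year:
# year + (position - 1) / frequency
time(daily)[1]    # 2016 + 185/365
```

Leaving out `end` is often the simpler choice, since the end point then follows from `start`, `frequency`, and the number of values.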

HTH

HelloWorld