I am forecasting customer spending from a time series data set. I tried to build two models:

  • ARIMA
  • regression

With the ARIMA model, I was able to forecast customer spending on normal days quite accurately. However, both in my data set and in reality, spending on the first day of every month is always significantly higher than on other days because of a monthly promotion. So far, the ARIMA model has not been able to forecast spending on the first day of the month accurately. The regression model predicted spending on the first day of the month more accurately than the ARIMA model did.

However, with an $R^2$ of only 50%, the regression predictions for the other days fluctuate a lot. Can you please advise me on methods I can use to improve my models, or on other models that can predict spending accurately on both normal days and the first day of the month? Here is a small part of my data: [image]

  • Use the xreg option of arima to add a binary indicator for the first day of the month. – Harlan Nelson Apr 13 '18 at 03:29
  • Hi Nelson, I have been trying your method and it did work. However, something still went wrong, so the results are not quite right. Would you mind advising further on my case? Thank you very much! – Nguyen Ba Viet Apr 16 '18 at 10:21
  • @Harlan Nelson: I used the xreg option as you suggested and ended up with this code:

    library(MASS)
    library(timeSeries)
    library(forecast)
    df <- read.csv("C:/Users/ADMIN_VP/Desktop/ts-tdg.csv", header = TRUE)  # file updated daily
    tstdg <- ts(df$TDG)
    fit <- auto.arima(tstdg, xreg = df$S_F_day_ofmon)
    fit
    forecast(fit, xreg = df$S_F_day_ofmon)

    There are 2 problems here: 1. I only want to forecast the next 5 days, but the forecast function only works if I pass xreg as above – Nguyen Ba Viet Apr 17 '18 at 01:54
  • The code seemed to work when my data ended on the last day of a month (so that the next forecast is for the first day of the month). However, when I deleted some rows to test whether it forecasts well on other days, the first forecast still looks like a first-day-of-month forecast (much higher than spending on other days). I also tried to add seasonality to my model, but had to cut every month down to 28 days because months have different lengths (28, 29, 30, and 31 days). Can you please advise me on how I can improve my model? Many thanks! – Nguyen Ba Viet Apr 17 '18 at 01:57
  • To share the data, use e.g. http://www.sharecsv.com/ – Jim Apr 19 '18 at 18:30
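
A possible reading of the two symptoms above: when xreg is supplied, forecast() ignores h and produces one forecast per row of xreg, so passing the historical column forecasts as many steps as there are rows of history and reuses the historical indicator values in order. If the data happen to start on a first day of the month, the first forecast would then be treated as a promotion day. The sketch below builds the indicator for the actual next 5 calendar days instead. The data are synthetic and the names dates, first_day, spend and future_first_day are made up for illustration; they stand in for the contents of ts-tdg.csv, which is not shown in the thread.

```r
library(forecast)

# Synthetic daily spending (an assumption, not the poster's data):
# a flat level with a jump on the first day of each month.
set.seed(1)
dates     <- seq(as.Date("2018-01-01"), as.Date("2018-04-28"), by = "day")
first_day <- as.numeric(format(dates, "%d") == "01")  # 1 on the 1st, else 0
spend     <- 100 + 80 * first_day + rnorm(length(dates), sd = 5)

# Fit ARIMA errors with the first-day indicator as an external regressor.
fit <- auto.arima(ts(spend), xreg = first_day)

# Build the indicator for the NEXT 5 calendar days after the last observation.
h                <- 5
future_dates     <- seq(max(dates) + 1, by = "day", length.out = h)
future_first_day <- as.numeric(format(future_dates, "%d") == "01")

# The forecast horizon is taken from the length of the future regressor.
fc <- forecast(fit, xreg = future_first_day)
print(data.frame(date = future_dates,
                 first_day = future_first_day,
                 forecast = as.numeric(fc$mean)))
```

Because the indicator is derived from calendar dates, it also avoids trimming every month to 28 days: the promotion effect is carried by the regressor rather than by a fixed seasonal period.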