I am using R. I am trying to use the "lines" command in ggplot2 to show the predicted values vs. the actual values for a statistical model (ARIMA, time series). Yet when I run the code, I can only see a line of one color.

I simulated some data in R and then tried to make plots that show actual vs predicted:

#set seed
set.seed(123)

#load libraries
library(xts)
library(stats)


#create data

date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")

date_decision_made <- format(as.Date(date_decision_made), "%Y/%m/%d")

property_damages_in_dollars <- rnorm(731,100,10)

final_data <- data.frame(date_decision_made, property_damages_in_dollars)


#aggregate
y.mon<-aggregate(property_damages_in_dollars~format(as.Date(date_decision_made),
                                                    format="%W-%y"),data=final_data, FUN=sum)

y.mon$week = y.mon$`format(as.Date(date_decision_made), format = "%W-%y")`

ts = ts(y.mon$property_damages_in_dollars, start = c(2014,1), frequency = 12)

#statistical model
fit = arima(ts, order = c(4, 1, 1))

Here are my attempts at plotting the graphs:

#first attempt at plotting (no second line?)
 plot(fit$residuals, col="red")
 lines(fitted(fit),col="blue")

#second attempt at plotting (no second line?)

par(mfrow = c(2,1),
    oma = c(0,0,0,0), 
    mar = c(2,4,1,1))
plot(ts,  main="as-is") # plot original sim
lines(fitted(fit), col = "red") # plot fitted values
legend("topleft", legend = c("original","fitted"), col = c("black","red"),lty = 1)

#third attempt (plot actual, predicted and 5 future values - here, the actual and future values show up, but not the predicted)

pred = predict(fit, n.ahead = 5)
ts.plot(ts, pred$pred, lty = c(1,3), col=c(5,2))

However, none of these seem to be working correctly. Could someone please tell me what I am doing wrong? (Note: the computer I am using for my work does not have an internet connection or a USB port. It only has R with some preloaded packages, and I do not have access to the forecast package.)

Thanks


Sources:

Konrad Rudolph
stats_noob

1 Answer

You seem to be confusing a couple of things:

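  1. fitted does not work on an object returned by stats::arima. Base R has no fitted method for this class, so the call falls back to the default method, which looks for stored fitted values that an arima object does not contain, and returns NULL. Usually you can load the forecast package first, which supplies such a method, and then use fitted. But since you do not have access to the forecast package, fitted(fit) returns NULL for you. I had problems with fitted before.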

  2. You want to compare the actual series (x) to the fitted series (y), yet in your first attempt you plot the residuals (e = x - y) instead of the actual series.

  3. You say you are using ggplot2, but actually you are not: plot and lines are base R graphics functions.

So here is a small example of how to plot the actual series and the fitted series without ggplot2. The fitted values are computed as the actual series minus the residuals.

set.seed(1)

x <- cumsum(rnorm(10))
y <- stats::arima(x, order = c(1, 0, 0))

plot(x, col = "red", type = "l")
lines(x - y$residuals, col = "blue")
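
The same idea can be applied to the data from the question. The following is only a sketch, assuming the weekly series is rebuilt exactly as in the question; the names series, fitted_vals and the column names week and damages are introduced here for illustration and do not appear in the original post. It draws the actual series, the fitted values, and the five forecasts in one plot.

```r
# Rebuild the series from the question (same seed, same aggregation key).
set.seed(123)
dates <- seq(as.Date("2014/1/1"), as.Date("2016/1/1"), by = "day")
final_data <- data.frame(
  week    = format(dates, "%W-%y"),          # same "%W-%y" key as the question
  damages = rnorm(length(dates), 100, 10)
)
y_week <- aggregate(damages ~ week, data = final_data, FUN = sum)
series <- ts(y_week$damages, start = c(2014, 1), frequency = 12)

fit <- arima(series, order = c(4, 1, 1))

# Without the forecast package, fitted(fit) is expected to be NULL,
# so the fitted values are computed as actual minus residuals.
fitted_vals <- series - residuals(fit)

# Actual vs fitted (second attempt in the question).
plot(series, main = "actual vs fitted")
lines(fitted_vals, col = "red")
legend("topleft", legend = c("original", "fitted"),
       col = c("black", "red"), lty = 1)

# Actual, fitted and 5 future values (third attempt in the question).
pred <- predict(fit, n.ahead = 5)
ts.plot(series, fitted_vals, pred$pred,
        lty = c(1, 1, 3), col = c("black", "red", "blue"))
```

Since residuals(fit) carries the same time index as series, the subtraction keeps the time-series attributes and lines can overlay the result directly.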

I hope this answer helps you get back on track.

Cettt
  • Great answer! Can you please clarify "option 2" in the original post? Why is there only a line of one color? `par(mfrow = c(2,1), oma = c(0,0,0,0), mar = c(2,4,1,1))`, `plot(ts, main="as-is")`, `lines(fitted(fit), col = "red")`, `legend("topleft", legend = c("original","fitted"), col = c("black","red"), lty = 1)` –  Dec 16 '20 at 16:30
  • The second method does not work because `fitted(fit)` returns `NULL`. This is mentioned in the answer. – Cettt Dec 16 '20 at 16:45
  • @Cettt: is there any way to get around this and still use the "fitted" option? – stats_noob Dec 17 '20 at 01:36
  • @stats555 if you want to work with an object returned by `arima` (its class is `Arima`) you cannot use `fitted` out of the box. However, you can define your own `fitted` method, called `fitted.Arima`, by copying the source code of `forecast:::fitted.Arima`. If you do so, you will see that `fitted` of an Arima object is defined as I defined it in my answer. – Cettt Dec 17 '20 at 09:01
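
A minimal sketch of the workaround described in the last comment, for a machine without the forecast package. This is not the forecast source code: it is a simplified, hypothetical method that takes the original series as an explicit argument x (an assumption made here to keep the example self-contained) and returns actual minus residuals. The method is named fitted.Arima because objects returned by stats::arima have class "Arima".

```r
# Hypothetical S3 method (not part of base R, not copied from forecast):
# lets fitted() dispatch on objects returned by stats::arima.
# Assumption: the caller passes the original series as `x`.
fitted.Arima <- function(object, x, ...) {
  x - object$residuals
}

# Same toy data as in the answer.
set.seed(1)
x <- cumsum(rnorm(10))
y <- stats::arima(x, order = c(1, 0, 0))

# fitted() now returns the fitted series for this object.
fit_vals <- fitted(y, x = x)

plot(x, col = "red", type = "l")
lines(fit_vals, col = "blue")
```

With this definition in the workspace, the call in the question would become lines(fitted(fit, x = ts), col = "red").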