I am currently trying to build a mixed-effects model using the lmer() function from the lme4 package.
My goal is to predict the number of units sold at a specific site on a specific date based on independent variables such as the retail price or promotions.
Note: I transform the number of units sold on each day into log(units).
At the moment I am reviewing the forecasts of my model by looking at the generated plots. I observed 2 key things:
(1) I had the impression that the sigma of the model is not the correct one.
(2) I had the impression that the model only takes fixed effects into consideration and neglects the random effects which I specify.
Here are the key lines in the code which I assume to be relevant for my questions:
d_data$units_as_log <- log(d_data$units + 0.001)  # log transformation of units (small offset avoids log(0))

# Specify the model formula (very basic model)
model_vec <- c("units_as_log ~ retail_price + (1|article) + (1|site)")

# Fit on the history: date <= endofhistory defines the model interval
model <- lmer(as.formula(model_vec[ii]),
              data = d_data[d_data$date <= endofhistory & d_data$AVAIL == 1, ],
              REML = FALSE)

# Forecast the holdout period (date > endofhistory); with re.form = NULL I try to include the random effects
fcst <- exp(predict(model, newdata = subdata[subdata$date > endofhistory, ],
                    re.form = NULL) + summary(model)$sigma^2 / 2)
runMSE[ii] <- mean((fcst - actuals)^2, na.rm = TRUE)
Question #1: Is the sigma of summary(model) the correct sigma or do I need to calculate it myself somehow?
I previously read that the sigma function of lme4 was moved to the stats package.
Question #2: Is the transformation from log_units to units with + summary(model)$sigma^2/2 correct?
I discovered several posts on StackExchange which gave different answers, hence my confusion here.
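For reference, the sigma^2/2 term in Question #2 comes from the mean of a log-normal distribution. The following is only a sketch under the assumption that the residuals are normal on the log scale; the symbols Y (units + 0.001), \mu (the prediction on the log scale) and \sigma (the residual standard deviation) are notation introduced here, not taken from the code.

```latex
\log Y \mid x \sim \mathcal{N}(\mu, \sigma^2)
\;\Longrightarrow\;
\mathbb{E}[Y \mid x] = \exp\!\left(\mu + \tfrac{\sigma^2}{2}\right),
\qquad
\operatorname{median}(Y \mid x) = \exp(\mu)

% With the offset used above, Y = \text{units} + 0.001, so
\mathbb{E}[\text{units} \mid x] = \exp\!\left(\mu + \tfrac{\sigma^2}{2}\right) - 0.001
```

Under this assumption, exp() of the log-scale prediction alone estimates the median, which is always below the mean, so forecasts without the correction would be expected to sit lower than forecasts with it.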
Question #3: Do my predictions already include the random effects I specified, or do I need to calculate the results myself by building custom matrices?
I previously found posts saying lme4 has no predict() function (as of 2014). Nevertheless, I found that the predict() generic from the stats package can handle lmer objects and includes the random effects with re.form = NULL.
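A minimal sketch of how Questions #1 and #3 could be checked. It uses the sleepstudy data that ships with lme4 as a stand-in for the sales data, and the names fit, p_cond and p_pop are hypothetical. It assumes lme4 is installed and only illustrates the behaviour of sigma() and predict() on a fitted lmer model.

```r
library(lme4)

# Random-intercept model on lme4's built-in sleepstudy data (stand-in for the sales data)
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy, REML = FALSE)

# Residual standard deviation: the sigma() generic and summary()$sigma report the same value
sigma(fit)
summary(fit)$sigma

# re.form = NULL conditions on all random effects; re.form = NA uses the fixed effects only
p_cond <- predict(fit, newdata = sleepstudy, re.form = NULL)
p_pop  <- predict(fit, newdata = sleepstudy, re.form = NA)

# If the random effects are being used, the two sets of predictions differ
summary(p_cond - p_pop)
```

If newdata contains articles or sites that were not in the fitting data, predict() needs allow.new.levels = TRUE; those rows then receive fixed-effects-only predictions.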
Regarding Questions 1 & 2:
The forecasts (plots) I reviewed looked too high when I included the sigma^2 / 2 term and extremely low when I left it out.
Additionally, I was confused because I could not call lme4::predict() or lme4::sigma() directly; I did not know that both generics live in the {stats} package and dispatch to the lme4 methods once they recognize the model as a mixed-effects model.
– credential May 31 '17 at 09:05
Hence my confusion about whether random effects are considered in the model.
Please note that I overcame all 3 problems by changing my model from a simple random-intercept model to a more complex one including random slopes.
– credential May 31 '17 at 09:12
Again, your time and help are very much appreciated. Please note that I followed your advice and subscribed to r-sig-mixed-models@r-project.org.
– credential May 31 '17 at 09:52