I'm comparing three linear mixed-effects models of crime. The first models total crime, so there are ~96,000 observations. The second models crime as a function of crime type: the DV is the count, I include a categorical variable (property/violent), and so I have twice the number of observations (192,000), along with covariates (population, population density, per capita income, education spending). The third adds an interaction between crime type and the covariates, i.e. crime * (population + population density + per capita income + ...), also with 192,000 observations, in order to look at the "effects" of the covariates on the different types of crime.
The first model has an RMSE of 0.077, the second an RMSE of 0.17, and the third an RMSE of 0.193. I expected the RMSE to decrease once crime type is included, on the assumption that the model would then predict crime better; however, that isn't the case. Is it just that the first model is the best predictive model? Also, are AICs comparable if one model has exactly twice as many observations as the other, or is there still no way to compare AICs across models fit to different numbers of observations?
This is the function I'm using to compute RMSE:
# m and o are the two vectors being compared; the RMSE is then divided by n
RMSE <- function(m, o, n) {
  sqrt(mean((m - o)^2)) / n
}
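To make clear what this function returns, here is a minimal sketch on toy values. The vectors `pred` and `obs` are hypothetical and not taken from the crime data, and using the range of the observed values as `n` is only an assumed example of a normalizer, not necessarily the one used for the numbers above.

```r
# RMSE as defined above: root mean squared error divided by n
RMSE <- function(m, o, n) {
  sqrt(mean((m - o)^2)) / n
}

# Toy values (hypothetical, for illustration only)
pred <- c(1.0, 2.0, 3.0, 4.0)  # assumed: model predictions
obs  <- c(1.5, 2.5, 2.5, 3.5)  # assumed: observed outcomes

# With n = 1 the function reduces to the ordinary, unscaled RMSE
plain_rmse <- RMSE(pred, obs, 1)

# Assumed normalizer: the range of the observed values
scaled_rmse <- RMSE(pred, obs, diff(range(obs)))

print(c(plain = plain_rmse, scaled = scaled_rmse))
```

Because the result is divided by `n`, two values from this function are on the same scale only if they were computed with the same `n` and on the same response scale.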
The first model is:
lmer(log(CRIME_TOTAL + 1) ~ cent.log.pop + cent.log.pop.dens + per_capita_income +
       cent.EXP_STUDENT + white + diff.dem + black + median_gross_rent + ba +
       prop5.17.pov + asian + hs + officers + no.grad.hs +
       (year|PLACE_ID) + (1|COUNTY_ID) + (1|STATE),
     control = lmerControl(optimizer = "nloptwrap", calc.derivs = FALSE),
     na.action = 'na.omit', REML = FALSE, data = city)

The second model is:
lmer(log(COUNT + 1) ~ CRIME*cent.EXP_STUDENT + cent.log.pop + cent.log.pop.dens +
       per_capita_income + white + diff.dem + black + median_gross_rent + ba +
       prop5.17.pov + asian + hs + officers + no.grad.hs +
       (year|PLACE_ID) + (1|COUNTY_ID) + (1|STATE),
     control = lmerControl(optimizer = "nloptwrap", calc.derivs = FALSE),
     na.action = 'na.omit', REML = FALSE, data = city.v.p)

And the third model is:
lmer(log(COUNT + 1) ~ CRIME*(cent.log.pop + cent.log.pop.dens + white + black +
       per_capita_income + no.grad.hs + prop5.17.pov + ma.plus + hs + ba +
       median_gross_rent + cent.EXP_STUDENT + unemployment_rate + asian + diff.dem) +
       officers + (year|PLACE_ID) + (1|COUNTY_ID) + (1|STATE),
     control = lmerControl(optimizer = "nloptwrap", calc.derivs = FALSE),
     na.action = 'na.omit', REML = FALSE, data = city.v.p)
Thanks!
