The following multilevel logistic model with one explanatory variable at level 1 (individual level) and one explanatory variable at level 2 (group level) :
$$\text{logit}(p_{ij})=\pi_{0j}+\pi_{1j}x_{ij}\ldots (1)$$ $$\pi_{0j}=\gamma_{00}+\gamma_{01}z_j+u_{0j}\ldots (2)$$ $$\pi_{1j}=\gamma_{10}+\gamma_{11}z_j+u_{1j}\ldots (3)$$
where , the group-level residuals $u_{0j}$ and $u_{1j}$ are assumed to have a multivariate normal distribution with expectation zero . The variance of the residual errors $u_{0j}$ is specified as $\sigma^2_0$ , and the variance of the residual errors $u_{1j}$ is specified as $\sigma^2_1$ .
I want to estimate the parameter of the model and I like to use R command glmmPQL .
Substituting equation (2) and (3) in equation (1) yields ,
$$\text{logit}(p_{ij})=\gamma_{00}+\gamma_{10}x_{ij}+\gamma_{01}z_j+\gamma_{11}x_{ij}z_j+u_{0j}+u_{1j}x_{ij}\ldots (4)$$
There are 30 groups$(j=1,...,30)$ and 5 individual in each group .
R code :
#Simulating data from multilevel logistic distribution
library(mvtnorm)
set.seed(1234)
J <- 30 ## number of groups
n_j <- rep(5,J) ## number of individuals in jth group
N <- sum(n_j)
g_00 <- -1
g_01 <- 0.3
g_10 <- 0.3
g_11 <- 0.3
s2_0 <- 0.13 ##variance corresponding to specific ICC
s2_1 <- 1 ##variance standardized to 1
s01 <- 0 ##covariance assumed zero
z <- rnorm(J)
x <- rnorm(N)
#Generate (u_0j,u_1j) from a bivariate normal .
mu <- c(0,0)
sig <- matrix(c(s2_0,s01,s01,s2_1),ncol=2)
u <- rmvnorm(J,mean=mu,sigma=sig,method="chol")
pi_0 <- g_00 +g_01*z + as.vector(u[,1])
pi_1 <- g_10 + g_11*z + as.vector(u[,2])
eta <- rep(pi_0,n_j)+rep(pi_1,n_j)*x
p <- exp(eta)/(1+exp(eta))
y <- rbinom(N,1,p)
Now the parameter estimation .
#### estimating parameters
library(MASS)
library(nlme)
sim_data_mat <- matrix(c(y,x,rep(z,n_j),rep(1:30,n_j)),ncol=4)
sim_data <- data.frame(sim_data_mat)
colnames(sim_data) <- c("Y","X","Z","cluster")
summary(glmmPQL(Y~X*Z,random=~1|cluster,family=binomial,data=sim_data,,niter=200))
OUTPUT :
iteration 1
Linear mixed-effects model fit by maximum likelihood
Data: sim_data
Random effects:
Formula: ~1 | cluster
(Intercept) Residual
StdDev: 0.0001541031 0.9982503
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: Y ~ X * Z
Value Std.Error DF t-value p-value
(Intercept) -0.8968692 0.2018882 118 -4.442404 0.0000
X 0.5803201 0.2216070 118 2.618691 0.0100
Z 0.2535626 0.2258860 28 1.122525 0.2712
X:Z 0.3375088 0.2691334 118 1.254057 0.2123
Correlation:
(Intr) X Z
X -0.072
Z 0.315 0.157
X:Z 0.095 0.489 0.269
Number of Observations: 150
Number of Groups: 30
Why does it take only $1$ iteration while I mentioned to take $200$ iterations inside the function
glmmPQLby the argumentniter=200?Also p-value of group-level variable $(Z)$ and cross-level interaction $(X:Z)$ shows they are not significant . Still why in this article, they keep the group-level variable $(Z)$ and cross-level interaction $(X:Z)$ for further analysis ?
Also How are the degrees of freedom
DFbeing calculated ?It doesn't match with the relative bias of the various estimates of the table . I tried to calculate the relative bias as :
#Estimated Fixed Effect parameters : hat_g_00 <- -0.8968692 #overall intercept hat_g_10 <- 0.5803201 # X hat_g_01 <-0.2535626 # Z hat_g_11 <-0.3375088 #X*Z fixed <-c(g_00,g_10,g_01,g_11) hat_fixed <-c(hat_g_00,hat_g_10,hat_g_01,hat_g_11) #Estimated Random Effect parameters : hat_s_0 <-0.0001541031 ##Estimated Standard deviation of random intercept hat_s_1 <- 0.9982503 std <- c(sqrt(0.13),1) hat_std <- c(0.0001541031,0.9982503) ##Relative bias of Fixed Effect : rel_bias_fixed <- ((hat_fixed-fixed)/fixed)*100 [1] -10.31308 93.44003 -15.47913 12.50293 ##Relative bias of Random Effect : rel_bias_Random <- ((hat_std-std)/std)*100 [1] -99.95726 -0.17497- Why doesn't the relative bias match with the table ?
I need to run a large number of simulations and compute averages. Does it mean , say , I have to simulate data $300$ times from multilevel logistic distribution and estimate their parameters each time and to take the average of the estimates ? But If I do say , won't the value of estimated parameter become equal to true value of parameter according to $\mathbb E[\hat\theta]=\theta$ ? – ABC Jun 15 '15 at 00:09glmer()estimates the variance of random intercept , $\sigma_0^2$ . But I am not getting any estimate of other variance component (residual variance component) , $\sigma_1^2$ from the result produces bysummary(glmer(Y~X*Z+(1|cluster),family=binomial,data=sim_data,nAGQ=10))– ABC Jun 15 '15 at 00:53Y~X*Z+(1|cluster), i.e.,glmer(Y~X*Z+(1|cluster),family=binomial,data=sim_data). Is it correct ? It seems to me glmer model for equation (4) will beY~X*Z+(X|cluster), i.e.,glmer(Y~X*Z+(X|cluster),family=binomial,data=sim_data). Is the latter glmer model correct ? – ABC Jul 20 '15 at 07:02Y~X*Z+(X|cluster)fit the model in equation (4) best , I triedglmer(Y~X*Z+(X|cluster),family=binomial,data=sim_data,nAGQ=10). But It showsError in updateGlmerDevfun(devfun, glmod$reTrms, nAGQ = nAGQ) : nAGQ > 1 is only available for models with a single, scalar random-effects term. I have read the link https://github.com/lme4/lme4/issues/123 . But there is no , lme4.0 package for R version 3.2.1 and I am using lme4 1.1-8 version . – ABC Jul 22 '15 at 02:27