Why do the RANDOM and REPEATED statements in SAS PROC MIXED both produce the same result?

Can someone explain this in detail?

For example:

proc mixed data=test;
class variable1 ..... variableN;
model outcome = variable1 ... variableN;
random intercept/ subject=cluster type=cs;
run;

proc mixed data=test;
class variable1 ..... variableN;
model outcome = variable1 ... variableN;
repeated/ subject=cluster type=cs;
run;

Why do both programs produce the same result?

Momo
gyambqt
    Descriptions of the input and the output are necessary in order for this question to fit within our format. Please see the [help] for more information. – whuber Jul 15 '14 at 00:08

2 Answers


The mixed effects model can be written as $Y=X\beta+Zu+\epsilon$, where $X$ and $Z$ are matrices of known constants, $\beta$ is an unknown parameter vector, $u$ is a random vector, and $\epsilon$ is a vector of random errors, all of which are appropriately conformable. The elements of $\beta$ are considered to be non-random "fixed effects" and the elements of $u$ are "random effects." $Var(\epsilon)=R$ and $Var(u)=G$. Then the variance of $Y$, $Var(Y)=ZGZ^\prime+R$ as we assume $Cov(\epsilon, u)=0$.

SAS differentiates between "G-side" and "R-side" random effects in your model. G-side effects are random effects that enter through the $G$ matrix above, and R-side effects enter through the $R$ matrix above. When you use the repeated statement in proc mixed with a compound symmetric covariance structure, as you have specified in your post, you are fitting a marginal model ($G=0$) and placing all the random effects on $R$, which holds the variances and covariances of the error components in $\epsilon$. In this case, you have told SAS that $G=0$ and that $R$ is compound symmetric within each subject: for two observations $i$ and $j$ from the same cluster, $\sigma_{ij} = \sigma_1^2+\sigma^2$ for $i=j$ and $\sigma_{ij} = \sigma_1^2$ for $i\ne j$, while observations from different clusters are uncorrelated. When you use the random intercept statement in proc mixed, you are building $Var(Y)$ from both the $G$ and $R$ matrices: the within-cluster correlation comes from $G$, and $R$ adds only independent residual variance. In this case you are setting the diagonal elements of $G$ to $\sigma_1^2$ (one per cluster), and you are setting $R=\sigma^2I$, so that $R$ is a diagonal matrix with $\sigma^2$ on the diagonal.
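
As a concrete illustration, assume a single cluster with three observations (the cluster size is chosen only for this example and is not taken from the question). Under the repeated statement with a compound symmetric structure, the pieces would be

$$G = 0, \qquad R = \begin{pmatrix} \sigma_1^2+\sigma^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2+\sigma^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma_1^2+\sigma^2 \end{pmatrix},$$

whereas under the random intercept statement they would be

$$Z = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \qquad G = \sigma_1^2, \qquad R = \sigma^2 I_3 = \begin{pmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{pmatrix}.$$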

Therefore, when you use the repeated statement in SAS:

$Var(Y)=ZGZ^\prime+R = 0 + R = 0 + CS = CS$,

where $CS$ is the compound symmetric matrix. When you use the random statement, you get:

$Var(Y)=ZGZ^\prime+R = \sigma_1^2ZZ^\prime + \sigma^2 I = CS$,

since $ZZ^\prime$ is a block diagonal matrix whose diagonal blocks (one per cluster) are matrices of all 1's. As you can see, in either case the end result is the same: a compound symmetric covariance matrix.
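
The identity above can be checked numerically. The following is a minimal Python sketch (not SAS, and not what proc mixed does internally) that builds both covariance matrices for a single cluster. The function names, the cluster size of 3, and the variance values 2.0 and 1.5 are all hypothetical choices made for illustration.

```python
def random_intercept_cov(n, var_u, var_e):
    """Z G Z' + R for one cluster of size n.

    Z is a column of n ones, G is the scalar var_u (sigma_1^2),
    and R is var_e (sigma^2) times the identity matrix.
    """
    z = [1.0] * n
    return [
        [z[i] * var_u * z[j] + (var_e if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def compound_symmetry_cov(n, var_u, var_e):
    """G = 0 and R compound symmetric for one cluster of size n."""
    return [
        [var_u + var_e if i == j else var_u for j in range(n)]
        for i in range(n)
    ]


def implied_variance_components(diagonal, off_diagonal):
    """Map CS parameters back to (sigma_1^2, sigma^2)."""
    return off_diagonal, diagonal - off_diagonal


# Hypothetical values: sigma_1^2 = 2.0, sigma^2 = 1.5, cluster size 3.
v_random = random_intercept_cov(3, 2.0, 1.5)
v_repeated = compound_symmetry_cov(3, 2.0, 1.5)

# Both specifications give the same marginal covariance matrix.
assert v_random == v_repeated

for row in v_random:
    print(row)

# A CS matrix with a negative off-diagonal would need sigma_1^2 < 0,
# which a random-intercept variance component cannot take.
print(implied_variance_components(1.0, -0.2))
```

The last call illustrates the caveat raised in the comments: the two formulations coincide only when the within-cluster covariance is non-negative.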

I hope this helps. Best of luck!

StatsStudent
  • Yes, the random intercept agrees with CS, but only for non-negative correlation. The two formulations differ in parameter space. Random effect won't allow for negative correlations (variance components cannot be negative) and will zero them, making the model biased, while CS will allow for negative correlations. – Bastian Sep 15 '22 at 23:46
  • Temporary post to be deleted later: StatsStudent and @Bastian can you also answer my question? The main concerns are summarized in the very beginning. https://stats.stackexchange.com/questions/636596/multilevel-mixed-linear-regression-with-pseudo-repeats-why-designate-repeated – Vic Jan 19 '24 at 19:20

"Random effects" and "repeated measures" are conceptually the same thing under most formulations of the underlying linear model. In R, there is no "repeated" statement; a random effect is specified the same way whether it represents, e.g., a student whose performance is assessed repeatedly or a classroom in which multiple students are assessed. Finney's (1990) "Repeated measurements: what is measured and what repeats?" discusses this and points out the confusion with the terminology.

N Brouwer
    There is a "repeated" equivalent in R: nlme::gls. This is exactly what SAS uses for the "repeated" part according to the manual. And a random intercept isn't equivalent to compound symmetry in general: a quick glance at both formulations shows they agree only for non-negative correlations, differing in parameter space. When at least one correlation is negative, the discrepancy becomes very visible. – Bastian Sep 15 '22 at 23:45