I have a highly skewed dataset. However, my model of choice (MODEL, below) shows drastically improved, normally distributed residuals (and predicted values) compared with other models in which the residual structure is not modeled.
Two questions:
1- Is MODEL below assuming that the data for each subject come from normal populations whose variances differ across levels of X1_categorical and are also a power function of X2_numeric?
2- Does the distribution of the residuals (below) tell us anything about the distribution of any part of the data (e.g. the data for each subject, or the marginal distribution across all subjects)?
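library(nlme)  # provides corSymm, varComb, varIdent, varPower

MODEL <- nlme::lme(y ~ X1_categorical + X2_numeric,
                   random = ~ 1 | subject,
                   data = data,
                   # unstructured within-subject residual correlation
                   correlation = corSymm(form = ~ 1 | subject),
                   # variance differs by X1_categorical level and scales as a power of |X2_numeric|
                   weights = varComb(varIdent(form = ~ 1 | X1_categorical),
                                     varPower(form = ~ X2_numeric)))

# histogram of the normalized residuals of the fitted model
hist(resid(MODEL, type = "normalized"))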
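For reference, here is a sketch of the data-generating process that the weights= and correlation= arguments appear to imply. It is based on my reading of the nlme documentation for varIdent, varPower, varComb and corSymm. The symbols ($\boldsymbol\beta$, $b_i$, $\sigma_b$, $\sigma$, $\delta$, $t$, $C_i$) are my own notation, not something taken from the model output:

$$
y_{ij} = \mathbf{x}_{ij}^\top \boldsymbol\beta + b_i + \varepsilon_{ij}, \qquad b_i \sim N(0, \sigma_b^2)
$$

$$
\operatorname{Var}(\varepsilon_{ij}) = \sigma^2 \, \delta_{g(ij)}^2 \, |x_{2,ij}|^{2t}, \qquad \operatorname{Cor}(\varepsilon_{ij}, \varepsilon_{ik}) = [C_i]_{jk}
$$

Here $g(ij)$ is the level of X1_categorical for observation $j$ of subject $i$ (with $\delta = 1$ for the reference level), $x_{2,ij}$ is the value of X2_numeric, $t$ is the varPower coefficient, and $C_i$ is the unstructured within-subject correlation matrix from corSymm. If this reading is right, normality is assumed for $b_i$ and $\varepsilon_{ij}$, that is, for $y$ conditional on the covariates, rather than for the pooled marginal distribution of $y$.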
?nlme::varPower, "the power variance function is defined as s2(v) = |v|^(2t), where t is the variance function coefficient", i.e. the standard deviation increases as a power of t of the specified covariate – Ben Bolker Dec 20 '23 at 17:56
corSymm is the unstructured representation of Level 1 residuals, not a compound symmetric one. Also, can you please connect my 2nd question to your answer when you say: "A histogram of residuals may indicate that they are plausibly normally distributed"? – Simon Harmel Dec 20 '23 at 21:00
corSymm does, my first question was about the weights= part and the way it envisions the data generating process? My 2nd question (the incorrect one) has to do with the fact that in a linear model each data point on the response variable (y_ij) is modeled as an additive function of a predicted value and the sum of the distances that predicted value has from y_ij. I thought that under normality of response for each subject, or across all subjects, the level-1 residual is a reflection... – Simon Harmel Dec 21 '23 at 22:20