I have an experiment looking at species-specific responses to a treatment application. Within each species, I have 3 genotypes, arbitrarily chosen to ensure we aren't detecting genotype-specific effects that get attributed to species. Each genotype has a number of replicates within it. Thus, specific genotypes are nested within a species, and genotype 1 from species A would not necessarily be comparable to genotype 1 from species B. My understanding of how to specify this for lme4 would be
response ~ treatment * species + (1|genotype)
or alternatively
response ~ treatment * (species|genotype)
I am trying to implement this as a Bayesian hierarchical model using Turing.jl. I am struggling to come up with the mathematical notation for this model, and am specifically having trouble conceptualizing whether genotype needs to be nested within species or whether species can stand alone. It seems that each genotype represents a block or group, with the species effect being applied to the entire block, while the treatment effect is applied to each individual within the species block.
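To make the nesting concrete, here is a minimal sketch of the indexing I have in mind. It is written in plain Python only for illustration (the same bookkeeping should carry over to Julia), and the data rows and the function name `nested_index` are hypothetical. The point is that each (species, genotype) pair gets its own index, so genotype 1 of species A and genotype 1 of species B are never pooled, and each genotype index remembers which species it belongs to.

```python
# Hypothetical observations: (species, genotype label used within that species).
rows = [("A", 1), ("A", 1), ("A", 2), ("A", 3),
        ("B", 1), ("B", 2), ("B", 2), ("B", 3)]


def nested_index(rows):
    """Assign a unique integer id to every (species, genotype) pair.

    Returns the pair -> id mapping, the species -> id mapping, and a list
    giving the species id of each genotype id (the nesting lookup).
    """
    genotype_ids = {}
    species_ids = {}
    species_of_genotype = []
    for species, genotype in rows:
        if species not in species_ids:
            species_ids[species] = len(species_ids)
        pair = (species, genotype)
        if pair not in genotype_ids:
            genotype_ids[pair] = len(genotype_ids)
            species_of_genotype.append(species_ids[species])
    return genotype_ids, species_ids, species_of_genotype


genotype_ids, species_ids, species_of_genotype = nested_index(rows)

# Per-observation genotype index, usable to pick an intercept per row.
obs_genotype = [genotype_ids[pair] for pair in rows]
```

With six distinct genotype indices and a lookup from genotype to species, a genotype-level intercept can be indexed per observation while its prior is indexed by the parent species.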
I think the model would be written like this:
$$ y \sim Normal(\hat{y}, \sigma) $$ where $$ \hat{y} \sim \alpha_{[genotype]} + \alpha_{[species]} + \beta_{[species]} + \beta_{[treatment]} + \gamma_{[species][treatment]} + \epsilon $$
along with the priors for each of these parameters. Does this seem correct? Am I making a mistake in how the model formula is specified? I am unsure if I can have distinct $\alpha$ parameters for both species and genotype, or if I need to treat one as a prior for the other, e.g. $$ \alpha_{genotype} \sim Normal(\alpha_{species},\sigma) $$ I can convince myself of this since each genotype is representative of the underlying species-specific distribution for the response variable in question.
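Written out in full, the nested alternative would look something like the sketch below. The notation is my own assumption: $g[i]$ and $t[i]$ are the genotype and treatment of observation $i$, $s(g)$ is the species that genotype $g$ belongs to, and $\tau$ is a between-genotype standard deviation that I have kept separate from the residual $\sigma$.

$$ \alpha_{g} \sim Normal(\alpha_{s(g)}, \tau) $$

$$ \hat{y}_i = \alpha_{g[i]} + \beta_{t[i]} + \gamma_{s(g[i]),\,t[i]} $$

$$ y_i \sim Normal(\hat{y}_i, \sigma) $$

In this version the species-level intercept enters only as the mean of its genotypes' intercepts, rather than as a separate additive term.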
In lme4 formula, you would want to have separate random intercepts for each genotype and species: lmer(response ~ treatment + (1|species) + (1|genotype), data=dat). – Erik Ruzek Nov 28 '22 at 20:48